Inference with EMCEE

Config Changes Relative To Classical Inference

The major changes are:

Under the 'inference' section, add a method: emcee entry, and
Under the 'statistics' section, move the resample-specific configuration under a 'resample' subsection, as shown below:

In addition to those configuration changes, new likelihood statistics are offered: pois, norm/norm_homoskedastic, norm_cov/norm_heteroskedastic, nbinom, rmse, and absolute_error. There are also new regularizations: forecast and allsubpops.

Running Locally

You can test your updated config by running:

flepimop-calibrate -c config_emcee.yml --nwalkers 5  --jobs 5 --niterations 10 --nsamples 5 --id my_run_id

If it works, it should produce:

Plots of the simulation run directly from your config,
Plots made after fitting, showing the fits and the parameter chains,
An h5 file with all the chains, and
The usual model_output/ directory.
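A quick way to confirm the last two items is to look for them after the run. The sketch below is only an illustration: check_outputs is a hypothetical helper, and it assumes the h5 file and model_output/ both land in the directory you pass in (the exact file name and location are not specified here, so it matches any *.h5).

```shell
# Hypothetical helper: report whether a run directory holds the expected outputs.
check_outputs() {
  dir="${1:-.}"
  status=0
  # Assumption: the chains file sits directly in $dir and ends in .h5.
  set -- "$dir"/*.h5
  if [ -e "$1" ]; then
    echo "found chain file: $1"
  else
    echo "no .h5 chain file in $dir" >&2
    status=1
  fi
  if [ -d "$dir/model_output" ]; then
    echo "found $dir/model_output/"
  else
    echo "no model_output/ directory in $dir" >&2
    status=1
  fi
  return "$status"
}
```

Calling check_outputs . from the directory where you ran the command returns a non-zero status if either item is missing.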

The command also immediately prints output to standard out similar to the following (the details depend on your config):

  gempyor >> Running ***DETERMINISTIC*** simulation;
  gempyor >> ModelInfo USA_inference_all; index: 1; run_id: SMH_Rdisparity_phase_one_phase1_blk1_fixprojnpis_CA-NC_emcee,
  gempyor >> prefix: USA_inference_all/SMH_Rdisparity_phase_one_phase1_blk1_fixprojnpis_CA-NC_emcee/;
Loaded subpops in loaded relative probablity file: 51 Intersect with seir simulation:  2 kept
Running Gempyor Inference

LogLoss: 6 statistics and 92 data points,number of NA for each statistic: 
incidD_latino    46
incidD_other      0
incidD_asian      0
incidD_black      0
incidD_white      0
incidC_white     24
incidC_black     24
incidC_other     24
incidC_asian     24
incidC_latino    61
incidC           24
incidD            0
dtype: int64
InferenceParameters: with 92 parameters: 
    seir_modifiers: 84 parameters
    outcome_modifiers: 8 parameters

Here, the output says the config fits 92 parameters. Keep that number in mind and choose a number of walkers greater than the number of parameters, ideally 2 times as many.
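As a rough illustration of that rule of thumb, the parameter count can be pulled from the "InferenceParameters" line and doubled. This is a sketch that assumes the line keeps the format shown in the sample output above; the variable names are illustrative only.

```shell
# Sample line copied from the output above; in practice, grep it from your saved log.
line='InferenceParameters: with 92 parameters: '

# Extract the integer between "with" and "parameters".
nparams=$(printf '%s\n' "$line" | sed -n 's/^InferenceParameters: with \([0-9][0-9]*\) parameters.*/\1/p')

# Rule of thumb from the text: about 2x the number of parameters.
nwalkers=$(( 2 * nparams ))

echo "parameters: $nparams, suggested --nwalkers: $nwalkers"
```

For the 92-parameter config above this suggests 184 walkers.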

Running On An HPC Environment With Slurm

First, install flepiMoP on the cluster following the Running On A HPC With Slurm guide. Then manually create a batch file to submit to Slurm, like so:

#!/bin/bash
#SBATCH --ntasks 1
#SBATCH --nodes 1
#SBATCH --mem 450g
#SBATCH --cpus-per-task 256
#SBATCH --time 20:00:00
flepimop-calibrate --config config_NC_emcee.yml \
  --nwalkers 500  \
  --jobs 256 \
  --niterations 2000 \
  --nsamples 250 \
  --id my_id  > out_fit256.out 2>&1

Breaking down what each of these lines does:

#SBATCH --ntasks 1: Requests that this be run as a single task,
#SBATCH --nodes 1: Requests that the job be run on 1 node; as of right now EMCEE only supports a single node,
#SBATCH --mem 450g: Requests that the whole job get 450GB of memory; budget roughly 2-3GB per walker,
#SBATCH --cpus-per-task 256: Requests that the whole job get 256 CPUs (technically 256 per task, but ntasks should be set to 1 for EMCEE),
#SBATCH --time 20:00:00: Specifies a time limit of 20 hours for this job to complete in, and
flepimop-calibrate ...:
- --config config_NC_emcee.yml: Use config_NC_emcee.yml for this calibration run,
- --nwalkers 500: Use 500 walkers (or chains) for this calibration; this should be about 2x the number of parameters,
- --jobs 256: The number of walkers to run in parallel; this should be either 1x or 0.5x the number of CPUs,
- --niterations: The number of iterations to run for each walker,
- --nsamples: The number of posterior samples (taken from the end of each walker) to save to the model_output/ directory, and
- --id: An optional short but unique job name; if not explicitly provided, one will be generated from the config.
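To keep --jobs in step with the CPUs Slurm actually grants, the batch script could read the allocation from the environment instead of repeating the number. The following is a minimal sketch, not part of flepiMoP: it assumes SLURM_CPUS_PER_TASK (which Slurm exports when --cpus-per-task is requested) and falls back to 256 when run outside a job. The other values are copied from the example batch file.

```shell
#!/bin/bash
# Sketch only: derive --jobs from the Slurm allocation instead of hard-coding it.
# The fallback of 256 is an assumption for dry runs outside a Slurm job.
cpus="${SLURM_CPUS_PER_TASK:-256}"
jobs="$cpus"    # 1x the CPUs; use $(( cpus / 2 )) for 0.5x

# Print the command for inspection rather than running it.
echo flepimop-calibrate --config config_NC_emcee.yml \
  --nwalkers 500 --jobs "$jobs" \
  --niterations 2000 --nsamples 250 --id my_id
```

Dropping the echo turns the printed line into the actual call.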

For more details on other options provided by gempyor for calibration please see flepimop-calibrate --help.

Postprocessing EMCEE

At this stage, postprocessing for EMCEE outputs is fairly manual. A good starting point can be found in postprocessing/emcee_postprocess.ipynb, which plots the chains and can run forward projections from the samples drawn from calibration.

