
flepiMoP's configuration file


Last updated 10 months ago

About configuration files

flepiMoP is set up so that all parameters and other options for running the pipeline can be specified in a single "configuration" file (aka "config"). Users do not need to edit any other code files, or even be aware of their contents, to create and run complex model scenarios. Configuration files also provide a convenient record of model options and promote reproducibility of model results.

We use the YAML language syntax to write config files, which are typically named something like config.yml. The file has simple plain text contents and follows an indented outline structure (YAML indentation must use spaces, not tab characters). When config files are read by the model code, a data structure encoding the model options is created.

Comments can be added to the config file by starting with the hash key (#) then a space. Comments can start anywhere on a line and continue until the end, but if they run over to a new line, a new # must be used at the start of the new line.

Example

(This section is currently under development. A simple configuration for a toy model is planned here: two subpopulations, SEIR dynamics, a single "cases" outcome, a single seeded infection, and a single NPI that starts after some time. In the meantime, please see our example repo for some simple configurations.)

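When referring to config items (individual parameters), we use their full position in the outline, with levels separated by a double colon (::). For example, given the following excerpt of a config file,

    subpop_setup:
      ...
      geodata: minimal

we denote the item as subpop_setup::geodata, having a value of minimal.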
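The `::` notation can be read as a path through the nested data structure created when the config is loaded. The sketch below illustrates that reading on a plain Python dictionary that mirrors the subpop_setup::geodata example. The helper name `get_config_item` is hypothetical and is not part of flepiMoP's API.

```python
from typing import Any, Mapping


def get_config_item(config: Mapping[str, Any], path: str) -> Any:
    """Look up a nested item written in '::' notation (hypothetical helper)."""
    node: Any = config
    for key in path.split("::"):
        # Each path component must name a key at the current outline level.
        if not isinstance(node, Mapping) or key not in node:
            raise KeyError(f"config item {path!r} not found (missing {key!r})")
        node = node[key]
    return node


# Nested dictionary mirroring the excerpt discussed in the text.
config = {"subpop_setup": {"geodata": "minimal"}}

print(get_config_item(config, "subpop_setup::geodata"))  # prints: minimal
```

A path with a single component, such as `subpop_setup`, returns the whole sub-section, so the same helper can address both sections and individual items.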

Notation

Parameters and other options specified in the configuration files can take on a variety of types of values, using the following notations:

  • dates are specified as [year]-[month]-[day] (e.g., 2020-01-31)

  • boolean values are either "TRUE" or "FALSE"

  • file names are strings

  • probability is a float between 0 and 1

  • distribution is a probability distribution from which a random value for the parameter is drawn each time a new simulation is run (or chain, if doing inference). See the "Distributions" page for the required schema.

Configuration files sections

Global header

Required section

These global configuration options typically sit at the top of the configuration file.

| Item | Required? | Type/Format | Description |
| --- | --- | --- | --- |
| name | required | string | Name of this configuration. Will be used in file names created to store model output. |
| setup_name | optional | string | Setup name used to describe the run; used in setting up file names. |
| start_date | required | date | Model simulation start date. |
| end_date | required | date | Model simulation end date. |
| start_date_groundtruth | optional for non-inference runs, required for inference runs | date | Start date for comparing model to data. |
| end_date_groundtruth | optional for non-inference runs, required for inference runs | date | End date for comparing model to data. |
| nslots | optional (can also be defined by an environment variable) | int | Number of independent simulations to run. |
| model_output_dirname | optional | folder path | Path to the folder where all outputs created by the model are stored. If not specified, the default is model_output. |

For example, for a configuration file to simulate the spread of COVID-19 in the US during 2020 and compare to data from March 1 onwards, with 1000 independent simulations, the header of the config might read:

    name: USA_covid19_2020
    model_output_dirname: model_output
    start_date: 2020-01-01
    end_date: 2020-12-31
    start_date_groundtruth: 2020-03-01
    end_date_groundtruth: 2020-12-31
    nslots: 1000

subpop_setup section

Required section

This section specifies the population structure on which the model will be simulated, including the names and sizes of each subpopulation and the connectivity between them. More details can be found on the "Specifying population structure" page.

compartments section

Required section

This section is where users can specify the variables (infection states) that will be tracked in the infectious disease transmission model. More details can be found on the "Specifying compartmental model" page. The other details of the model are specified in the seir section, including transitions between these compartments (seir::transitions), the names of the parameters governing the transitions (seir::parameters), and the numerical method used to simulate the equations over time (seir::integration). The initial conditions of the model can be specified in the initial_conditions section, and any other inputs into the model from external populations, or instantaneous transitions between states that occur at later times, can be specified in the seeding section.

seir section

Required section

This section is where users can specify the details of the infectious disease transmission model they wish to simulate (e.g., SEIR). This model describes the allowed transitions (seir::transitions) between the compartments that were specified in the compartments section, the values of the parameters involved in these transitions (seir::parameters), and the numerical method used to simulate the equations over time (seir::integration). More details can be found on the "Specifying compartmental model" page. The initial conditions of the model can be specified in the separate initial_conditions section, and any other inputs into the model from external populations, or instantaneous transitions between states that occur at later times, can be specified in the seeding section.

initial_conditions section

Optional section

This section is used to specify the initial conditions of the model, which define how individuals are distributed between the model compartments at the time the model simulation begins. Importantly, the initial conditions specify the time and location where infection is first introduced. If this section is omitted, default values are used. If users want to add infections to the population at later times, or add or remove individuals from compartments separately from the model rules, they can do so via the related seeding section. More details can be found on the "Specifying initial conditions" page.

seeding section

Optional section

This section is used to specify how individuals are instantaneously "seeded" from one compartment to another, where they then continue to be governed by the model equations. For example, seeding could be used to represent importations of infected individuals from an outside population, mutation events that create new strains, or vaccinations that alter disease susceptibility. Seeding events can occur at any time in the simulation. The seeding section specifies the numeric values added to or removed from any compartment of the model. More details can be found on the "Specifying seeding" page.

outcomes section

Optional section

This section is where users can define new variables representing the observed quantities and how they are related to the underlying state variables in the model (e.g., the fraction of infections that are detected as cases). More details can be found on the "Specifying observational model" page.

interventions section

Required section

This section is where users can specify time-varying changes to parameters governing either the infectious disease model or the observational model. More details can be found on the "Specifying time-varying parameter modifications" page.

inference section

Optional section

This section is where users can specify the details of how the model is fit to data, including which data streams will be included, which outcome variables they represent, and the likelihood functions describing the probability of the data given the model. More details can be found on the "Specifying data source and fitted variables" page.
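To make the global header rules concrete (required items, the [year]-[month]-[day] date format, and the extra ground-truth dates needed for inference runs), here is a minimal sketch of a header check. It works on a plain Python dictionary and is not how flepiMoP itself validates configs. The function name `check_header` is hypothetical. The checks that start_date precedes end_date and that nslots is positive are sanity-check assumptions, not rules stated on this page.

```python
from datetime import date

REQUIRED_ITEMS = ("name", "start_date", "end_date")
GROUNDTRUTH_ITEMS = ("start_date_groundtruth", "end_date_groundtruth")


def check_header(header: dict, inference: bool = False) -> list:
    """Return a list of problems found in a global header (hypothetical helper)."""
    problems = []

    # Ground-truth dates are only required when running inference.
    required = REQUIRED_ITEMS + (GROUNDTRUTH_ITEMS if inference else ())
    for item in required:
        if item not in header:
            problems.append(f"missing required item: {item}")

    # Every date item that is present must parse as [year]-[month]-[day].
    parsed = {}
    for item in ("start_date", "end_date") + GROUNDTRUTH_ITEMS:
        if item in header:
            try:
                parsed[item] = date.fromisoformat(str(header[item]))
            except ValueError:
                problems.append(f"{item} is not a valid date: {header[item]!r}")

    # Assumption: a simulation should not end before it starts.
    if "start_date" in parsed and "end_date" in parsed:
        if parsed["start_date"] > parsed["end_date"]:
            problems.append("start_date is after end_date")

    # Assumption: nslots, if given, should be a positive integer.
    if "nslots" in header:
        n = header["nslots"]
        if isinstance(n, bool) or not isinstance(n, int) or n <= 0:
            problems.append("nslots should be a positive integer")

    return problems


# The example header from this page, written as a dictionary.
header = {
    "name": "USA_covid19_2020",
    "model_output_dirname": "model_output",
    "start_date": "2020-01-01",
    "end_date": "2020-12-31",
    "start_date_groundtruth": "2020-03-01",
    "end_date_groundtruth": "2020-12-31",
    "nslots": 1000,
}

print(check_header(header, inference=True))  # prints: []
```

Returning a list of problems rather than raising on the first one lets a user see every issue with a header in a single pass.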