Model Implementation

Specifying initial conditions

This section describes how to specify the value of each model state at the time the simulation starts, and how to make instantaneous changes to state values at other times (e.g., due to importations).

Overview

In order for the models specified previously to be dynamically simulated, the user must provide initial conditions, in addition to the model structure and parameter values. Initial conditions describe the value of each variable in the model at the time point that the simulation is to start. For example, on day zero of an outbreak, we may assume that the entire population is susceptible except for one single infected individual. Alternatively, we could assume that some portion of the population already has prior immunity due to vaccination or previous infection. Different initial conditions lead to different model trajectories.

The initial_conditions section of the configuration file is detailed below. Note that in some cases the seeding section can replace or complement the initial conditions; the table below provides a quick comparison of these sections.

Feature
initial_conditions
seeding

Specifying model initial conditions

The configuration items in the initial_conditions section of the config file are:

  • initial_conditions::method – Must be "Default", "SetInitialConditions", or "FromFile". (The FolderDraw variants of the latter two are described at the end of this section.)

  • initial_conditions::initial_conditions_file – Required for methods "SetInitialConditions" and "FromFile". Path to a .csv or .parquet file containing the list of initial conditions for each compartment.

  • initial_conditions::initial_file_type – Only required for the FolderDraw methods. Description TBA.

  • initial_conditions::allow_missing_subpops – Optional for all methods. Determines what happens if initial_conditions_file is missing values for some subpopulations. If FALSE (the default) or unspecified, an error will occur if subpopulations are missing. If TRUE, then for subpopulations missing from the initial_conditions file, it will be assumed that all individuals begin in the first compartment, unless another compartment is designated to hold the rest of the individuals. The "first" compartment depends on how the model was specified: it is the compartment that contains the first named category in each compartment group.

  • initial_conditions::allow_missing_compartments – Optional for all methods. If FALSE (the default) or unspecified, an error will occur if any compartments are missing for any subpopulation. If TRUE, it will be assumed there are zero individuals in compartments missing from the initial_conditions file.

  • initial_conditions::proportional – If TRUE, the user is assumed to have specified all input initial conditions as fractions of the population, instead of numbers of individuals (the default behavior, or if set to FALSE). The code will check that the initial values in all compartments sum to 1.0, throw an error if they do not, and then multiply all values by the total population size for that subpopulation.
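The behavior described for the proportional option can be illustrated with a short Python sketch. This is not flepiMoP's implementation; the function name scale_proportional and the tolerance argument are assumptions made for illustration only.

```python
def scale_proportional(fractions, population, tol=1e-9):
    """Convert fractional initial conditions for one subpopulation into counts.

    Hypothetical helper: checks that the fractions sum to 1.0 (within a
    tolerance that is an assumption here), then scales by the population size.
    """
    total = sum(fractions.values())
    if abs(total - 1.0) > tol:
        raise ValueError(f"fractions sum to {total}, expected 1.0")
    return {name: fraction * population for name, fraction in fractions.items()}


# Half of a subpopulation of 1000 in each of two compartments.
counts = scale_proportional(
    {"S_child_unvaxxed": 0.5, "S_adult_unvaxxed": 0.5}, population=1000
)
print(counts)
```

Fractions that do not sum to 1.0 (for example 0.4 and 0.5) would raise an error rather than being silently rescaled, which mirrors the check described above.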

Details on implementing each initial conditions method and the options that go along with it are below.

initial_conditions::method

Default

The default initial conditions are that the initial value of all compartments for each subpopulation will be zero, except for the first compartment, whose value will be the population size. The “first” compartment depends on how the model was specified, and will be the compartment that contains the first named category in each compartment group.

For example, a model with the following compartments

with the accompanying geodata file

will be started with 1,000 individuals in the S_child_unvaxxed compartment in small_province and 10,000 in that compartment in large_province.
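As a rough illustration of the Default rule, the sketch below builds the "first" compartment name from the first category of each compartment group and places the whole population there. The function name default_initial_conditions and the dictionary layout are assumptions for this example, not part of flepiMoP.

```python
from itertools import product


def default_initial_conditions(compartments, populations):
    """Put each subpopulation entirely in the first compartment (sketch)."""
    groups = list(compartments.values())
    # The "first" compartment joins the first category of every group.
    first = "_".join(categories[0] for categories in groups)
    # All compartment names, in the order the groups are defined.
    names = ["_".join(combo) for combo in product(*groups)]
    return {
        subpop: {name: (size if name == first else 0) for name in names}
        for subpop, size in populations.items()
    }


compartments = {
    "infection_stage": ["S", "I", "R"],
    "age_group": ["child", "adult"],
    "vaccination_status": ["unvaxxed", "vaxxed"],
}
populations = {"large_province": 10000, "small_province": 1000}

ic = default_initial_conditions(compartments, populations)
print(ic["small_province"]["S_child_unvaxxed"])
print(ic["large_province"]["S_child_unvaxxed"])
```

With the compartments and geodata of the example above, every compartment other than S_child_unvaxxed starts at zero.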

SetInitialConditions

With this method users can specify arbitrary initial conditions in a conveniently formatted .csv or .parquet input file.

For example, for a model with the following compartments and initial_conditions sections

with the accompanying geodata file

where initial_conditions.csv contains

the model will be started with half of the population in each subpopulation being children and the other half adults, and with everyone unvaccinated. There are 5 infections (in the exposed-but-not-yet-infectious class) among the unvaccinated adults in the large province, and the remaining 4,995 adults there are susceptible. All other compartments will contain zero individuals initially.

initial_conditions::initial_conditions_file must contain the following columns:

  • subpop – the name of the subpopulation for which the initial condition is being specified. By default, all subpopulations must be listed in this file, unless the allow_missing_subpops option is set to TRUE.

  • mc_name – the concatenated name of the compartment for which an initial condition is being specified. The order of the compartment groups in the name must be the same as the order in which these groups are defined in the config for the model, e.g., you cannot say unvaccinated_S.

For each subpopulation, if there are compartments that are not listed in the initial_conditions_file, an error will be thrown unless allow_missing_compartments is set to TRUE, in which case it will be assumed there are zero individuals in them. If the sum of the initial conditions over all compartments in a location does not add up to the total population of that location (specified in the geodata file), an error will be thrown. To allocate all remaining individuals in a subpopulation (the difference between the total population size and those allocated by the defined initial conditions) to a single pre-specified compartment, include this compartment in the initial_conditions_file but, instead of a number in the amount column, put the word "rest".

If allow_missing_subpops is FALSE or unspecified, an error will occur if initial conditions for some subpopulations are missing. If TRUE, then for subpopulations missing from the initial_conditions file, it will be assumed that all individuals begin in the first compartment. (The "first" compartment depends on how the model was specified, and will be the compartment that contains the first named category in each compartment group.)
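The rules for "rest", missing compartments, and the population-sum check can be sketched in Python as follows. This is a simplified illustration for a single subpopulation, not flepiMoP's code; the function name, argument names, the tolerance, and the restriction to a single "rest" entry are all assumptions.

```python
from itertools import product


def resolve_initial_conditions(rows, population, all_compartments,
                               allow_missing_compartments=False, tol=1e-9):
    """Resolve (mc_name, amount) rows for one subpopulation (sketch)."""
    values = {}
    rest_name = None
    for mc_name, amount in rows:
        if mc_name not in all_compartments:
            raise ValueError(f"unknown compartment {mc_name}")
        if amount == "rest":
            if rest_name is not None:  # assumption: at most one "rest" entry
                raise ValueError("only one compartment may be marked 'rest'")
            rest_name = mc_name
        else:
            values[mc_name] = float(amount)

    missing = [c for c in all_compartments if c not in values and c != rest_name]
    if missing and not allow_missing_compartments:
        raise ValueError(f"missing compartments: {missing}")
    for name in missing:
        values[name] = 0.0  # missing compartments default to zero individuals

    allocated = sum(values.values())
    if rest_name is not None:
        if allocated > population:
            raise ValueError("allocated individuals exceed the population")
        values[rest_name] = population - allocated
    elif abs(allocated - population) > tol:
        raise ValueError("initial conditions do not sum to the population")
    return values


all_compartments = ["_".join(c) for c in product(
    ["S", "E", "I", "R"], ["child", "adult"], ["unvaxxed", "vaxxed"])]
rows = [("S_child_unvaxxed", 5000), ("E_adult_unvaxxed", 5),
        ("S_adult_unvaxxed", "rest")]
large = resolve_initial_conditions(rows, 10000, all_compartments,
                                   allow_missing_compartments=True)
print(large["S_adult_unvaxxed"])
```

The rows used here are the large_province entries from the example file above, so the "rest" compartment receives the 4,995 individuals not otherwise allocated.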

FromFile

Similar to "SetInitialConditions", with this method users can specify arbitrary initial conditions in a formatted .csv or .parquet input file. However, the format of the input file is different. The required file format is consistent with the output "seir" file from the compartmental model, so the user could take output from one simulation and use it as input to another simulation with the same model structure.

For example, for an input configuration file containing

with the accompanying geodata file

where initial_conditions_from_previous.csv contains

The simulation would be initiated on 2021-06-01 with these values in each compartment (no children vaccinated, only adults in the small province vaccinated, and some past and current infection in both subpopulations).

initial_conditions::initial_conditions_file must contain the following columns:

  • mc_value_type – in model output files, this is either prevalence or incidence. Only prevalence values are selected for use as initial conditions, since compartmental models describe the prevalence (number of individuals at any given time) in each compartment. Prevalence is taken to be the value measured instantaneously at the start of the day.

  • mc_name – the name of the compartment for which the value is reported, which is a concatenation of the compartment status in each state type, e.g., "S_adult_unvaxxed". The order must be the same as the order in which these groups are defined in the config for the model, e.g., you cannot say unvaxxed_S_adult.
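The selection of rows from a FromFile input can be sketched with Python's standard csv module. This is an illustration of the filtering described here (prevalence rows only, matching the simulation start date), not flepiMoP's reader. The function name is hypothetical, and the incidence row and the 2021-06-02 row in the sample text are made-up values included only to show that they are ignored.

```python
import csv
import io

# Small sample in the style of an output "seir" file. The incidence row and
# the 2021-06-02 row are invented for illustration.
SEIR_CSV = """mc_value_type,mc_name,small_province,large_province,date
prevalence,S_child_unvaxxed,400,900,2021-06-01
incidence,I_child_unvaxxed,2,30,2021-06-01
prevalence,I_child_unvaxxed,5,100,2021-06-01
prevalence,S_child_unvaxxed,390,850,2021-06-02
"""


def initial_conditions_from_seir(text, start_date, subpops):
    """Keep prevalence rows dated start_date, keyed by subpop and mc_name."""
    result = {subpop: {} for subpop in subpops}
    for row in csv.DictReader(io.StringIO(text)):
        if row["mc_value_type"] != "prevalence" or row["date"] != start_date:
            continue
        for subpop in subpops:
            result[subpop][row["mc_name"]] = float(row[subpop])
    return result


ic = initial_conditions_from_seir(
    SEIR_CSV, "2021-06-01", ["small_province", "large_province"]
)
print(ic["small_province"])
```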

SetInitialConditionsFolderDraw, FromFileFolderDraw

The way that initial conditions are specified with SetInitialConditions and FromFile results in a single value for each compartment and does not easily allow the user to instead specify a distribution (as is possible for compartmental or outcome model parameters). If a user wants to use different possible initial condition values each time the model is run, the way to do this is to specify a folder containing a set of files with initial condition values for each simulation that will be run. The user can do this with files in the format described for SetInitialConditions by using method: SetInitialConditionsFolderDraw instead. Similarly, to provide a folder of initial condition files in the format described for FromFile, use method: FromFileFolderDraw instead.

Each file in the folder needs to be named according to the same naming conventions as the model output files: run_number.runID.file_type.[csv or parquet] where ....[DESCRIBE] as it is now taking the place of the seeding files the model would normally output.

Only one additional config argument is needed to use a FolderDraw method for initial conditions:

initial_file_type: either seir or seed

When using FolderDraw methods, initial_conditions_file should now be the path to the directory that contains the folder with all the initial conditions files. For example, if you are using output from another model run, so that the files are in an seir folder within a model_output folder which is within your project directory, you would use initial_conditions_file: model_outpu

Input description

  • initial_conditions: Input is a list of compartment names, location names, and amounts of individuals in that compartment and location. All compartments must be listed unless a setting to default missing compartments to zero is turned on.

  • seeding: Input is a list of seeding events defined by source compartment, destination compartment, number of individuals transitioning, and date of movement. Compartments without seeding events don't need to be listed.

Specifies an incidence or prevalence?

  • initial_conditions: Amounts specified are prevalence values.

  • seeding: Amounts specified are instantaneous incidence values.

Useful for?

  • initial_conditions: Specifying initial conditions, especially if the simulation does not start with a single infection introduced into a naive population.

  • seeding: Modeling importations, evolution of new strains, and specifying initial conditions.

  • amount – the value of the initial condition; either a numeric value or the string "rest". (Column of the SetInitialConditions input file.)

  • subpop_1, subpop_2, etc. – one column for each different subpopulation, containing the number of individuals in the described compartment in that subpopulation at the given date. Note that these columns are named after the subpopulation names defined by the user in the geodata file. (Columns of the FromFile input file.)

  • date – the calendar date in the simulation, in YYYY-MM-DD format. Only values with a date that matches the simulation start_date will be used. (Column of the FromFile input file.)

Config section optional or required?

  • initial_conditions: Optional

  • seeding: Optional

Function of section

  • initial_conditions: Specify the number of individuals in each compartment at time zero

  • seeding: Allow for instantaneous changes in individuals' states

Default

  • initial_conditions: Entire population in first compartment, zero in all other compartments

  • seeding: No seeding events

Requires input file?

  • initial_conditions: Yes, .csv

  • seeding: Yes, .csv

    compartments:
      infection_stage: ["S", "I", "R"]
      age_group: ["child", "adult"]
      vaccination_status: ["unvaxxed", "vaxxed"]

    initial_conditions:
      method: Default

    subpop,          population
    large_province, 10000
    small_province, 1000

    compartments:
      infection_stage: ["S", "E", "I", "R"]
      age_group: ["child", "adult"]
      vaccination_status: ["unvaxxed", "vaxxed"]

    initial_conditions:
      method: SetInitialConditions
      initial_conditions_file: initial_conditions.csv
      allow_missing_subpops: TRUE
      allow_missing_compartments: TRUE

    subpop,          population
    large_province, 10000
    small_province, 1000

    subpop, mc_name, amount
    small_province, S_child_unvaxxed, 500
    small_province, S_adult_unvaxxed, 500
    large_province, S_child_unvaxxed, 5000
    large_province, E_adult_unvaxxed, 5
    large_province, S_adult_unvaxxed, "rest"

    name: test_simulation
    start_date: 2021-06-01

    compartments:
      infection_stage: ["S", "I", "R"]
      age_group: ["child", "adult"]
      vaccination_status: ["unvaxxed", "vaxxed"]

    initial_conditions:
      method: FromFile
      initial_conditions_file: initial_conditions_from_previous.csv
      allow_missing_compartments: FALSE
      allow_missing_subpops: FALSE

    subpop,          population
    large_province, 10000
    small_province, 1000

    mc_value_type, mc_infection_stage, mc_age, mc_vaccination_status, mc_name, small_province, large_province, date
    ....
    prevalence, S, child, unvaxxed, S_child_unvaxxed, 400, 900, 2021-06-01
    prevalence, S, child, vaxxed, S_child_vaxxed, 0, 0, 2021-06-01
    prevalence, I, child, unvaxxed, I_child_unvaxxed, 5, 100, 2021-06-01
    prevalence, I, child, vaxxed, I_child_vaxxed, 0, 0, 2021-06-01
    prevalence, R, child, unvaxxed, R_child_unvaxxed, 95, 4000, 2021-06-01
    prevalence, R, child, vaxxed, R_child_vaxxed, 0, 0, 2021-06-01
    prevalence, S, adult, unvaxxed, S_adult_unvaxxed, 50, 900, 2021-06-01
    prevalence, S, adult, vaxxed, S_adult_vaxxed, 400, 0, 2021-06-01
    prevalence, I, adult, unvaxxed, I_adult_unvaxxed, 4, 100, 2021-06-01
    prevalence, I, adult, vaxxed, I_adult_vaxxed, 1, 0, 2021-06-01
    prevalence, R, adult, unvaxxed, R_adult_unvaxxed, 75, 4000, 2021-06-01
    prevalence, R, adult, vaxxed, R_adult_vaxxed, 20, 0, 2021-06-01
    ...

    Specifying population structure

    This page describes how users specify the names, sizes, and connectivities of the different subpopulations comprising the total population to be modeled.

    Overview

    The subpop_setup section of the configuration file is where users can input the information required to define a population structure on which to simulate the model. The options allow the user to determine the population size of each subpopulation that makes up the overall population, and to specify the amount of mixing that occurs between each pair of subpopulations.

    An example configuration file with the global header and the subpop_setup section is below:

    Items and options

    Config Item
    Required?
    Type/Format
    Description

    geodata file and selected option

    • geodata is a .csv file with column headers and at least two columns: subpop and population.

    • selected, if provided, is the subset of locations in the geodata file (as determined by the subpop column) to be modeled. Requesting subpopulations that are not present will lead to an error.
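The effect of the selected option can be sketched as a simple filter over the geodata entries. This is only an illustration under assumed names (select_subpops, and geodata held as a dictionary from subpop to population); it is not how flepiMoP stores or filters its data internally.

```python
def select_subpops(geodata, selected=None):
    """Return the subset of geodata named in `selected` (sketch).

    `selected` may be a single name or a list of names, matching the
    "string or list of strings" format of the config item.
    """
    if selected is None:
        return dict(geodata)
    if isinstance(selected, str):
        selected = [selected]
    missing = [name for name in selected if name not in geodata]
    if missing:
        raise ValueError(f"selected subpopulations not in geodata: {missing}")
    return {name: geodata[name] for name in selected}


geodata = {"large_province": 10000, "small_province": 1000}
print(select_subpops(geodata, "small_province"))
```

Requesting a name that is absent from the subpop column, such as a hypothetical "medium_province", raises an error instead of being skipped.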

    Example geodata file format

    mobility file

    The mobility file is a .csv file (its name must have the .csv extension) with long-form comma-separated values. Columns have to be named ori, dest, and amount, with amount being the average number of individuals moving from the origin subpopulation ori to the destination subpopulation dest on any given day. Details on the mathematics of this model of contact are explained in the Model Description section. Unassigned relations are assumed to be zero. The location entries in the ori and dest columns should correspond to entries in the subpop column of geodata.csv. When using selected, the mobility data will also be filtered.
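To make the long-form layout concrete, the sketch below expands (ori, dest, amount) rows into a K x K matrix, with unassigned pairs left at zero. The function name mobility_matrix is hypothetical, and placing origins on rows and destinations on columns is an assumption of this example rather than a statement about flepiMoP's internal convention.

```python
def mobility_matrix(subpops, rows):
    """Build a K x K matrix from long-form (ori, dest, amount) rows (sketch)."""
    index = {name: i for i, name in enumerate(subpops)}
    # Unassigned relations are zero.
    matrix = [[0.0] * len(subpops) for _ in subpops]
    for ori, dest, amount in rows:
        if ori not in index or dest not in index:
            raise ValueError(f"unknown subpopulation in row: {ori}, {dest}")
        matrix[index[ori]][index[dest]] = float(amount)
    return matrix


subpops = ["large_province", "small_province"]
rows = [
    ("large_province", "small_province", 100),
    ("small_province", "large_province", 50),
]
matrix = mobility_matrix(subpops, rows)
print(matrix)
```

The two rows used here are the daily flows of 100 and 50 travelers from Example 1 on this page.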

    Example mobility file format

    It is also possible, but not recommended, to specify the mobility file as a .txt with space-separated values in the shape of a matrix. This matrix is square, of size K x K, with K being the number of rows in geodata. The above example corresponds to

    Examples

    Example 1

    Consider a simple population structure with two subpopulations: a large province with 10,000 individuals and a small province with only 1,000 individuals. Every day, 100 residents of the large province travel to the small province and interact with residents there, and 50 residents of the small province visit the large province.

    geodata.csv contains the population structure (with columns subpop and population)

    mobility.csv contains

    Other configuration options

    Command line inputs

    flepiMoP allows some input parameters/options to be specified in the command line at the time of model submission, in addition to or instead of in the configuration file. This can be helpful for users who want to quickly run different versions of the model – typically a different number of simulations or a different intervention scenario from among all those specified in the config – without having to edit or create a new configuration file every time. In addition, some arguments can only be specified via the command line.

    In addition to the configuration file and the command line, the inputs described below can also be specified as environmental variables.

    In all cases, command line arguments override configuration file entries which override environmental variables. The order of command line arguments does not matter.
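The precedence stated above can be sketched as a small lookup function. This only illustrates the ordering (command line, then configuration file, then environmental variable); the function name resolve_option and the way inputs are passed as dictionaries are assumptions, not flepiMoP's argument-handling code.

```python
def resolve_option(name, cli_args, config, env, env_name, default=None):
    """Pick a value using the order: command line > config > environment."""
    if cli_args.get(name) is not None:
        return cli_args[name]
    if name in config:
        return config[name]
    if env_name in env:
        return env[env_name]  # environment values arrive as strings
    return default


config = {"nslots": 100}
env = {"FLEPI_NUM_SLOTS": "10"}

# No command line value: the config entry wins over the environment.
from_config = resolve_option("nslots", {}, config, env, "FLEPI_NUM_SLOTS")
# A command line value overrides both.
from_cli = resolve_option("nslots", {"nslots": 25}, config, env, "FLEPI_NUM_SLOTS")
print(from_config, from_cli)
```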

    Details on how to run the model, including how to add command line arguments or environmental variables, are in the section .

    Command-line only inputs

    Argument
    Env. Variable
    Value type
    Description
    Required?
    Default

    Command-line versions of configuration file inputs

    Argument
    Config item
    Env. Variable
    Value type
    Description
    Required?
    Default

    Example

    As an example, consider running the following configuration file

    To run this model directly in Python (it can alternatively be run from R, for all details see section ), we could use the command line entry

    Alternatively, to run 100 simulations using only 4 of the available processors on our computer, but only running the "" scenario with a deterministic model, and to save the files as .csv (since the model is relatively simple), we could call the model using the command line entry

    Environmental variables

    TBA

    US-specific configuration file options


    Things below here are very out of date. Put here as place holder but not updated recently.

    global: smh_round, setup_name, disease

    spatial_setup: census_year, modeled_states, state_level

    For US-specific population structures

    To create US-based population structures using the helper script build_US_setup.R, which is run before the main model simulation script, the following extra parameters can be specified:

    Config Item
    Required?
    Type/Format
    Description

    Example 2

    To simulate an epidemic across all 50 states of the US or a subset of them, users can take advantage of built in machinery to create geodata and mobility files for the US based on the population size and number of daily commuting trips reported in the US Census.

    Before running the simulation, the script build_US_setup.R can be run to get the required population data files from online census data and filter out only states/territories of interest for the model. More details are provided in the How to Run section.

    This example simulates COVID-19 in the New England states, assuming no transmission from other states, using 2019 census data for the population sizes and a pre-created file for estimated interstate commutes during the 2011-2015 period.

    geodata.csv contains

    mobility_2011-2015_statelevel.csv contains

    importation section (optional)

    This section is optional. It is used by the to import global air importation data for seeding infections into the United States.

    If you wish to include it, here are the options.

    Config Item
    Required?
    Type/Format
    Description

    importation::param_list

    Config Item
    Required?
    Type/Format
    Description

    report section

    The report section is completely optional and provides settings for making an R Markdown report. For an example of a report, see the Supplementary Material of our

    If you wish to include it, here are the options.

    Config Item
    Required?
    Type/Format
    Description

    flepiMoP's configuration file

    About configuration files

    flepiMoP is set up so that all parameters and other options for running the pipeline can be specified in a single "configuration" file (aka "config"). Users do not need to edit any other code files, or even be aware of their contents, to create and run complex model scenarios. Configuration files also provide a convenient record of model options and promote reproducibility of model results.

    We use the YAML language syntax to write config files, which are typically named something like config.yml. The file has simple plain text contents and follows a tabbed outline structure. When config files are read by the model code, a data structure encoding the model options is created.

    Comments can be added to the config file by starting with the hash key (#) then a space. Comments can start anywhere on a line and continue until the end, but if they run over to a new line, a new # must be used at the start of the new line.

    Example

    (give a simple configuration for a toy model with two subpopulations, SEIR, single "cases" outcome, single seeded infection, single NPI that starts after some time? This page is currently under development; please see our example repo for some simple configurations.)

    When referring to config items (individual parameters), we use their full position in the outline. For example, in the sample config file above, we denote

    as subpop_setup::geodata having a value of minimal

    Notation

    Parameters and other options specified in the configuration files can take on a variety of types of values, using the following notations:

    • dates are specified as [year]-[month]-[day] (e.g., 2020-01-31)

    • boolean values are either "TRUE" or "FALSE"

    • file names are strings

    Configuration files sections

    Global header


    Required section

    These global configuration options typically sit at the top of the configuration file.

    Item
    Required?
    Type/Format
    Description

    For example, for a configuration file to simulate the spread of COVID-19 in the US during 2020 and compare to data from March 1 onwards, with 1000 independent simulations, the header of the config might read:

    subpop_setup section


    Required section

    This section specifies the population structure on which the model will be simulated, including the names and sizes of each subpopulation and the connectivity between them. More details are on the page "Specifying population structure".

    compartments section


    Required section

    This section is where users can specify the variables (infection states) that will be tracked in the infectious disease transmission model. More details can be found . The other details of the model are specified in the seir section, including transitions between these compartments (seir::transitions), the names of the parameters governing the transitions (seir::parameters), and the numerical method used to simulate the equations over time (seir::integration). The initial conditions of the model can be specified in the initial_conditions section, and any other inputs into the model from external populations or instantaneous transitions between states that occur at later times can be specified in the seeding section. ;

    seir section


    Required section

    This section is where users can specify the details of the infectious disease transmission model they wish to simulate (e.g., SEIR). This model describes the allowed transitions (seir::transitions) between the compartments that were specified in the compartments section, the values of the parameters involved in these transitions (seir::parameters), and the numerical method used to simulate the equations over time (seir::integration). More details . The initial conditions of the model can be specified in the separate initial_conditions section, and any other inputs into the model from external populations or instantaneous transitions between states that occur at later times can be specified in the seeding section. ;

    initial_conditions section


    Optional section

    This section is used to specify the initial conditions of the model, which define how individuals are distributed between the model compartments at the time the model simulation begins. Importantly, the initial conditions specify the time and location where infection is first introduced. If this section is omitted, default values are used. If users want to add infections to the population at later times, or add or remove individuals from compartments separately from the model rules, they can do so via the related seeding section. More details are on the page "Specifying initial conditions".

    seeding section


    Optional section

    This section is used to specify how individuals are instantaneously "seeded" from one compartment to another, where they then continue to be governed by the model equations. For example, this seeding could be used to represent importations of infected individuals from an outside population, mutation events that create new strains, or vaccinations that alter disease susceptibility. Seeding events can occur at any time in the simulation. The seeding section specifies the numeric values added to or removed from any compartment of the model. More details ;

    outcomes section


    Optional section

    This section is where users can define new variables representing the observed quantities and how they are related to the underlying state variables in the model (e.g., the fraction of infections that are detected as cases). More details ;

    interventions section


    Required section

    This section is where users can specify time-varying changes to parameters governing either the infectious disease model or the observational model. More details ;

    inference section


    Optional section

    This section is where users can specify the details of how the model is fit to data, including which data streams will be included, which outcome variables they represent, and the likelihood functions describing the probability of the data given the model.

    name: test_simulation
    model_output_dirname: model_output
    start_date: 2020-01-01
    end_date: 2020-12-31
    nslots: 100
    
    subpop_setup:
      geodata: model_input/geodata.csv
      mobility: model_input/mobility.csv
    , the
    mobility
    data will also be filtered.

    • geodata (required; path to file): path to geodata file

    • mobility (required; path to file): path to mobility file

    • selected (optional; string or list of strings): name of selected location in geodata

    • probability is a float between 0 and 1

    • distribution is a probability distribution from which a random value for the parameter is drawn each time a new simulation is run (or chain, if doing inference). See here for the required schema.

    • start_date_groundtruth (optional for non-inference runs, required for inference runs; date): start date for comparing model to data

    • end_date_groundtruth (optional for non-inference runs, required for inference runs; date): end date for comparing model to data

    • nslots (optional, can also be defined by an environmental variable; int): number of independent simulations to run

    • setup_name (optional; string): setup name used to describe the run, used in setting up file names

    • model_output_dirname (optional; folder path): path to folder where all the outputs created by the model are stored; if not specified, the default is model_output

    • name (required; string): name of this configuration, used in file names created to store model output

    • start_date (required; date): model simulation start date

    • end_date (required; date): model simulation end date

    subpop,population
    10001,1000
    20002,2000

    ori, dest, amount
    10001, 20002, 3
    20002, 10001, 3

    0 3
    3 0

    subpop_setup:
      geodata: model_input/geodata.csv
      mobility: model_input/mobility.csv

    subpop,          population
    large_province, 10000
    small_province, 1000

    ori,            dest,           amount
    large_province, small_province, 100
    small_province, large_province, 50

    subpop_setup:
      ...
      geodata: minimal

    name: USA_covid19_2020
    model_output_dirname: model_output
    start_date: 2020-01-01
    end_date: 2020-12-31
    start_date_groundtruth: 2020-03-01
    end_date_groundtruth: 2020-12-31
    nslots: 1000

    No

    1

    • -j or --jobs (env. variable: FLEPI_NJOBS; value type: integer ≥ 1; required: No; default: number of processors on the computer used to run the simulation): Number of parallel processors used to run the simulation. If there are more slots than jobs, slots will be divided up between processors and run in series on each.

    • --interactive or --batch (env. variable: NA; value type: choose either option; required: No; default: batch): Run simulation in interactive or batch mode.

    • --write-csv or --no-write-csv (env. variable: NA; value type: choose either option; required: No; default: no_write_csv): Whether model output will be saved as .csv files.

    • --write-parquet or --no-write-parquet (env. variable: NA; value type: choose either option; required: No; default: write_parquet): Whether model output will be saved as .parquet files (a compressed representation that can be opened and manipulated with minimal memory, and which may be required for large simulations).

    FLEPI_NUM_SLOTS

    integer ≥ 1

    Number of independent simulations of the model to be run

    No

    Config value

    • --method or -m (config item: seir: integration: method; value type: `rk4`, `euler`, or `stochastic`; required: No; default: config value if present, otherwise `rk4`): If provided, will override the `seir::integration::method` (including the default, if unspecified in the configuration file).

    • --in-id (env. variable: FLEPI_RUN_INDEX; value type: string; required: No; default: constructed from current date and time as YYYY.MM.DD.HH/MM/SS): Unique ID given to the model runs. If the same config is run multiple times, you can avoid the output being overwritten by using unique model run IDs.

    • --out-id (env. variable: FLEPI_RUN_INDEX; value type: string; required: No; default: constructed from current date and time as YYYY.MM.DD.HH/MM/SS): Unique ID given to the model runs. If the same config is run multiple times, you can avoid the output being overwritten by using unique model run IDs.

    • dest_type (required; categorical): location type

    • dest_country (required; string (Country)): ISO3 code for country of importation. Currently only USA is supported

    • aggregate_to (required; categorical): location type to aggregate to

    • cache_work (required; boolean): whether to save case data

    • update_case_data (required; boolean): deprecated; whether to update the case data or use saved data

    • draw_travel_from_distribution (required; boolean): whether to add additional stochasticity to travel data; default is FALSE

    • print_progress (required; boolean): whether to print progress of importation model simulations

    • travelers_threshold (required; integer): include airports with at least the travelers_threshold mean daily number of travelers

    • airport_cluster_distance (required; numeric): cluster airports within airport_cluster_distance km

    • param_list (required; see section below): see below

    • inf_period_nohosp_sd (required; numeric): infectious period, non-hospitalized, sd

    • inf_period_hosp_mean_log (required; numeric): infectious period, hospitalized, log-normal mean

    • inf_period_hosp_sd_log (required; numeric): infectious period, hospitalized, log-normal sd

    • p_report_source (required; numeric): reporting probability, Hubei and elsewhere

    • shift_incid_days (required; numeric): mean delay from infection to reporting of cases; default = -10

    • delta (required; numeric): days per estimation period

    formatting::scenario_labels

    list of strings; one for each scenario in interventions::scenarios

    formatting::scenario_colors

    list of strings; one for each scenario in interventions::scenarios

    formatting::pdeath_labels

    list of strings

    formatting::display_dates

    list of dates

    formatting::display_dates2

    optional

    list of dates

    a 2nd string of display dates that can optionally be supplied to specific report functions

    -c or --config

    CONFIG_PATH

    file path

    Name of configuration file. Must be located in the current working directory, or else relative or absolute file path must be provided.

    Yes

    NA

    -i or --first_sim_index

    FIRST_SIM_INDEX

    integer ≥ 1

    -s or --npi_scenario

    interventions: scenarios

    FLEPI_NPI_SCENARIOS

    list of strings

    Names of the intervention scenarios described in the config file that will be run. Must be a subset of scenarios defined.

    No

    All scenarios described in config

    -n or --nslots

    census_year

    optional

    integer (year)

    Determines the year for which census population size data is pulled.

    state_level

    optional

    boolean

    Determines whether county-level population-size data is instead grouped into state-level data (TRUE). Default FALSE

    modeled_states

    optional

    list of location codes

    census_api_key

    required

    string

    get an API key

    travel_dispersion

    required

    number

    how dispersed daily travel data is; default = 3.

    maximum_destinations

    required

    integer

    incub_mean_log

    required

    numeric

    incubation period, log mean

    incub_sd_log

    required

    numeric

    incubation period, log standard deviation

    inf_period_nohosp_mean

    required

    numeric

    data_settings::pop_year

    integer

    plot_settings::plot_intervention

    boolean

    formatting::scenario_labels_short

    list of strings; one for each scenario in interventions::scenarios

    How to Run
    covidImportation package
    preprint

    The index of the first simulation

    nslots

    A vector of locations that will be modeled; others will be ignored

    number of airports to limit importation to

    infectious period, non-hospitalized, mean

    Specifying seeding

    This section describes how to specify the values of each model state at the time the simulation starts, and how to make instantaneous changes to state values at other times (e.g., due to importations)

    Overview

    flepiMoP allows users to specify instantaneous changes in values of model variables, at any time during the simulation. We call this "seeding". For example, some individuals in the population may travel or otherwise acquire infection from outside the population throughout the epidemic, and this importation of infection could be specified with the seeding option. As another example, new genetic variants of the pathogen may arise due to mutation and selection that occurs within infected individuals, and this generation of new strains can also be modeled with seeding. Seeding allows individuals to change state at specified times in ways that do not depend on the model equations. In the first example, the individuals would be "seeded" into the infected compartment from the susceptible compartment, and in the second example, individuals would be seeded into the "infected with new variant" compartment from the "infected with wild type" compartment.

    The seeding option can also be used as a convenient alternative way to specify initial conditions. By default, flepiMoP initiates models by putting the entire population size (specified in the geodata file) in the first model compartment. If the desired initial condition is only slightly different from the default state, it may be more convenient to specify it with a few "seedings" that occur on the first day of the simulation. For example, for a simple SIR model where the desired initial condition is just a small number of infected individuals, this could be specified by a single seeding into the infected compartment from the susceptible compartment at time zero, instead of specifying the initial values of three separate compartments. For larger models, the difference becomes more relevant.

    Specifying model seeding

    The configuration items in the seeding section of the config file are:

    seeding::method Must be either "NoSeeding", "FromFile", "PoissonDistributed", "NegativeBinomialDistributed", or "FolderDraw".

    seeding::seeding_file Only required for method "FromFile". Path to a .csv file containing the list of seeding events.

    seeding::lambda_file Only required for methods "PoissonDistributed" or "NegativeBinomialDistributed". Path to a .csv file containing the list of events from which the actual seeding will be randomly drawn.

    seeding::seeding_file_type Only required for method "FolderDraw". Either seir or seed.

    Details on implementing each seeding method and the options that go along with it are below.

    seeding::method

    NoSeeding

    If there is no seeding, then the number of individuals in each compartment will be initiated using the values specified in the initial_conditions section and will only be changed at later times based on the equations defined in the seir section. No other arguments are needed in the seeding section in this case.

    Example

    FromFile

    This seeding method reads in a user-defined file with a list of seeding events (instantaneous transitions of individuals between compartments) including the time of the event and subpopulation where it occurs, and the source and destination compartment of the individuals. For example, for the simple two-subpopulation SIR model where the outbreak starts with 5 individuals in the small province being infected from a source outside the population, the seeding section of the config could be specified as

    Where seeding.csv contains

    seeding::seeding_file must contain the following columns:

    • subpop – the name of the subpopulation in which the seeding event takes place. Seeding cannot move individuals between different subpopulations.

    • date – the date the seeding event occurs, in YYYY-MM-DD format

    • amount
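As a rough illustration of what a seeding event does, the sketch below applies the events in a seeding file to a dictionary of compartment counts. It is not flepiMoP's implementation. The subpop, date, and amount columns are named in the text; the `source` and `destination` column names, the population sizes, and the rule that a move is capped at the size of the source compartment are assumptions made for this example.

```python
import csv
import io

# Hypothetical seeding file contents. The subpop, date, and amount columns are
# named in the text; the source and destination column names are assumptions.
SEEDING_CSV = """subpop,date,amount,source,destination
small_province,2020-02-01,5,S,I
"""


def apply_seeding(state, events, date):
    """Apply every seeding event scheduled for `date`.

    `state` maps subpop -> {compartment: count}. Each event moves `amount`
    individuals between two compartments of the same subpopulation.
    """
    for event in events:
        if event["date"] != date:
            continue
        compartments = state[event["subpop"]]
        # Assumption: never move more individuals than the source holds.
        moved = min(int(event["amount"]), compartments[event["source"]])
        compartments[event["source"]] -= moved
        compartments[event["destination"]] += moved
    return state


events = list(csv.DictReader(io.StringIO(SEEDING_CSV)))
state = {
    "small_province": {"S": 1000, "I": 0, "R": 0},   # hypothetical sizes
    "large_province": {"S": 10000, "I": 0, "R": 0},
}
apply_seeding(state, events, "2020-02-01")
print(state["small_province"])
```

Only the subpopulation named in the event changes; the other subpopulation keeps its initial state, consistent with the rule that seeding cannot move individuals between subpopulations.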

    PoissonDistributed or NegativeBinomialDistributed

    These methods are very similar to FromFile, except that the seeding value used in the simulation is randomly drawn from a distribution whose mean equals the value specified in the file. These methods can be useful when the true seeded value is unknown and only an observed value, assumed to be measured with some uncertainty, is available. The input requirements are the same for both distributions

    or

    and the lambda_file has the same format requirements as the seeding_file for the FromFile method described above.

    For method::PoissonDistributed, the seeding value for each seeding event is drawn from a Poisson distribution with mean and variance equal to the value in the amount column. For method::NegativeBinomialDistributed, seeding is drawn from a negative binomial distribution with mean amount and variance amount+5 (so it is close to "PoissonDistributed" for large values of amount but has relatively higher variance for small values).
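The two distributions can be compared with a small moment-matching sketch. It takes the stated mean (amount) and variance (amount+5) at face value and builds a negative binomial with those moments as a gamma-Poisson mixture. How flepiMoP parameterizes the draw internally is not shown in this section, so the function names and the construction below are assumptions for illustration only, and the example is limited to positive amounts.

```python
import math
import random

EXTRA_VARIANCE = 5  # the text states variance = amount + 5


def negative_binomial_params(amount):
    """Return (r, p) for a negative binomial with mean `amount` and
    variance `amount + 5`. Requires amount > 0."""
    r = amount ** 2 / EXTRA_VARIANCE
    p = r / (r + amount)
    return r, p


def draw_poisson(rng, mean):
    """Knuth's multiplication method; adequate for the small means used here."""
    threshold = math.exp(-mean)
    count, running_product = 0, rng.random()
    while running_product > threshold:
        count += 1
        running_product *= rng.random()
    return count


def draw_negative_binomial(rng, amount):
    """Gamma-Poisson mixture with mean `amount` and variance `amount + 5`."""
    if amount <= 0:
        return 0
    r, _ = negative_binomial_params(amount)
    rate = rng.gammavariate(r, amount / r)  # shape r, scale amount / r
    return draw_poisson(rng, rate)


rng = random.Random(1)
amount = 5
r, p = negative_binomial_params(amount)
nb_mean = r * (1 - p) / p            # analytic mean of the negative binomial
nb_variance = r * (1 - p) / p ** 2   # analytic variance
poisson_draws = [draw_poisson(rng, amount) for _ in range(5)]
nb_draws = [draw_negative_binomial(rng, amount) for _ in range(5)]
print(nb_mean, nb_variance)
```

For amount = 5 the Poisson variance is 5 while the negative binomial variance is 10, which is the "higher variance for small values" behavior described above; for amount = 500 the extra 5 is negligible.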

    FolderDraw

    TBA

    Specifying observational model

    This page describes how to specify the outcomes section of the configuration file

    Thinking about outcomes variables

    Our pipeline allows users to encode state variables describing the infection status of individuals in the population in two different ways. The first way is via the state variables and transitions of the compartmental model of disease transmission, which are specified in the compartments and seir sections of the config. This model should include all variables that influence the natural course of the epidemic (i.e., all variables that feed back into the model by influencing the rate of change of other variables). For example, the number of infected individuals influences the rate at which new infections occur, and the number of immune individuals influences the number of individuals at risk of acquiring infection.

    However, these intrinsic model variables may be difficult to observe in the real world and so directly comparing model predictions about the values of these variables to data might not make sense. Instead, the observable outcomes of infection may include only a subset of individuals in any state, and may only be observed with a time delay. Thus, we allow users to define new outcome variables that are functions of the underlying model variables. Commonly used examples include detected cases or hospitalizations.

    Variables should not be included as outcomes if they influence the infection trajectory. The choice of what variables to include in the compartmental disease model vs. the outcomes section may be very model specific. For example, hospitalizations due to infection could be encoded as an outcome variable that is some fraction of infections, but if we believe hospitalized individuals are isolated from the population and don't contribute to onward infection, or that the number of hospitalizations feeds back into the population's perception of risk of infection and influences everyone's contact behavior, this would not be the best choice. Similarly, we could include deaths due to infection as an outcome variable that is also some fraction of infections, but unless death is a very rare outcome of infection and we aren't worried about actually removing deceased individuals from the modeled populations, deaths should be in the compartmental model instead.

    The outcomes section is not required in the config. However, there are benefits to including it, even if the only outcome variable is set to be equivalent to one of the infection model variables. If the compartmental model is complicated but you only want to visualize a few output variables, the outcomes output will be much easier to work with. Outcome variables always occur with some fixed delay from their source infection model variable, which can be more convenient than the exponential distribution underlying the infection model. Outcome variables can be created to automatically sum over multiple compartments of the infection model, removing the need for post-processing code to do this. If the model is being fit to data, then the outcomes section is required, as only outcome variables can be compared to data.

    As an example, imagine we are simulating an SIR-style model and want to compare it to real epidemic data in which cases of infection and hospitalizations from infection are reported. Our model doesn't explicitly include hospitalization, but suppose we know that 1% of all infections eventually lead to hospitalization, and that hospitalization occurs on average 1 week after infection. We know that not all infections are reported as cases, and assume that only 50% are detected and are reported 2 days after infection begins. The model and outcomes sections of the config for these outcomes, which we call incidC (daily incidence of cases) and incidH (daily incidence of hospital admission), would be

    In the following sections we describe in more detail how this specification works.

    Specifying outcomes in the configuration file

    The outcomes config section consists of a list of defined outcome variables (observables), which are defined by a user-created name (e.g., "incidH"). For each of these outcome variables, the user defines the source compartment(s) in the infectious disease model that they draw from and whether they draw from the incidence (new individuals entering into that compartment) or prevalence (total current individuals in that compartment). Each new outcome variable is always associated with two mandatory parameters:

    • probability of being counted in this outcome variable if in the source compartment

    • delay between when an individual enters the source compartment and when they are counted in the outcome variable

    and one optional parameter

    • duration after entering that an individual is counted as part of the outcome variable

    The value of the probability, delay, and duration parameters can be a single value or come from a distribution.
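To make the roles of probability and delay concrete, here is a minimal deterministic sketch using the numbers from the SIR example above (50% of infections detected after 2 days, 1% hospitalized after 7 days). The daily infection counts are made up, the function name is hypothetical, and the sketch uses expected values rather than random draws, so it illustrates the idea and not the delayframe method itself.

```python
def outcome_incidence(source_incidence, probability, delay):
    """Expected daily outcome counts: a fraction `probability` of each day's
    source incidence, counted `delay` days later."""
    n_days = len(source_incidence)
    outcome = [0.0] * n_days
    for day, new_individuals in enumerate(source_incidence):
        counted_on = day + delay
        if counted_on < n_days:  # events after the last simulated day are dropped
            outcome[counted_on] += probability * new_individuals
    return outcome


# Hypothetical daily incidence of new infections over ten days.
new_infections = [100, 200, 400, 800, 400, 200, 100, 0, 0, 0]

incidC = outcome_incidence(new_infections, probability=0.5, delay=2)
incidH = outcome_incidence(new_infections, probability=0.01, delay=7)
print(incidC)
print(incidH)
```

The case curve has the same shape as the infection curve, scaled by one half and shifted two days later; the hospitalization curve only begins on day 7.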

    Outcome model parameters probability, delay, and duration can have an additional attribute beyond value called modifier_key. This value is explained in the section on modifiers, as it provides a way to have the same modifier act on multiple different outcomes.

    When outcome parameters are drawn from a distribution, each time the model is run a different value for the parameter is drawn from this distribution, and that value is used for all calculations within the model run. Understanding when a new parameter value is drawn becomes more complicated when the model is run in Inference mode. In Inference mode, we distinguish model runs as occurring in different "slots" (completely independent model instances that could be run on different processing cores in a parallel computing environment) and different "iterations" of the model that occur sequentially when the model is being fit to data and update fitted parameters each time based on the fit quality found in the previous iteration. A new parameter value is only drawn from the above distribution once per slot. Within a slot, at each iteration during an inference run, the parameter is only changed if it is being fit and the inference algorithm decides to perturb it to test a possible improved fit. Otherwise, it maintains the same value no matter how many times the model is run within a slot.

    Example


    Config item
    Required?
    Type/format
    Description

    source

    Required, unless the sum option is used instead. This sub-section describes the compartment(s) in the infectious disease model from which this outcome variable is drawn. Outcome variables can be drawn from the incidence of a variable, meaning that some fraction of new individuals entering the infection model state each day are chosen to contribute to the outcome variable, or from the prevalence, meaning that each day some fraction of individuals currently in the infection state are chosen to contribute to the outcome variable. Note that whatever the source type, the named outcome variable itself is always a measure of incidence.

    To specify which compartment(s) contribute, the user must specify the state(s) within each model stratification. For stratifications not mentioned, the outcome will sum over that state in all strata.
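The "sum over strata not mentioned" rule can be sketched as a filter over compartments. The example assumes a hypothetical model stratified by infection stage, vaccination status, and age group; the stratum names, labels, and counts are invented for illustration and do not come from a flepiMoP config.

```python
# Hypothetical stratification names and counts for illustration.
STRATA = ("infection_stage", "vaccination_stage", "age_group")
counts = {
    ("I", "unvaccinated", "child"): 10,
    ("I", "vaccinated", "child"): 3,
    ("I", "unvaccinated", "adult"): 20,
    ("I", "vaccinated", "adult"): 8,
    ("S", "unvaccinated", "child"): 500,
}


def source_total(counts, **selection):
    """Sum every compartment that matches the strata named in `selection`.

    Strata that are not named are not filtered on, so the total runs over
    all of their values.
    """
    total = 0
    for compartment, count in counts.items():
        labels = dict(zip(STRATA, compartment))
        if all(labels[name] == value for name, value in selection.items()):
            total += count
    return total


# Vaccination status is not mentioned, so both vaccination strata are summed.
infected_children = source_total(counts, infection_stage="I", age_group="child")
infected_adults = source_total(counts, infection_stage="I", age_group="adult")
print(infected_children, infected_adults)
```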

    For example, consider a configuration in which the compartmental model was constructed to track infection status stratified by vaccination status and age group. The following code would be used to create an outcome called incidH_child (incidence of hospitalization for children) and incidH_adult (incidence of hospitalization for adults) where some fraction of infected individuals would become hospitalized and we wanted to track pediatric and adult hospitalizations separately, but did not care about tracking the vaccination status of hospitalized individuals, as in reality it was not tracked by the hospitals:

    To instead create an outcome variable for cases where on each day of infection there is some probability of testing positive (for example, for the situation of an asymptomatic infection where testing is administered totally randomly), the following code would be used:

    The source of an outcome variable can also be a previously defined outcome variable. For example, to create a new variable for the number of individuals recruited to be part of a contact tracing program (incidT), which is just some fraction of diagnosed cases:

    probability

    Required, unless the sum option is used instead. Probability is the fraction of individuals in the source compartment who are counted as part of this outcome variable (if the source is incidence; if the source is prevalence it is the fraction of individuals per day). It must be between 0 and 1.

    Specifying the probability creates a parameter called outcome_name::probability that can be referred to in the section of the config that defines modifiers. The value of this parameter can be changed using the probability::intervention_param_name option.

    For example, to track the incidence of hospitalization when 5% of children but only 1% of adults infected require hospitalization, and to create a modifier_key such that both of these rates could be modified by the same amount during some time period using the modifiers section:

    To track the incidence of diagnosed cases iterating over uncertainty in the case detection rate (ranging from 20% to 30%), and naming this parameter "case_detect_rate":

    Each time the model is run, a new random value for the probability of case detection will be chosen.

    Delay

    Required, unless the sum option is used instead. delay is the time delay between when individuals are chosen from the source compartment and when they are counted as part of this outcome variable.

    For example, to track the incidence of hospitalization when 5% of children are hospitalized and hospitalization occurs 7 days after infection:

    To iterate over uncertainty in the exact delay time, we could include some variation between simulations in the delay time using a normal distribution with standard deviation of 2 (truncating to make sure the delay does not become negative). Note that a delay distribution here does not mean that the delay time varies between individuals; within a simulation it is identical for everyone.

    Duration

    By default, all outcome variables describe incidence (new individuals entering each day). However, they can also track an associated "prevalence" if the user specifies how long individuals will stay classified as the outcome state the outcome variable describes. This is the duration parameter.

    When the duration parameter is set, a new outcome variable is automatically created and named with the name of the original outcome variable + "_curr". This name can be changed using the duration::name option.
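The relationship between an incidence outcome and its "_curr" companion can be sketched as a rolling sum. The sketch assumes that each individual is counted for exactly `duration` days starting on the day they enter the outcome; that convention, the function name, and the admission counts are assumptions, not a description of flepiMoP's internals.

```python
def prevalence(incidence, duration):
    """Number of individuals currently in the outcome state on each day,
    assuming each stays for exactly `duration` days from the day counted."""
    current = []
    for day in range(len(incidence)):
        first_day = max(0, day - duration + 1)
        current.append(sum(incidence[first_day:day + 1]))
    return current


incidH_child = [0, 2, 4, 1, 0, 0]  # hypothetical daily admissions
incidH_child_curr = prevalence(incidH_child, duration=3)
print(incidH_child_curr)
```

With a duration of 3 days, the two children admitted on day 1 are still counted on days 2 and 3 and drop out on day 4.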

    For example, to track the incidence and prevalence of hospitalization when 5% of children are hospitalized, hospitalization occurs 7 days after infection, and the duration of hospitalization is 3 days:

    which creates the variable "incidH_child_curr" to track all currently hospitalized children. Since it doesn't make sense to call this new outcome variable an incidence, as it is a prevalence, we could instead rename it:

    Sum

    Optional. sum is used to create new outcome variables that are sums over other previously defined outcome variables.

    If sum is included, source, probability, delay, and duration will be ignored.

    For example, to track new hospital admissions and current hospitalizations separately for children and adults, as well as for all ages combined

    outcomes::settings

    There are other required and optional configuration items for the outcomes section which can be specified under outcomes::settings:

    method: delayframe. This is the mathematical method used to create the outcomes variable values from the transmission model variables. Currently, the only method supported is delayframe, which ...

    param_from_file: Optional, TRUE or FALSE. It is possible to allow any of the outcomes variables to have values that vary across the subpopulations. For example, disease severity rates or diagnosis rates may differ by demographic group. In this case, all the outcome parameter values defined in the outcomes section will represent baseline values, and then you can define a relative change from this baseline for any particular subpopulation using the paths section. If param_from_file: TRUE is specified, then these relative values will be read from the param_subpop_file. Otherwise, if param_from_file: FALSE is specified or the option is not listed at all, all subpopulations will have the same values for the outcome parameters, defined below.

    param_subpop_file: Required if param_from_file: TRUE. The path to a .csv or .parquet file that contains the relative amount by which a given outcome variable is shifted relative to baseline in each subpopulation. File must contain the following columns:

    • subpop: The subpopulation for which the parameter change applies. Must be a subpopulation defined in the geodata file. For example, small_province

    • parameter: The outcomes parameter which will be altered for this subpopulation. For example, incidH_child: probability

    Examples

    Consider a disease described by an SIR model in a population that is divided into two age groups, adults and children, which experience the disease separately. We are interested in comparing the predictions of the model to real world data, but we know we cannot observe every infected individual. Instead, we have two types of outcomes that are observed.

    First, via syndromic surveillance, we have a database that records how many individuals in the population are experiencing symptoms from the disease at any given time. Suppose careful cohort studies have shown that 50% of infected adults and 80% of infected children will develop symptoms, and that symptoms occur in both age groups around 3 days after infection (following a log-normal distribution with log mean X and log standard deviation of Y). The duration that symptoms persist is also a variable, following a ...

    Secondly, via laboratory surveillance we have a database of every positive test result for the infection. We assume the test is 100% sensitive and specific. Only individuals with symptoms are tested, and they are always tested exactly 1 day after their symptom onset. We are unsure what portion of symptomatic individuals are seeking out testing, but are interested in considering two extreme scenarios: 95% of symptomatic individuals are tested, or only 75% of individuals are tested.

    The configuration file we could use to model this situation includes

    Specifying compartmental model

    This section describes how to specify the compartmental model of infectious disease transmission.

    We want to allow users to work with a wide variety of infectious diseases, or one infectious disease under a wide variety of modeling assumptions. To facilitate this, we allow the user to specify their compartmental model of disease dynamics via the configuration file.

    We originally considered asking users to specify each compartment and transition manually. However, we quickly found that this created long, confusing configuration files, and so we created a shorthand to more succinctly specify both compartments and transitions between them. This works especially well for models where individuals are stratified by other properties (like age, vaccination status, etc.) in addition to their infection status.

    The model is specified in two separate sections of the configuration file. In the compartments section, users define the possible states individuals can be categorized into. Then in the seir section, users define the possible transitions between states, the values of parameters that govern the rates of these transitions, and the numerical method used to simulate the model.

    An example section of a configuration file defining a simple SIR model is below.

    Specifying model compartments (compartments)

    The first stage of specifying the model is to define the infection states (variables) that the model will track. These "compartments" are defined first in the compartments section of the config file, before describing the processes that lead to transitions between them. The compartments are defined separately from the rest of the model because they are also used by the seeding section that defines initial conditions and importations.

    For simple disease models, the compartments can simply be listed with whatever notation the user chooses. For example, for a simple SIR model, the compartments could be ["S", "I", "R"]. The config also requires that there be a variable name for the property of the individual that these compartments describe, which for example in this case could be infection_stage

    Our syntax allows for more complex models to be specified without much additional notation. For example, consider a model of a disease that followed SIR dynamics but for which individuals could receive vaccination, which might change how they experience infection.

    In this case we can specify compartments as the cross product of multiple states of interest. For example:

    Corresponds to 6 compartments, which the code internally converts to this data frame

    In order to more easily describe transitions, we want to be able to refer to a compartment by its components, but then use it by its compartment name.
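The cross product described above can be reproduced with a few lines of standard-library Python. The sketch below builds one record per compartment, keeping both the components and a joined name. The stratum name `vaccination_stage` and the underscore naming rule are assumptions made for the example (the text only shows names such as S_unvaccinated).

```python
from itertools import product

# Compartment specification: one list of states per property of the individual.
compartments = {
    "infection_stage": ["S", "I", "R"],
    "vaccination_stage": ["vaccinated", "unvaccinated"],  # hypothetical name
}

names = list(compartments)
rows = [dict(zip(names, combo)) for combo in product(*compartments.values())]
for row in rows:
    # Refer to a compartment by its components, use it by its joined name.
    row["name"] = "_".join(row[n] for n in names)

for row in rows:
    print(row)
```

Three infection stages crossed with two vaccination stages give the six compartments mentioned in the text.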

    If the user wants to specify a model in which some compartments are repeated across states but others are not, there will be pros and cons of how the model is specified. Specifying it using the cross product notation is simpler, less error prone, and makes config files easier to read, and there is no issue with having compartments that have zero individuals in them throughout the model. However, for very large models, extra compartments increase the memory required to conduct the simulation, and so having unnecessary compartments tracked may not be desired.

    For example, consider a model of a disease that follows SI dynamics in two separate age groups (children and adults), but for which only adults receive vaccination, with one or two doses of vaccine. With the simplified notation, this model could be specified as:

    corresponding to 12 compartments, 4 of which are unnecessary to the model

    Or, it could be specified with the less concise notation

    which does not result in any unnecessary compartments being included.
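The compartment counts in this example can be checked directly. The sketch below enumerates the full cross product for the SI model with two age groups and three vaccination states, then removes the combinations that cannot occur because children are never vaccinated. The state labels are illustrative assumptions.

```python
from itertools import product

infection = ["S", "I"]
age = ["child", "adult"]
vaccination = ["unvaccinated", "1dose", "2dose"]  # hypothetical labels

# Cross-product notation tracks every combination.
all_compartments = list(product(infection, age, vaccination))

# Only adults are vaccinated, so vaccinated child compartments stay empty.
needed = [
    c for c in all_compartments
    if not (c[1] == "child" and c[2] != "unvaccinated")
]
unnecessary = [c for c in all_compartments if c not in needed]

print(len(all_compartments), len(needed), len(unnecessary))
```

The trade-off is the one described above: the cross product is shorter to write but carries four compartments that remain at zero, while the explicit listing tracks only the eight that are used.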

    These compartments are referenced in multiple different subsequent sections of the config. In the seeding section the user can specify how the initial (or later imported) infections are distributed across compartments; in the seir section the user can specify the form and rate of the transitions between these compartments encoded by the model; in the outcomes section the user can specify how the observed variables are generated from the underlying model states.

    Notation must be consistent between these sections.

    Specifying compartmental model transitions (seir::transitions)

    The way we specify transitions between compartments in the model is a bit more complicated than how the compartments themselves are specified, but allows users to specify complex stratified infectious disease models with minimal code. This makes checking, sharing, and updating models more efficient and less error-prone.

    We specify one or more transition globs, each of which corresponds to one or more transitions. Since transition globs are shorthand for collections of transitions, we will first explain how to specify a single transition before discussing transition globs.

    A transition has 5 pieces of associated information that a user can specify:

    • source

    • destination

    • rate

    • proportional_to

    • proportion_exponent

    For more details on the mathematical forms possible for transitions in our models, read the .

    We first consider a simple example of an SI model where individuals may either be vaccinated (v) or unvaccinated (u), but the vaccine does not change the susceptibility to infection nor the infectiousness of infected individuals.

    We will focus on describing the first transition of this model, the rate at which unvaccinated individuals move from the susceptible to infected state.

    Specifying a single transition

    Source

    The compartment the transition moves individuals out of (i.e., the source compartment) is an array. For example, to describe a transition that moves unvaccinated susceptible individuals to another state, we would write

    which corresponds to the compartment S_unvaccinated

    Destination

    The compartment the transition moves individuals into (i.e., the destination compartment) is an array. For example, to describe a transition that moves individuals into the unvaccinated but infected state, we would write

    which corresponds to the compartment I_unvaccinated

    Rate

    The rate constant specifies the probability per time that an individual in the source compartment changes state and moves to the destination compartment. For example, to describe a transition that occurs with rate 5/time, we would write:

    instead, we could describe the rate using a parameter beta, which can be given a numeric value later:

    The interpretation and unit of the rate constant depend on the model details, as the rate may potentially also be per number (or proportion) of individuals in other compartments (see below).

    Proportional to

    A vector of groups of compartments (each of which is an array) that modify the overall rate of transition between the source and destination compartment. Each separate group of compartments in the vector is first summed, and then all entries of the vector are multiplied to get the rate modifier. For example, to specify that the transition rate depends on the product of the number of unvaccinated susceptible individuals and the total infected individuals (vaccinated and unvaccinated), we would write:

    To understand this term, consider the compartments written out as strings

    and then sum the terms in each group

    From here, we can say that the transition we are describing is proportional to S_unvaccinated and I_unvaccinated + I_vaccinated, i.e., the rate depends on the product S_unvaccinated * (I_unvaccinated + I_vaccinated).
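The "sum within each group, then multiply across groups" rule can be written out in a few lines. The compartment counts below are hypothetical, and the groups are given as compartment-name strings rather than in the config's array notation.

```python
from math import prod

# Hypothetical current compartment sizes.
counts = {
    "S_unvaccinated": 900,
    "S_vaccinated": 500,
    "I_unvaccinated": 60,
    "I_vaccinated": 40,
}

# Each inner list is one group: summed first, then multiplied with the others.
proportional_to = [
    ["S_unvaccinated"],
    ["I_unvaccinated", "I_vaccinated"],
]

group_sums = [sum(counts[name] for name in group) for group in proportional_to]
rate_modifier = prod(group_sums)
print(group_sums, rate_modifier)
```

Here the modifier equals S_unvaccinated * (I_unvaccinated + I_vaccinated), matching the expression in the text.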

    For transitions that occur at a constant per-capita rate (i.e., E -> I at a fixed rate in an SEIR model), it is possible to simply write proportional_to: ["source"].

    Proportion exponent

    This is an exponent modifying each group of compartments that contribute to the rate. It is equivalent to the "order" term in chemical kinetics. For example, if the reaction rate for the model above depends linearly on the number of unvaccinated susceptible individuals but on the total infected individuals sub-linearly, for example to a power 0.9, we would write:

    or a power parameter alpha, which can be given a numeric value later:

    The (top level) length of the proportion_exponent vector must be the same as the (top level) length of the proportional_to vector, even if the desire of the user is to have the same exponent for all terms being multiplied together to get the rate.
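Combining the rate constant, the proportional_to groups, and their exponents gives the total flow for the transition. The sketch below is an illustration under stated assumptions: the counts are hypothetical, the function name is invented, and the flow is taken to be the rate constant times each summed group raised to its exponent, in line with the "order" analogy from chemical kinetics.

```python
import math


def transition_flow(counts, rate, proportional_to, proportion_exponent):
    """rate * product over groups of (sum of the group) ** exponent."""
    if len(proportional_to) != len(proportion_exponent):
        raise ValueError("proportion_exponent needs one entry per group")
    terms = [
        sum(counts[name] for name in group) ** exponent
        for group, exponent in zip(proportional_to, proportion_exponent)
    ]
    return rate * math.prod(terms)


counts = {"S_unvaccinated": 900, "I_unvaccinated": 60, "I_vaccinated": 40}
flow = transition_flow(
    counts,
    rate=5,
    proportional_to=[["S_unvaccinated"], ["I_unvaccinated", "I_vaccinated"]],
    proportion_exponent=[1, 0.9],  # linear in S, sub-linear in total I
)
print(flow)
```

The length check mirrors the requirement above: one exponent must be supplied for every group, even when all exponents are the same.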

    Summary

    Putting it all together, the model transition is specified as

    would correspond to the following model if expressed as an ordinary differential equation

    with rate parameter $\beta$ and exponent parameter $\alpha$ (we will describe how to use parameter symbols in the transitions and specify their numeric values separately in the section Specifying compartmental model parameters).

    hashtag
    Transition globs

    We now explain a shorthand we have developed for specifying multiple transitions that have similar forms all at once, via transition globs. The basic idea is that for each component of the single transitions described above where a term corresponded to a single model compartment, we can instead specify one or more compartments. Similarly, multiple rate values can be specified at once, one for each involved compartment. From one transition glob, multiple individual transitions are created by broadcasting across the specified compartments.

    For transition globs, any time you could specify multiple arguments as a list, you may instead specify one argument as a non-list, which will be used for every broadcast. So [1,1,1] is equivalent to 1 if the dimension of that broadcast is 3.
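The equivalence between a single value and a repeated list can be sketched in Python as follows. The helper `broadcast` is a hypothetical name used for illustration. It is not part of flepiMoP.

```python
def broadcast(value, size):
    """Return `value` as a list of length `size`, repeating a non-list value."""
    if isinstance(value, list):
        if len(value) != size:
            raise ValueError(f"expected {size} entries, got {len(value)}")
        return value
    return [value] * size


print(broadcast(1, 3))          # [1, 1, 1]
print(broadcast([1, 1, 1], 3))  # [1, 1, 1]
```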

    We continue with the same SI model example, where individuals are stratified by vaccination status, but expand it to allow infection to occur at different rates in vaccinated and unvaccinated individuals:

    hashtag
    Source

    We allow one or more arguments to be specified for each compartment. So to specify the transitions out of both susceptible compartments (S_unvaccinated and S_vaccinated), we would use

    hashtag
    Destination

    The destination variable should be the same shape as the source, and in the same relative order. So to specify a transition from S_unvaccinated to I_unvaccinated and S_vaccinated to I_vaccinated, we would write the destination as:

    If instead we wrote:

    we would have a transition from S_unvaccinated to I_vaccinated and S_vaccinated to I_unvaccinated.
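The pairing of source and destination entries by position can be illustrated with a short Python sketch. This is an assumption-laden illustration rather than flepiMoP's parser: the function `expand_glob` is hypothetical, compartment names are joined with underscores as in the text, and a group with a single entry is assumed to be reused for every broadcast.

```python
from itertools import product


def expand_glob(source, destination):
    """Pair each source compartment with the destination at the same position."""
    sizes = [max(len(s), len(d)) for s, d in zip(source, destination)]
    pairs = []
    for index in product(*(range(n) for n in sizes)):
        # A single-entry group is reused for every broadcast (i % 1 == 0).
        src = "_".join(s[i % len(s)] for s, i in zip(source, index))
        dst = "_".join(d[i % len(d)] for d, i in zip(destination, index))
        pairs.append((src, dst))
    return pairs


source = [["S"], ["unvaccinated", "vaccinated"]]
same_order = expand_glob(source, [["I"], ["unvaccinated", "vaccinated"]])
swapped = expand_glob(source, [["I"], ["vaccinated", "unvaccinated"]])

print(same_order)
print(swapped)
```

Reversing the order of the vaccination entries in the destination changes which compartments are connected, which is why the relative order matters.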

    hashtag
    Rate

    The rate vector allows users to specify the rate constant for all the source -> destination transitions that are defined in a shorthand way, by instead specifying how the rate is altered depending on the compartment type. For example, the rate of transmission between a susceptible (S) and an infected (I) individual may vary depending on whether the susceptible individual is vaccinated or not and whether the infected individual is vaccinated or not. The overall rate constant is constructed by multiplying together or "broadcasting" all the compartment type-specific terms that are relevant to a given compartment.

    For example,

    This would mean our transition from S_unvaccinated to I_unvaccinated would have a rate of 3 * 0.6 while our transition from S_vaccinated to I_vaccinated would have a rate of 3 * 0.5.

    The rate vector should be the same shape as source and destination and in the same relative order.

    Note that if the rate constants need to differ between compartment types in a way that is more complicated than multiplicative, it is better to specify separate transitions for each compartment type instead of using this shorthand.
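The multiplicative construction of the rate constants can be sketched in Python. The function `expand_rates` is a hypothetical helper for illustration, and the input reproduces the example rate vector with a term of 3 for the infection stage and terms of 0.6 and 0.5 for the two vaccination statuses.

```python
from itertools import product
from math import prod


def expand_rates(rate):
    """Multiply one term from each compartment group, for every combination."""
    return [prod(terms) for terms in product(*rate)]


# One term for the infection stage, two for vaccination status.
rates = expand_rates([[3], [0.6, 0.5]])
print([round(r, 3) for r in rates])  # [1.8, 1.5]
```

Because every rate is a product of per-group terms, this shorthand cannot express rates that differ in a non-multiplicative way.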

    hashtag
    Proportional to

    The broadcasting here is a bit more complicated. In other cases, each broadcast is over a single component. However, in this case, we have a broadcast over a group of components. We allow a different group to be chosen for each broadcast.

    Again, let's unpack what it says. Since the broadcast is over groups, let's split the config back up

    into those groups

    From here, we can say that we are describing two transitions. The first is proportional to S_unvaccinated and the second to S_vaccinated, and both are also proportional to the total number of infections (I_unvaccinated+I_vaccinated).

    If, for example, we want to model a situation where vaccinated susceptibles cannot be infected by unvaccinated individuals, we would instead write:

    hashtag
    Proportion exponent

    Similarly to rate and proportional_to, we provide an exponent for each component and every group across the broadcast. So we could for example use:

    The (top level) length of the proportion_exponent vector must be the same as the (top level) length of the proportional_to vector, even if the desire of the user is to have the same exponent for all terms being multiplied together to get the rate. Within each vector entry, the arrays must have the same length as the source and destination vectors.

    hashtag
    Summary

    Putting it all together, the transition glob

    is equivalent to the following transitions

    hashtag
    Warning

    We warn the user that with this shorthand, it is possible to specify large models with few lines of code in the configuration file. The more compartments and transitions you specify, the longer the model will take to run, and the more memory it will require.
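As a rough illustration of how quickly model size grows, the number of compartments is the product of the number of categories in each compartment group. The Python sketch below counts them for a model stratified by infection stage, age group and vaccination status. The extra `variant` group is a hypothetical addition used only to show the growth.

```python
from math import prod

compartments = {
    "infection_stage": ["S", "I"],
    "age_group": ["child", "adult"],
    "vaccination_status": ["unvaccinated", "1dose", "2dose"],
}

n_compartments = prod(len(categories) for categories in compartments.values())
print(n_compartments)  # 2 * 2 * 3 = 12

# Adding one more group with three categories triples the model size.
compartments["variant"] = ["wild", "alpha", "delta"]  # hypothetical group
n_with_variant = prod(len(categories) for categories in compartments.values())
print(n_with_variant)  # 36
```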

    hashtag
    Specifying compartmental model parameters (seir::parameters)

    When the transitions of the compartmental model are specified as described above, they can either be entered as numeric values (e.g., 0.1) or as strings which can be assigned numeric values later (e.g., beta). We recommend the latter method for all but the simplest models, since parameters may recur in multiple transitions and so that parameter values may be edited without risk of editing the model structure itself. It also improves readability of the configuration files.

    Parameters can take on three types of values:

    • Fixed values

    • Values drawn from distributions

    • Values read from timeseries specified in a data file

    hashtag
    Specifying fixed parameter values

    Parameters can be assigned values by using the value argument after their name and then simply stating their numeric value. For example, a config describing a simple SIR model with transmission rate (beta) = 0.1/day and recovery rate (gamma) = 0.2/day could be specified as

    The full model section of the config could then read

    For the stratified SI model described above, this portion of the config would read

    If there are no parameter values that need to be specified (all rates given numeric values when defining model transitions), the seir::parameters section of the config can be left blank or omitted.

    hashtag
    Specifying parameter values from distributions

    Parameter values can also be specified as random values drawn from a distribution, as a way of including uncertainty in parameters in the model output. In this case, every time the model is run independently, a new random value of the parameter is drawn. For example, to choose the same value of beta = 0.1 each time the model is run but to choose a random value of gamma with mean on a log scale of $e^{-1.6} = 0.2$ and standard deviation on a log scale of $e^{0.2} = 1.2$ (e.g., 1.2-fold variation):

    Details on the possible distributions that are currently available, and how to specify their parameters, are provided in the Distributions section.
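To get a feel for what a log-normal parameter with a log-scale mean of -1.6 and a log-scale standard deviation of 0.2 looks like, the following Python sketch draws values with the standard library. It does not use flepiMoP's own sampler, and it assumes these two log-scale values correspond to the `mu` and `sigma` arguments of `random.lognormvariate`. The seed and sample size are arbitrary.

```python
import math
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible
logmean, logsd = -1.6, 0.2

draws = [rng.lognormvariate(logmean, logsd) for _ in range(10_000)]
median = sorted(draws)[len(draws) // 2]

print(round(math.exp(logmean), 3))  # centre of the distribution, about 0.2
print(round(math.exp(logsd), 3))    # multiplicative spread, about 1.2
print(round(median, 2))
```

Each independent model run would use one such draw, so repeated runs explore values of gamma spread multiplicatively around 0.2.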

    Note that understanding when a new parameter value is drawn from this distribution becomes more complicated when the model is run in Inference mode. In Inference mode, we distinguish model runs as occurring in different "slots" – i.e., completely independent model instances that could be run on different processing cores in a parallel computing environment – and different "iterations" of the model that occur sequentially when the model is being fit to data and update fitted parameters each time based on the fit quality found in the previous iteration. A new parameter value is only drawn from the above distribution once per slot. Within a slot, at each iteration during an inference run, the parameter is only changed if it is being fit and the inference algorithm decides to perturb it to test a possible improved fit. Otherwise, it maintains the same value no matter how many times the model is run within a slot.

    hashtag
    Specifying parameter values as timeseries from data files

    Sometimes, we want to be able to specify model parameters that have different values at different timepoints. For example, the relative transmissibility may vary throughout the year based on the weather conditions, or the rate at which individuals are vaccinated may vary as vaccine programs are rolled out. One way to do this is to instead specify the parameter values as a timeseries.

    This can be done by providing a data file in .csv or .parquet format that has a list of values of the parameter for a corresponding timepoint and subpopulation name. One column should be date, which should have an entry for every calendar day of the simulation, with the first and last date corresponding to the start_date and end_date for the simulation specified in the header of the config. There should be another column for each subpopulation, where the column name is the subpop name used in other files and the values are the desired parameter values for that subpopulation for the corresponding day. If any day or subpopulation is missing, an error will occur. However, if you want all subpopulations to have the same parameter value for every day, then only a single column in addition to date is needed, which can have any name, and will be applied to every subpop.
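A file with this layout can be generated programmatically. The Python sketch below builds the rows for a `date` column plus one column per subpopulation, and checks that every calendar day has a value. The function name, the three-day date range and the parameter values are hypothetical and only illustrate the layout described above.

```python
import csv
import io
from datetime import date, timedelta


def build_timeseries_rows(start_date, end_date, values_by_subpop):
    """Return a header row plus one row per calendar day, inclusive."""
    n_days = (end_date - start_date).days + 1
    for name, values in values_by_subpop.items():
        if len(values) != n_days:
            raise ValueError(f"{name}: expected {n_days} values, got {len(values)}")
    rows = [["date", *values_by_subpop]]
    for offset in range(n_days):
        day = start_date + timedelta(days=offset)
        rows.append(
            [day.isoformat(), *(v[offset] for v in values_by_subpop.values())]
        )
    return rows


rows = build_timeseries_rows(
    date(2022, 1, 1),
    date(2022, 1, 3),
    {"small_province": [1.5, 1.49, 1.48], "large_province": [1.3, 1.29, 1.28]},
)

buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
print(buffer.getvalue())
```

In practice the date range would run from the config's start_date to its end_date, and the rows would be written to a file rather than to an in-memory buffer.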

    For example, for an SIR model with a simple two-province population structure where the relative transmissibility peaks on January 1, then decreases linearly to a minimal value on June 1, then increases linearly again, but varies more in the small province than in the large province, the theta parameter could be constructed from the file seasonal_transmission_2pop.csv with contents including

    as a part of a configuration file with the model sections:

    Note that there is an alternative way to specify time dependence in parameter values that is described in the Specifying time-varying parameter modifications section. That method allows the user to define intervention parameters that apply specific additive or multiplicative shifts to other parameter values for a defined time interval. Interventions are useful if the parameter doesn't vary frequently and if the value of the shift is unknown and it is desired to either sample over uncertainty in it or try to estimate its value by fitting the model to data. If the parameter varies frequently and its value or relative value over time is known, specifying it as a timeseries is more efficient.

    Compartmental model parameters can have an additional attribute beyond value or timeseries, which is called stacked_modifier_method. This value is explained in the section on coding time-dependent parameter modifications (also known as "modifiers"), as it determines what happens when two different modifiers act on the same parameter at the same time (i.e., whether they are combined additively or multiplicatively).

    Config item
    Required?
    Type/Format
    Description

    hashtag
    Specifying model simulation method (seir::integration)

    A compartmental model defined using the notation in the previous sections describes rules for classifying individuals in the population based on infection state dynamically, but does not uniquely specify the mathematical framework that should be used to simulate the model.

    Our framework allows for two major methods for implementing compartmental models of disease transmission:

    • ordinary differential equations, which are completely deterministic, operate in continuous time (consider infinitesimally small timesteps), and allow for arbitrary fractions of the population (i.e., not just discrete individuals) to move between model compartments

    • discrete-time stochastic process, which tracks discrete individuals and produces random variation in the number of individuals transitioning between states for any given rate, and which allows transitions between states only to occur at discrete time intervals

    The mathematics behind each implementation is described in the Model Description section.
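The difference between moving arbitrary fractions of the population and moving discrete individuals can be illustrated with a single transition over one timestep. The Python sketch below is not flepiMoP's engine. It assumes a common discretization in which each individual leaves the source compartment during a step with probability 1 - exp(-rate * dt), and the function names are hypothetical.

```python
import math
import random


def expected_moves(n_source, rate, dt):
    """Deterministic view: a (possibly fractional) number of individuals moves."""
    return n_source * (1.0 - math.exp(-rate * dt))


def stochastic_step(n_source, rate, dt, rng):
    """Stochastic view: each individual moves independently with the same chance."""
    p_leave = 1.0 - math.exp(-rate * dt)
    return sum(1 for _ in range(n_source) if rng.random() < p_leave)


rng = random.Random(1)  # fixed seed so the sketch is reproducible
print(expected_moves(1000, 0.2, 0.1))         # a fractional value
print(stochastic_step(1000, 0.2, 0.1, rng))   # a whole number that varies by seed
```

Repeating the stochastic step with different seeds gives whole numbers scattered around the deterministic value.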

    Config item
    Required?
    Type/format
    Description

    For example, to simulate a model deterministically using the 4th order Runge-Kutta algorithm for numerical integration with a timestep of 1 day:

    Alternatively, to simulate a model stochastically with a timestep of 0.1 days

    For any method, the results of the model will be more accurate when the timestep is smaller (i.e., output will more precisely match the mathematics of the model description and be invariant to the choice of timestep). However, the computing time required to simulate the model for a certain time range of interest increases with the number of timesteps required (i.e., with smaller timesteps). In our experience, the 4th order Runge-Kutta algorithm (for details see the Advanced section) is a very accurate method of numerically integrating such models and can handle timesteps as large as roughly a day for models with maximum per capita transition rates of this same order of magnitude. However, both of the discrete-time engines require smaller timesteps to be accurate (around 0.1 days for COVID-19-like dynamics in our experience).
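The trade-off between timestep and accuracy can be seen on a problem with a known exact solution. The Python sketch below integrates only a recovery process, dI/dt = -gamma * I, with a simple Euler step and a 4th order Runge-Kutta step, and compares both with the exact exponential decay. It is an illustration under stated assumptions, not flepiMoP's integrator: the function names, gamma = 0.2 per day, and the 10-day horizon are all hypothetical choices.

```python
import math


def euler_step(f, y, dt):
    return y + dt * f(y)


def rk4_step(f, y, dt):
    k1 = f(y)
    k2 = f(y + 0.5 * dt * k1)
    k3 = f(y + 0.5 * dt * k2)
    k4 = f(y + dt * k3)
    return y + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6


def integrate(step, f, y0, dt, t_end):
    """Advance y from time 0 to t_end using a fixed timestep."""
    y = y0
    for _ in range(round(t_end / dt)):
        y = step(f, y, dt)
    return y


gamma, i0, t_end = 0.2, 100.0, 10.0
recovery = lambda i: -gamma * i
exact = i0 * math.exp(-gamma * t_end)

errors = {}
for name, step in (("euler", euler_step), ("rk4", rk4_step)):
    for dt in (1.0, 0.1):
        errors[(name, dt)] = abs(integrate(step, recovery, i0, dt, t_end) - exact)
        print(name, dt, f"{errors[(name, dt)]:.2e}")
```

In this example the Runge-Kutta step with a one-day timestep is closer to the exact answer than the Euler step with a timestep ten times smaller.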

    name: sir
    setup_name: minimal
    start_date: 2020-01-31
    end_date: 2020-05-31
    nslots: 1
    
    subpop_setup:
      geodata: geodata_sample_1pop.csv
      mobility: mobility_sample_1pop.csv
      popnodes: population
      nodenames: name
    
    seeding:
      method: FromFile
      seeding_file: data/seeding_1pop.csv
    
    compartments:
      infection_stage: ["S", "I", "R"]
    
    seir:
      integration:
        method: stochastic
        dt: 1 / 10
      parameters:
        gamma:
          value:
            distribution: fixed
            value: 1 / 5
        Ro:
          value:
            distribution: uniform
            low: 2
            high: 3
      transitions:
        - source: ["S"]
          destination: ["I"]
          rate: ["Ro * gamma"]
          proportional_to: [["S"],["I"]]
          proportion_exponent: ["1","1"]
        - source: ["I"]
          destination: ["R"]
          rate: ["gamma"]
          proportional_to: ["I"]
          proportion_exponent: ["1"]
    
    interventions:
      scenarios:
        - None
        - Lockdown
      modifiers:
        None:
          method: SinglePeriodModifier
          parameter: r0
          period_start_date: 2020-04-01
          period_end_date: 2020-05-15
          value:
            distribution: fixed
            value: 0
            settings:
        Lockdown:
          method: SinglePeriodModifier
          parameter: r0
          period_start_date: 2020-04-01
          period_end_date: 2020-05-15
          value:
            distribution: fixed
            value: 0.7
    > flepimop simulate sir_control.yml
    > flepimop simulate -n 100 -j 4 -npi_scenario None -m euler --write_csv sir_control.yml
    subpop_setup:
      census_year: 2010
      state_level: TRUE
      geodata: geodata_2019_statelevel.csv
      mobility: mobility_2011-2015_statelevel.csv
      modeled_states:
        - CT
        - MA
        - ME
        - NH
        - RI
        - VT
      
    USPS	subpop	population
    AL	01000	4876250
    AK	02000	737068
    AZ	04000	7050299
    AR	05000	2999370
    CA	06000	39283497
    .....
    ori	dest	amount
    01000	02000	198
    01000	04000	292
    01000	05000	570
    01000	06000	1030
    01000	08000	328
    .....
    importation:
      census_api_key: "fakeapikey00000"
      travel_dispersion: 3
      maximum_destinations: Inf
      dest_type: state
      dest_county: USA
      aggregate_to: airport
      cache_work: TRUE
      update_case_data: TRUE
      draw_travel_from_distribution: FALSE
      print_progress: FALSE
      travelers_threshold: 10000
      airport_cluster_distance: 80
      param_list:
        incub_mean_log: log(5.89)
        incub_sd_log: log(1.74)
        inf_period_nohosp_mean: 15
        inf_period_nohosp_sd: 5
        inf_period_hosp_mean_log: 1.23
        inf_period_hosp_sd_log: 0.79
        p_report_source: [0.05, 0.25]
        shift_incid_days: -10
        delta: 1
    report:
      data_settings:
        pop_year: 2018
      plot_settings:
        plot_intervention: TRUE
      formatting:
        scenario_labels_short: ["UC", "S1"]
        scenario_labels:
          - Uncontrolled
          - Scenario 1
        scenario_colors: ["#D95F02", "#1B9E77"]
        pdeath_labels: ["0.25% IFR", "0.5% IFR", "1% IFR"]
        display_dates: ["2020-04-15", "2020-05-01", "2020-05-15", "2020-06-01", "2020-06-15"]
        display_dates2: ["2020-04-15", "2020-05-15", "2020-06-15"]
    $\geq$
    $\geq$
    parquet files

    Code structure

    – an integer value for the amount of individuals who transition between states in the seeding event
  • source_* and destination_* – For each compartment group (i.e., infection stage, vaccination stage, age group), a different column describes the status of individuals before and after the transition described by the seeding event. For example, for a model where individuals are stratified by age and vaccination status, and a 1-day vaccination campaign for young children and the elderly moves a large number of individuals into a vaccinated state, this file could be something like

  • initial conditions

    | Config item | Required? | Type/format | Description |
    | --- | --- | --- | --- |
    | duration | No | value or distribution | The duration of time an individual remains counted within the named outcome variable |
    | sum | No | List | A list of other outcome variables to sum into the current outcome variable |
    | source | Yes | Varies | The infection model variable or outcome variable from which the named outcome variable is created |
    | probability | Yes, unless sum option is used instead | value or distribution | The probability that an individual in the source variable appears in the named outcome variable |
    | delay | Yes, unless sum option is used instead | value or distribution | The time delay between an individual's appearance in the source variable and appearance in the named outcome variable |

    value: The amount by which the baseline value will be multiplied, for example, 0.75 or 1.1

    proportional_to

  • proportion_exponent

  • rolling_mean_windows

    optional

    integer

    The size of the rolling mean window if a rolling mean is applied.

    $\gamma$

    $$\frac{\delta \text{S}_\text{unvaccinated}}{\delta t} = - \beta \text{S}_\text{unvaccinated}^1 (\text{I}_\text{unvaccinated}+\text{I}_\text{vaccinated})^{\alpha}$$

    $$\frac{\delta \text{I}_\text{unvaccinated}}{\delta t} = \beta \text{S}_\text{unvaccinated}^1 (\text{I}_\text{unvaccinated}+\text{I}_\text{vaccinated})^{\alpha}$$

    $\beta$

    $\gamma$

    $e^{-1.6} = 0.2$

    $e^{0.2} = 1.2$

    | Config item | Required? | Type/Format | Description |
    | --- | --- | --- | --- |
    | value | either value or timeseries is required | numerical, or distribution | This defines the value of the parameter, as described above. |
    | timeseries | either value or timeseries is required | path to a csv file | This defines a timeseries for each day, as above. |
    | stacked_modifier_method | optional | string: sum, product, reduction_product | This option defines the method used when modifiers are applied. The default is product. |

    | Config item | Required? | Type/format | Description |
    | --- | --- | --- | --- |
    | method | optional | string: rk4 (default), euler, stochastic | The algorithm used to simulate the model equations. If rk4, the model is simulated deterministically by numerical integration using a 4th order Runge-Kutta algorithm. If euler or stochastic, a discrete-time process is used, with steps proceeding either deterministically (at the average rate) or stochastically. For both of these cases, the algorithm ensures no compartment goes below zero for the requested time step. The --method (-m) option can be used (see Other Configuration Options) to override this configuration option. |
    | dt | optional | positive real number (default: 2) | The timestep used for the numerical integration or discrete time stochastic update; for rk4 method, this is a reasonable value, but for other options, this should be 0.2 or less. |

    seeding:
      method: "NoSeeding"
    seeding:
      method: "FromFile"
      seeding_file: seeding_2pop.csv
    subpop, date, amount, source_infection_stage, destination_infection_stage
    small_province, 2020-02-01, 5, S, E
    subpop, date, amount, source_infection_stage, source_vaccine_doses, source_age_group, destination_infection_stage, destination_vaccine_doses, destination_age_group
    anytown, 1950-03-15, 452, S, 0dose, under5years, S, 1dose, under5years
    anytown, 1950-03-16, 527, S, 0dose, 5_10years, S, 1dose, 5_10years
    anytown, 1950-03-17, 1153, S, 0dose, over65years, S, 1dose, over65years
    seeding:
      method: "PoissonDistributed"
      lambda_file: seeding.csv
    seeding:
      method: "NegativeBinomialDistributed"
      lambda_file: seeding.csv
    compartments:
      infection_stage: ["S", "I", "R"]
      
    seir:
      transitions:
        # infection
        - source: [S]
          destination: [I]
          proportional_to: [[S], [I]]
          rate: [beta]
          proportion_exponent: 1
        # recovery
        - source: [I]
          destination: [R]
          proportional_to: [[I]]
          rate: [gamma]
          proportion_exponent: 1
      parameters:
        beta: 
          value: 0.2
        gamma: 
          value: 0.1
    
    outcomes:
      settings:
        method: delayframe
      outcomes:
        incidC:
          source:
            incidence:
              infection_stage: "I"
          probability: 
            value: 0.5
          delay: 
            value: 2
        incidH:
          source:
            incidence:
              infection_stage: "I"
          probability: 
            value: 0.01
          delay: 
            value: 21 
     compartments:
       infection_state: ["S", "I", "R"]
       age_group: ["child", "adult"]
       vaccination_status: ["unvaxxed", "vaxxed"]
       
    outcomes:
      incidH_child:
        source:
          incidence:
            infection_state: "I"
            age_group: "child"
        ...
      incidH_adult:
        source:
          incidence:
            infection_state: "I"
            age_group: "adult"
        ...
      incidH_all:
        source:
          incidence:
            infection_state: "I"
        ...
     compartments:
       infection_state: ["S", "I", "R"]
       age_group: ["child", "adult"]
       vaccination_status: ["unvaxxed", "vaxxed"]
       
    outcomes:
      incidC:
        source:
          prevalence:
            infection_state: "I"
        ...
    outcomes:
      incidC:
        source:
          prevalence:
            infection_state: "I"
        ...
      incidT:
        source: incidC
        ...
    outcomes:
      incidH_child:
        source:
          incidence:
            infection_state: "I"
            age_group: "child"
        probability: 
          value: 0.05
          modifier_key: hosp_rate
      incidH_adult:
        source:
          incidence:
            infection_state: "I"
            age_group: "adult"
        probability: 
          value: 0.01
          modifier_key: hosp_rate
    outcomes:
      incidC:
        source:
          prevalence:
            infection_state: "I"
        probability:
          value:
            distribution: uniform
            low: 
              value: 0.2
            high: 
              value: 0.3
          intervention_param_name: "case_detect_rate"
    outcomes:
      incidH_child:
        source:
          incidence:
            infection_state: "I"
            age_group: "child"
        probability: 
          value: 0.05
        delay: 
          value: 7
    outcomes:
      incidH_child:
        source:
          incidence:
            infection_state: "I"
            age_group: "child"
        probability: 
          value: 0.05
        delay: 
          value: 
            distribution: truncnorm
            mean: 7
            sd: 2
            a: 0
            b: Inf
    outcomes:
      incidH_child:
        source:
          incidence:
            infection_state: "I"
            age_group: "child"
        probability: 
          value: 0.05
        delay: 
          value: 7
        duration: 
          value: 3
    outcomes:
      incidH_child:
        source:
          incidence:
            infection_state: "I"
            age_group: "child"
        probability: 
          value: 0.05
        delay: 
          value: 7
        duration: 
          value: 3
          name: "hosp_child_curr"
    outcomes:
      incidH_child:
        source:
          incidence:
            infection_state: "I"
            age_group: "child"
        probability: 0.05
        delay: 6
        duration: 
          value: 14
          name: "hosp_child_curr"
      incidH_adult:
        source:
          incidence:
            infection_state: "I"
            age_group: "adult"
        probability: 0.01
        delay: 8
        duration:
          value: 7
          name: "hosp_adult_curr"
      incidH_total: 
        sum: ["incidH_child","incidH_adult"]
      hosp_curr_total:   
        sum: ["hosp_child_curr","hosp_adult_curr"]
    compartments:
      infection_stage: ["S", "I", "R"]
      
    seir:
      transitions:
        # infection
        - source: [S]
          destination: [I]
          proportional_to: [[S], [I]]
          rate: [beta]
          proportion_exponent: 1
        # recovery
        - source: [I]
          destination: [R]
          proportional_to: [[I]]
          rate: [gamma]
          proportion_exponent: 1
      parameters:
        beta:
          value: 0.1
        gamma:
          value: 0.2
      integration:
         method: rk4
         dt: 1.00
    compartments:
      infection_stage: ["S", "I", "R"]
     compartments:
       infection_stage: ["S", "I", "R"]
       vaccination_status: ["unvaccinated", "vaccinated"]
    infection_stage, vaccination_status, compartment_name
    S,               unvaccinated,       S_unvaccinated
    I,               unvaccinated,       I_unvaccinated
    R,               unvaccinated,       R_unvaccinated
    S,               vaccinated,         S_vaccinated
    I,               vaccinated,         I_vaccinated
    R,               vaccinated,         R_vaccinated
     compartments:
       infection_stage: ["S", "I"]
       age_group: ["child", "adult"]
       vaccination_status: ["unvaccinated", "1dose", "2dose"]
    infection_stage, age_group, vaccination_status, compartment_name
    S,		 child,	    unvaccinated,	S_child_unvaccinated	
    I,		 child,	    unvaccinated,	I_child_unvaccinated
    S,		 adult,	    unvaccinated,	S_adult_unvaccinated
    I,		 adult,	    unvaccinated,	I_adult_unvaccinated
    S,		 child,	    1dose,		S_child_1dose
    I,		 child,	    1dose,		I_child_1dose
    S,		 adult,     1dose,		S_adult_1dose
    I,		 adult,     1dose,		I_adult_1dose
    S,		 child,     2dose,		S_child_2dose	
    I,		 child,     2dose,		I_child_2dose
    S,		 adult,	    2dose,		S_adult_2dose
    I,		 adult,	    2dose,		I_adult_2dose
    compartments:
       overall_state: ["S_child", "I_child", "S_adult_unvaccinated", "I_adult_unvaccinated", "S_adult_1dose", "I_adult_1dose", "S_adult_2dose", "I_adult_2dose"]
    [S,unvaccinated]
    [I,unvaccinated]
    5
    beta
    [[[S,unvaccinated]], [[I,unvaccinated], [I, vaccinated]]]
    [[S_unvaccinated], [I_unvaccinated, I_vaccinated]]
    [S_unvaccinated, I_unvaccinated + I_vaccinated]
    [1, 0.9]
    [1, alpha]
    source: [S, unvaccinated]
    destination: [I, unvaccinated]
    proportional_to: [[[S,unvaccinated]], [[I,unvaccinated], [I,vaccinated]]]
    rate: [5]
    proportion_exponent: [1, 0.9]
    [[S], [unvaccinated,vaccinated]]
    [[I], [unvaccinated,vaccinated]]
    [[I], [vaccinated,unvaccinated]]
    rate: [[3], [0.6,0.5]]
    [
      [[S,unvaccinated], [S,vaccinated]],
      [[I,unvaccinated],[I, vaccinated]], [[I,unvaccinated],[I, vaccinated]]
    ]
    [
      [S,unvaccinated],
      [[I,unvaccinated],[I, vaccinated]]
    ]
    [
      [S,vaccinated],
      [[I,unvaccinated],[I, vaccinated]]
    ]
    [
      [[S,unvaccinated], [S,vaccinated]],
      [[I,unvaccinated],[I, vaccinated]], [[I, vaccinated]]
    ]
    [[1,1], [0.9,0.8]]
    seir:
      transitions:
        - source: [[S],[unvaccinated,vaccinated]]
          destination: [[I],[unvaccinated,vaccinated]]
          proportional_to: [
                             [[S,unvaccinated], [S,vaccinated]],
                             [[I,unvaccinated],[I, vaccinated]], [[I, vaccinated]]
                           ]
          rate: [[3], [0.6,0.5]]
          proportion_exponent: [[1,1], [0.9,0.8]]
    seir:
      transitions:
        - source: [S,unvaccinated]
          destination: [I,unvaccinated]
          proportional_to: [[[S,unvaccinated]], [[I,unvaccinated],[I, vaccinated]]]
          proportion_exponent: [1, 0.9]
          rate: [3*0.6]
        - source: [S,vaccinated]
          destination: [I,vaccinated]
          proportional_to: [[[S,vaccinated]], [[I, vaccinated]]]
          proportion_exponent: [1, 0.8]
          rate: [3*0.5]
    seir:
      parameters:
        beta: 
          value: 0.1
        gamma: 
          value: 0.2
    compartments:
      infection_state: ["S", "I", "R"]
      
    seir:
      transitions:
        # infection
        - source: [S]
          destination: [I]
          proportional_to: [[S], [I]]
          rate: [beta]
          proportion_exponent: 1
        # recovery
        - source: [I]
          destination: [R]
          proportional_to: [[I]]
          rate: [gamma]
          proportion_exponent: 1
      parameters:
        beta: 
          value: 0.1
        gamma: 
          value: 0.2
    compartments:
      infection_stage: ["S", "I", "R"]
      vaccination_status: ["unvaccinated", "vaccinated"]
      
    seir:
      transitions:
        - source: [[S],[unvaccinated,vaccinated]]
          destination: [[I],[unvaccinated,vaccinated]]
          proportional_to: [
                             [[S,unvaccinated], [S,vaccinated]],
                             [[I,unvaccinated],[I, vaccinated]], [[I, vaccinated]]
                           ]
          rate: [[beta], [theta_u,theta_v]]
          proportion_exponent: [[1,1], [alpha_u,alpha_v]]
      parameters:
        beta: 
          value: 0.1
        theta_u: 
          value: 0.6
        theta_v: 
          value: 0.5
        alpha_u: 
          value: 0.9
        alpha_v: 
          value: 0.8
    seir:
      parameters:
        beta: 
          value:
            distribution: fixed
            value: 0.1
        gamma: 
          value:
            distribution: lognorm
            logmean: -1.6
            logsd: 0.2
    date,        small_province,    large_province
    2022-01-01,  1.5,               1.3
    .....
    2022-05-01,  0.5,               0.7 
    ....
    2022-12-31,  1.5,               1.3
    compartments:
      infection_stage: ["S", "I", "R"]
    
    seir:
      transitions:
        # infection
        - source: [S]
          destination: [I]
          proportional_to: [[S], [I]]
          rate: [beta*theta]
          proportion_exponent: 1
        # recovery
        - source: [I]
          destination: [R]
          proportional_to: [[I]]
          rate: [gamma]
          proportion_exponent: 1
      parameters:
        beta: 
          value: 0.1
        gamma: 
          value: 0.2
        theta:
          timeseries: data/seasonal_transmission.csv

    seir:
      integration:
        method: rk4
        dt: 1.00

    seir:
      integration:
        method: stochastic
        dt: 0.1

    Specifying time-varying parameter modifications

    This section describes how to specify modifications to any of the parameters of the transmission model or observational model during certain time periods.

    Modifiers are a powerful feature in flepiMoP that let users modify any of the parameters specified in the model during particular time periods. They can be used, for example, to mirror public health control interventions, such as non-pharmaceutical interventions (NPIs) or increased access to diagnosis or care, or annual seasonal variations in disease parameters. Modifiers can act on any of the transmission model parameters or observation model parameters.

    In the seir_modifiers and outcome_modifiers sections of the configuration file the user can specify several possible types of modifiers which will then be implemented in the model. Each modifier changes a parameter during one or multiple time periods and for one or multiple specified subpopulations.

    We currently support the following modifier types. Each of these is described in detail below:

    • "SinglePeriodModifier" – Modifies a parameter during a single time period

    • "MultiPeriodModifier" – Modifies a parameter by the same amount during multiple time periods

    • "ModifierModifier" – Modifies another modifier during a single time period

    • "StackedModifier" – Combines two or more modifiers additively or multiplicatively, and is used to turn groups of modifiers on and off easily for different runs

    Note that if you want a parameter to vary continuously over time (for example, a daily transmission rate that is influenced by temperature and humidity), then it is easier to do this by using a "timeseries" parameter value than by combining many separate modifiers. Timeseries parameter values are described in the seir::parameters section. Timeseries values for outcome parameters (e.g., a testing rate that fluctuates rapidly due to test availability) are in development but not currently available.

    Within flepiMoP, modifiers can be run as "scenarios". With scenarios, we can use the same configuration file to run multiple versions of the model where only the modifiers applied differ.

    The modifiers section contains two sub-sections: modifiers::scenarios, which lists the names of the modifiers that will run in each separate scenario, and modifiers::modifiers, where the details of each modifier are specified (e.g., the parameter it acts on, the time it is active, and the subpopulation it is applied to). An example is outlined below.

    In this example, each scenario runs a single intervention, but more complicated examples are possible.

    The major benefit of specifying both "scenarios" and "modifiers" is that the user can use the "StackedModifier" option to combine other modifiers in different ways, and then run either the individual or combined modifiers as scenarios. This way, each scenario may consist of one or more individual parameter modifications, and each modification may be part of multiple scenarios. This provides a shorthand to quickly consider multiple different versions of a model that have different combinations of parameter modifications occurring. For example, during an outbreak we could evaluate the impact of school closures, case isolation, and masking, or any one or two of these three measures. An example of a configuration file combining modifiers to create new scenarios is given below.
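To make the "any one or two of these three measures" idea concrete, the following Python sketch enumerates every combination of the three example modifiers and builds a StackedModifier entry for each combination of two or more. It is only an illustration: the helper `stacked_modifiers` and the generated names (such as `SchoolClosures_Masking`) are hypothetical and are not part of flepiMoP.

```python
from itertools import combinations

# Modifier names taken from the school closures / case isolation / masking example.
measures = ["SchoolClosures", "CaseIsolation", "Masking"]


def stacked_modifiers(names):
    """Build a StackedModifier entry for every combination of two or more names.

    Hypothetical helper for illustration; the generated keys are made-up names.
    """
    stacked = {}
    for size in range(2, len(names) + 1):
        for subset in combinations(names, size):
            stacked["_".join(subset)] = {
                "method": "StackedModifier",
                "modifiers": list(subset),
            }
    return stacked


stacked = stacked_modifiers(measures)

# Individual modifiers can be listed as scenarios directly, next to the stacked ones.
scenarios = measures + list(stacked)
for name in scenarios:
    print(name)
```

With three modifiers this gives three single-modifier scenarios plus four stacked ones, which is why writing them by hand quickly becomes tedious as the number of modifiers grows.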

    The seir_modifiers::scenarios and outcome_modifiers::scenarios sections are optional. If the scenarios section is not included, the model will run with all of the modifiers turned "on".

    If the scenarios section is included for either seir or outcomes, then each time a configuration file is run, the user must specify which modifier scenarios will be run. If not specified, the model will be run one time for each combination of seir and outcome scenario.
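As a rough illustration of "one time for each combination", the Python sketch below pairs every seir scenario with every outcome scenario. The scenario names are the ones used in the examples of this section; the loop itself is an assumption about how the combinations are enumerated, not flepiMoP's actual code.

```python
from itertools import product

# Scenario names as they appear in the seir_modifiers / outcome_modifiers examples.
seir_scenarios = ["SchoolClosures", "AllNPIs"]
outcome_scenarios = ["BaselineTesting", "TestShortage"]

# One model run per (seir scenario, outcome scenario) pair.
runs = list(product(seir_scenarios, outcome_scenarios))

for seir_name, outcome_name in runs:
    print(f"run: seir={seir_name}, outcomes={outcome_name}")
```

Two seir scenarios and two outcome scenarios therefore lead to four runs.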

    Example

    [Give a configuration file that tries to use all the possible options available. Based on a simple SIR model with parameters beta and gamma in 2 subpopulations. Maybe a SinglePeriodModifier on beta for a lockdown and gamma for isolation, one having a fixed value and one from a distribution, MultiPeriodModifier for school year in different places, ModifierModifier for ..., StackedModifier for .... ]

    modifiers::scenarios

    An optional list consisting of a subset of the modifiers that are described in modifiers::modifiers, each of which will be run as a separate scenario. For example

    or

    modifiers::modifiers

    A formatted list consisting of the description of each modifier, including its name, the parameter it acts on, the duration and amount of the change to that parameter, and the subset of subpopulations in which the parameter modification takes place. The list items are summarized in the table below and detailed in the sections below.

    Config item
    Required
    Type/format
    Description

    SinglePeriodModifier

    SinglePeriodModifier interventions enable the user to specify a multiplicative reduction to a parameter of interest. A SinglePeriodModifier takes a parameter and reduces its value by value (new = (1-value) * old) for the subpopulations listed in subpop during the time interval [period_start_date, period_end_date].

    For example, if you would like to create an SEIR modifier called lockdown that reduces transmission by 70% in the state of California and the District of Columbia between two dates, you could specify this with a SinglePeriodModifier, as in the example below.

    Example

    Or to create an outcome variable modifier called enhanced_testing during which the case detection rate doubles:

    Configuration options

    method: SinglePeriodModifier

    parameter: The name of the parameter that will be modified. This could be a parameter defined for the transmission model in seir::parameters or for the observational model in outcomes. If the parameter is used in multiple transitions in the model then all those transitions will be modified by this amount.

    period_start_date: The date when the modification starts, in YYYY-MM-DD format. The modification will only reduce the value of the parameter after (inclusive of) this date.

    period_end_date: The date when the modification ends, in YYYY-MM-DD format. The modification will only reduce the value of the parameter before (inclusive of) this date.

    subpop: A list of subpopulation names/ids in which the specified modification will be applied. This can be a single subpop, a list, or the word "all" (specifying the modification applies to all existing subpopulations in the model). The modification will do nothing for any subpopulations not listed here.

    value: The fractional reduction of the parameter during the time period the modification is active. This can be a scalar number, or a distribution using the notation described in the Distributions section. The new parameter value will be new_parameter_value = old_parameter_value * (1 - value).

    subpop_groups: An optional list of lists specifying which subsets of subpopulations in subpop should share parameter values when parameters are drawn from a distribution or fit to data. See the subpop_groups section below for more details.
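The options above can be read as a simple rule: if the subpopulation is targeted and the date falls inside the inclusive period, multiply the parameter by (1 - value); otherwise leave it unchanged. The Python sketch below spells this out for the lockdown example. The function `apply_single_period_modifier` and the dictionary layout are hypothetical, written only to illustrate the rule, and do not reflect flepiMoP's internal implementation.

```python
from datetime import date


def apply_single_period_modifier(old_value, subpop, day, modifier):
    """Return the parameter value for `subpop` on `day` after one modifier.

    Hypothetical helper for illustration; not flepiMoP's internal code.
    """
    targets = modifier["subpop"]
    if isinstance(targets, str):
        targets = [targets]  # allow a single subpop name or the word "all"
    in_subpop = "all" in targets or subpop in targets
    # Both period boundaries are inclusive.
    in_period = modifier["period_start_date"] <= day <= modifier["period_end_date"]
    if in_subpop and in_period:
        return old_value * (1 - modifier["value"])
    return old_value


# The lockdown example: a 70% reduction of beta in subpops 06000 and 11000.
lockdown = {
    "parameter": "beta",
    "period_start_date": date(2020, 3, 15),
    "period_end_date": date(2020, 5, 1),
    "subpop": ["06000", "11000"],
    "value": 0.7,
}

beta = 0.1
print(apply_single_period_modifier(beta, "06000", date(2020, 4, 1), lockdown))  # inside period
print(apply_single_period_modifier(beta, "06000", date(2020, 6, 1), lockdown))  # after period
print(apply_single_period_modifier(beta, "25000", date(2020, 4, 1), lockdown))  # subpop not listed
```

Only the first call changes the value; a negative value (such as -1.0 in the enhanced_testing example) would increase the parameter instead, since 1 - (-1.0) = 2.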

    MultiPeriodModifier

    MultiPeriodModifier interventions enable the user to specify a multiplicative reduction to the parameter of interest by value (new = (1-value) * old) for the subpopulations listed in subpop during multiple different time intervals each defined by a start_date and end_date.

    For example, if you would like to describe the impact that transmission in schools has on overall disease spread, you could create a modifier that increases transmission by 30% during the dates that K-12 schools are in session in different regions (e.g., Massachusetts and Florida):

    Example

    Configuration options

    method: MultiPeriodModifier

    parameter: The name of the parameter that will be modified. This could be a parameter defined for the transmission model in seir::parameters or for the observational model in outcomes. If the parameter is used in multiple transitions in the model then all those transitions will be modified by this amount.

    groups: A list of subpopulations (subpops) or groups of them, and the time periods the modification will be active in each of them.

    • groups:subpop A list of subpopulation names/ids in which the specified modification will be applied. This can be a single subpop, a list, or the word "all" (specifying the modification applies to all existing subpopulations in the model). The modification will do nothing for any subpopulations not listed here.

    • groups:periods A list of time periods, each defined by a start and end date, when the modification will be applied.

    value: The fractional reduction of the parameter during the time period the modification is active. This can be a scalar number, or a distribution using the notation described in the Distributions section. The new parameter value will be new_parameter_value = old_parameter_value * (1 - value).

    subpop_groups: An optional list of lists specifying which subsets of subpopulations in subpop should share parameter values when parameters are drawn from a distribution or fit to data. See the subpop_groups section below for more details.
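A MultiPeriodModifier applies the same value in several subpopulation-specific sets of periods. The Python sketch below mirrors the school_year example (subpops 25000 and 12000, value -0.3) and returns the multiplier that would apply to beta on a given day. The function name `multiplier` and the dictionary layout are assumptions made for illustration, not flepiMoP's own data structures.

```python
from datetime import date

school_year = {
    "parameter": "beta",
    "value": -0.3,  # negative value: the parameter is multiplied by 1 - (-0.3) = 1.3
    "groups": [
        {
            "subpop": ["25000"],
            "periods": [
                (date(2021, 9, 9), date(2021, 12, 23)),
                (date(2022, 1, 4), date(2022, 6, 22)),
            ],
        },
        {
            "subpop": ["12000"],
            "periods": [
                (date(2021, 8, 10), date(2021, 12, 17)),
                (date(2022, 1, 4), date(2022, 5, 27)),
            ],
        },
    ],
}


def multiplier(modifier, subpop, day):
    """Return the factor applied to the parameter for `subpop` on `day`.

    Hypothetical helper for illustration; period boundaries are inclusive.
    """
    for group in modifier["groups"]:
        targets = group["subpop"]
        if isinstance(targets, str):
            targets = [targets]
        if "all" in targets or subpop in targets:
            if any(start <= day <= end for start, end in group["periods"]):
                return 1 - modifier["value"]
    return 1.0


print(multiplier(school_year, "25000", date(2021, 10, 1)))   # school in session
print(multiplier(school_year, "25000", date(2021, 12, 28)))  # winter break
print(multiplier(school_year, "12000", date(2021, 8, 15)))   # earlier start in this subpop
```

The same date can therefore be "in session" in one subpopulation and not in another, which is the situation this modifier type is designed for.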

    ModifierModifier

    ModifierModifier interventions allow the user to specify an intervention that acts to modify the value of another intervention, as opposed to modifying a baseline parameter value. The intervention multiplicatively reduces the modifier of interest by value (new = (1-value) * old) for the subpopulations listed in subpop during the time interval [period_start_date, period_end_date].

    Example

    For example, ModifierModifier could be used to describe a social distancing policy that is in effect between two dates and reduces transmission by 60% if followed by the whole population, but part way through this period, adherence to the policy drops to only 50% in one of the subpopulations:

    Note that this configuration is equivalent to the following alternative specification:

    However, there are situations when the ModifierModifier notation is more convenient, especially when doing parameter fitting.

    Configuration options

    method: ModifierModifier

    baseline_scenario: The name of the original parameter modification which will be further modified.

    parameter: The name of the parameter in the baseline_scenario that will be modified.

    period_start_date: The date when the intervention modifier starts, in YYYY-MM-DD format. The intervention modifier will only reduce the value of the other intervention after (inclusive of) this date.

    period_end_date: The date when the intervention modifier ends, in YYYY-MM-DD format. The intervention modifier will only reduce the value of the other intervention before (inclusive of) this date.

    subpop: A list of subpopulation names/ids in which the specified intervention modifier will be applied. This can be a single subpop, a list, or the word "all" (specifying the intervention applies to all existing subpopulations in the model). The intervention will do nothing for any subpopulations not listed here.

    value: The fractional reduction of the baseline intervention during the time period the modifier intervention is active. This can be a scalar number, or a distribution using the notation described in the Distributions section. The new intervention value will be new_intervention_value = old_intervention_value * (1 - value)

    and so the value of the underlying parameter that was modified by the baseline intervention will be new_parameter_value = original_parameter_value * (1 - baseline_intervention_value * (1 - value)).

    subpop_groups: An optional list of lists specifying which subsets of subpopulations in subpop should share parameter values when parameters are drawn from a distribution or fit to data. See the subpop_groups section below for more details.
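The two formulas for ModifierModifier can be checked with the numbers from the social distancing example (baseline reduction 0.6, fatigue value 0.5, beta = 0.1). The Python sketch below is a worked calculation only; the function names are hypothetical and chosen to match the formula names used in this section.

```python
def new_intervention_value(old_intervention_value, value):
    """new_intervention_value = old_intervention_value * (1 - value)"""
    return old_intervention_value * (1 - value)


def new_parameter_value(original_parameter_value, baseline_intervention_value, value=0.0):
    """Parameter value after the (possibly modified) baseline intervention is applied."""
    effective = new_intervention_value(baseline_intervention_value, value)
    return original_parameter_value * (1 - effective)


beta = 0.1

# social_distancing alone: a 60% reduction of beta.
print(new_parameter_value(beta, 0.6))

# With fatigue (value 0.5) in large_province: the reduction shrinks to 0.6 * 0.5 = 0.3.
print(new_intervention_value(0.6, 0.5))
print(new_parameter_value(beta, 0.6, 0.5))
```

The effective reduction of 0.3 in large_province is the same value that appears in the alternative specification written with three SinglePeriodModifier entries.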

    StackedModifier

    Combine two or more modifiers into a scenario, so that they can easily be singled out to be run together without the other modifiers. If multiple modifiers act during the same time period in the same subpopulation, their effects are combined multiplicatively. Modifiers of different types (i.e., SinglePeriodModifier, MultiPeriodModifier, ModifierModifier, other StackedModifiers) can be combined.

    Examples

    or

    Configuration options

    method: StackedModifier

    modifiers: A list of names of the other modifiers (specified above) that will be combined to create the new modifier (which we typically refer to as a "scenario").
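To see what "combined multiplicatively" means in practice, the Python sketch below multiplies the (1 - value) factors of the modifiers that act on the same parameter at the same time. It uses the values from the AllNPIs example (SchoolClosures 0.7 and Masking 0.5 on beta, CaseIsolation -1.0 on gamma). The helper `stacked_multiplier` is a hypothetical name, and the sketch covers only the multiplicative case described here.

```python
from math import prod


def stacked_multiplier(values):
    """Combine fractional reductions multiplicatively: the product of (1 - value).

    Hypothetical helper for illustration; not flepiMoP's internal code.
    """
    return prod(1 - v for v in values)


# From 2020-04-15 to 2020-05-01, SchoolClosures (0.7) and Masking (0.5) both act on beta.
beta = 0.1
beta_multiplier = stacked_multiplier([0.7, 0.5])
print(beta_multiplier, beta * beta_multiplier)

# CaseIsolation is the only modifier acting on gamma; a value of -1.0 doubles it.
gamma = 0.2
gamma_multiplier = stacked_multiplier([-1.0])
print(gamma_multiplier, gamma * gamma_multiplier)
```

Note that the combined reduction of beta is 85% (multiplier 0.15), not the 120% that adding 0.7 and 0.5 would suggest.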

    modifiers::modifiers::groups

    subpop_groups: For any of the modifier types, subpop_groups is an optional list of lists specifying which subsets of subpopulations in subpop should share parameter values when parameters are drawn from a distribution or fit to data. All other subpopulations not listed will have unique intervention values unlinked to other areas. If the value is 'all', then all subpopulations will be assumed to have the same modifier value. When the subpop_groups option is not specified, all subpopulations will be assumed to have unique values of the modifier.

    For example, for a model of disease spread in Canada where we want to specify that the (to be varied) value of a modification to the transmission rate should be the same in all the Atlantic provinces (Nova Scotia, Newfoundland, Prince Edward Island, and New Brunswick), the same in all the prairie provinces (Manitoba, Saskatchewan, Alberta), the same in the three territories (Nunavut, Northwest Territories, and Yukon), and yet take unique values in Ontario, Quebec, and British Columbia, we could write:
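The grouping rule can be sketched in Python as follows: one value is drawn per group and shared by its members, and every subpopulation that is not in any group gets its own draw. The function `draw_group_values` is a hypothetical helper, the uniform range 0.3 to 0.7 is taken from the configuration example for this case, and the codes ON, QC, and BC for the ungrouped provinces are assumed here for illustration.

```python
import random


def draw_group_values(subpops, subpop_groups, draw):
    """Assign one drawn value per group; ungrouped subpops each get their own draw.

    Hypothetical helper for illustration; not flepiMoP's internal code.
    """
    if subpop_groups == "all":
        subpop_groups = [list(subpops)]
    values = {}
    for group in subpop_groups or []:
        shared = draw()  # a single draw shared by every member of the group
        for name in group:
            values[name] = shared
    for name in subpops:
        if name not in values:
            values[name] = draw()  # unique value, unlinked to other areas
    return values


rng = random.Random(1)
provinces = ["NS", "NB", "PE", "NF", "MB", "SK", "AB", "NV", "NW", "YK", "ON", "QC", "BC"]
groups = [["NS", "NB", "PE", "NF"], ["MB", "SK", "AB"], ["NV", "NW", "YK"]]

values = draw_group_values(provinces, groups, lambda: rng.uniform(0.3, 0.7))
for name in provinces:
    print(name, round(values[name], 3))
```

With three groups and three ungrouped provinces, only six distinct values are drawn for the thirteen subpopulations; passing 'all' would reduce this to a single shared value.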

    Distributions

    This page describes the configuration schema for specifying distributions.

    | Distribution | Parameters | Type/Format | Description |
    | --- | --- | --- | --- |
    | fixed | value | Any real number | Draws all values exactly equal to value |
    | uniform | low | Any real number | Draws all values randomly from a uniform distribution with range [low, high] |
    | | high | Any real number greater than low | |
    | poisson | lam | Any positive real number | Draws all values randomly from a Poisson distribution with rate parameter (mean) lam (lambda) |
    | binomial | size | Any non-negative integer | Draws all values randomly from a binomial distribution with number of trials (n) = size and probability of success on each trial (p) = prob |
    | | prob | Any number in [0,1] | |
    | lognormal | meanlog | Any real number | Draws all values randomly from a lognormal distribution (natural log, base e) with mean on a log scale of meanlog and standard deviation on a log scale of sdlog |
    | | sdlog | Any non-negative real number | |
    | truncnorm | mean | Any real number | Draws all values randomly from a truncated normal distribution with mean mean and standard deviation sd, truncated to have a minimum value of a and a maximum value of b |
    | | sd | Any non-negative real number | |
    | | a | Any real number, or -Inf | |
    | | b | Any real number greater than a, or Inf | |
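To make the distribution parameters concrete, the Python sketch below draws one value from a distribution written as a dictionary with the same keys as in the table. It uses only the standard library, covers fixed, uniform, lognormal, and truncnorm (poisson and binomial are left out for brevity), and uses simple rejection sampling for the truncated normal. The `draw` function is a hypothetical illustration and is not how flepiMoP itself samples these distributions.

```python
import random


def draw(spec, rng):
    """Draw one value from a distribution spec given as a dict (illustrative only)."""
    kind = spec["distribution"]
    if kind == "fixed":
        return spec["value"]
    if kind == "uniform":
        return rng.uniform(spec["low"], spec["high"])
    if kind == "lognormal":
        # meanlog and sdlog are the mean and standard deviation on the natural-log scale.
        return rng.lognormvariate(spec["meanlog"], spec["sdlog"])
    if kind == "truncnorm":
        # Rejection sampling: redraw until the value lies in [a, b].
        # This can be slow if [a, b] lies far out in the tails.
        while True:
            x = rng.gauss(spec["mean"], spec["sd"])
            if spec["a"] <= x <= spec["b"]:
                return x
    raise ValueError(f"unsupported distribution in this sketch: {kind}")


rng = random.Random(42)
print(draw({"distribution": "fixed", "value": 0.1}, rng))
print(draw({"distribution": "uniform", "low": 0.3, "high": 0.7}, rng))
print(draw({"distribution": "lognormal", "meanlog": -1.6, "sdlog": 0.2}, rng))
print(draw({"distribution": "truncnorm", "mean": 0.5, "sd": 0.1, "a": 0.0, "b": 1.0}, rng))
```

Here a is the lower bound and b the upper bound of the truncated normal, matching the type constraints in the table (b must be greater than a).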

    | Config item | Required | Type/format | Description |
    | --- | --- | --- | --- |
    | method | required | string | One of SinglePeriodModifier, MultiPeriodModifier, ModifierModifier, or StackedModifier |
    | parameter | required | string | The parameter on which the modification is acting. Must be a parameter defined in seir::parameters or outcomes |
    | period_start_date or periods::start_date | required | numeric, YYYY-MM-DD | The date when the modification starts. Notation depends on value of method. |
    | period_end_date or periods::end_date | required | numeric, YYYY-MM-DD | The date when the modification ends. Notation depends on value of method. |
    | subpop | required | String, or list of strings | The subpopulations to which the modifications will be applied, or "all". Subpopulations must appear in the geodata file. |
    | value | required | Distribution, or single value | The relative amount by which a modification reduces the value of a parameter. |
    | subpop_groups | optional | String, or a list of lists of strings | A list of lists defining groupings of subpopulations, which defines how modification values should be shared between them, or 'all', in which case all subpopulations are put into one group with identical modification values. By default, if parameters are chosen randomly from a distribution or fit based on data, they can have unique values in each subpopulation. |
    | baseline_scenario | Used only for ModifierModifier | String | Name of the original modification which will be further modified |
    | modifiers | Used only for StackedModifier | List of strings | List of modifier names to be grouped into the new combined modifier/scenario name |

    • groups:periods:start_date The date when the modification starts, in YYYY-MM-DD format. The modification will only reduce the value of the parameter after (inclusive of) this date.

    • groups:periods:end_date The date when the modification ends, in YYYY-MM-DD format. The modification will only reduce the value of the parameter before (inclusive of) this date.

    seir_modifiers:
      scenarios:
        - NameOfIntervention1
        - NameOfIntervention2
      modifiers:
        NameOfIntervention1:
          ...
        NameOfIntervention2:
          ...

    seir_modifiers:
      scenarios:
        - SchoolClosures
        - AllNPIs
      modifiers:
        SchoolClosures:
          method: SinglePeriodModifier
          ...
        CaseIsolation:
          method: SinglePeriodModifier
          ...
        Masking:
          method: SinglePeriodModifier
          ...
        AllNPIs:
          method: StackedModifier
          modifiers: ["SchoolClosures","CaseIsolation","Masking"]

    seir_modifiers:
      scenarios:
        - SchoolClosures
        - AllNPIs

    outcome_modifiers:
      scenarios:
        - BaselineTesting
        - TestShortage

    seir_modifiers:
      modifiers:
        lockdown: 
          method: SinglePeriodModifier
          parameter: beta
          period_start_date: 2020-03-15
          period_end_date: 2020-05-01
          subpop: ['06000', '11000']
          value: 0.7
    outcome_modifiers:
      modifiers:
        enhanced_testing: 
          method: SinglePeriodModifier
          parameter: incidC::probability
          period_start_date: 2020-03-15
          period_end_date: 2020-05-01
          subpop: ['06000', '11000']
          value: -1.0
    new_parameter_value = old_parameter_value * (1 - value)
    school_year:
      method: MultiPeriodModifier
      parameter: beta
      groups:
        - subpop: ["25000"] 
          periods:
            - start_date: 2021-09-09
              end_date: 2021-12-23
            - start_date: 2022-01-04
              end_date: 2022-06-22
        - subpop: ["12000"]
          periods:
            - start_date: 2021-08-10
              end_date: 2021-12-17
            - start_date: 2022-01-04
              end_date: 2022-05-27
      value: -0.3
    new_parameter_value = old_parameter_value * (1 - value)
    seir_modifiers:
      modifiers:
        social_distancing: 
          method: SinglePeriodModifier
          parameter: beta
          period_start_date: 2020-03-15
          period_end_date: 2020-06-30
          subpop: ['all']
          value: 0.6
        fatigue: 
          method: ModifierModifier
          baseline_scenario: social_distancing
          parameter: beta
          period_start_date: 2020-05-01
          period_end_date: 2020-06-30
          subpop: ['large_province']
          value: 0.5
    seir_modifiers:
      modifiers:
        social_distancing_initial: 
          method: SinglePeriodModifier
          parameter: beta
          period_start_date: 2020-03-15
          period_end_date: 2020-04-30
          subpop: ['all']
          value: 0.6
        social_distancing_fatigue_sp: 
          method: SinglePeriodModifier
          parameter: beta
          period_start_date: 2020-05-01
          period_end_date: 2020-06-30
          subpop: ['small_province']
          value: 0.6
        social_distancing_fatigue_lp: 
          method: SinglePeriodModifier
          parameter: beta
          period_start_date: 2020-05-01
          period_end_date: 2020-06-30
          subpop: ['large_province']
          value: 0.3
    new_intervention_value = old_intervention_value * (1 - value)
    new_parameter_value = original_parameter_value * (1 - baseline_intervention_value * (1 - value) )
    seir_modifiers:
      scenarios:
        - SchoolClosures
        - AllNPIs
      modifiers:
        SchoolClosures:
          method: SinglePeriodModifier
          parameter: beta
          period_start_date: 2020-03-15
          period_end_date: 2020-05-01
          subpop: 'all'
          value: 0.7
        CaseIsolation:
          method: SinglePeriodModifier
          parameter: gamma
          period_start_date: 2020-04-01
          period_end_date: 2020-05-01
          subpop: 'all'
          value: -1.0
        Masking:
          method: SinglePeriodModifier
          parameter: beta
          period_start_date: 2020-04-15
          period_end_date: 2020-05-01
          subpop: 'all'
          value: 0.5
        AllNPIs:
          method: StackedModifier
          modifiers: ["SchoolClosures","CaseIsolation","Masking"]

    outcome_modifiers:
      scenarios:
        - ReducedTesting
        - AllDelays
      modifiers:
        DelayedTesting:
          method: SinglePeriodModifier
          parameter: incidC::probability
          period_start_date: 2020-03-15
          period_end_date: 2020-05-01
          subpop: 'all'
          value: 0.5
        DelayedHosp:
          method: SinglePeriodModifier
          parameter: incidD::delay
          period_start_date: 2020-04-01
          period_end_date: 2020-05-01
          subpop: 'all'
          value: -1.0
        LongerHospStay:
          method: SinglePeriodModifier
          parameter: incidH::duration
          period_start_date: 2020-04-15
          period_end_date: 2020-05-01
          subpop: 'all'
          value: -0.5

    seir_modifiers:
      modifiers:
        lockdown: 
          method: SinglePeriodModifier
          parameter: beta
          period_start_date: 2020-03-15
          period_end_date: 2020-05-01
          subpop: 'all'
          subpop_groups: [['NS','NB','PE','NF'],['MB','SK','AB'],['NV','NW','YK']]
          value: 
            distribution: uniform
            low: 0.3
            high: 0.7