Other configuration options

Command line inputs

flepiMoP allows some input parameters/options to be specified in the command line at the time of model submission, in addition to or instead of in the configuration file. This can be helpful for users who want to quickly run different versions of the model (typically a different number of simulations or a different intervention scenario from among all those specified in the config) without having to edit or create a new configuration file every time. In addition, some arguments can only be specified via the command line.

In addition to the configuration file and the command line, the inputs described below can also be specified as environmental variables.

In all cases, command line arguments override configuration file entries, which in turn override environmental variables. The order of command line arguments does not matter.
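
The precedence rule above can be sketched in a few lines of Python. This is only an illustration of the stated ordering, not flepiMoP's own code: the helper resolve_option and its arguments are hypothetical names introduced here.

```python
import os


def resolve_option(cli_value, config_value, env_name, default=None, environ=None):
    """Return the first value that is set: command line, then config, then environment.

    Hypothetical helper illustrating the documented precedence only.
    """
    environ = os.environ if environ is None else environ
    if cli_value is not None:
        return cli_value
    if config_value is not None:
        return config_value
    if env_name in environ:
        return environ[env_name]
    return default


# A stand-in environment so the example does not depend on the real one.
env = {"FLEPI_NUM_SLOTS": "10"}

from_env_only = resolve_option(None, None, "FLEPI_NUM_SLOTS", environ=env)
from_config = resolve_option(None, 1, "FLEPI_NUM_SLOTS", environ=env)
from_cli = resolve_option(100, 1, "FLEPI_NUM_SLOTS", environ=env)
print(from_env_only, from_config, from_cli)
```

Here the environment value is used only when neither the command line nor the config supplies one, and the command line value wins whenever it is present.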

Details on how to run the model, including how to add command line arguments or environmental variables, are in the section How to Run.

Command-line only inputs

| Argument | Env. Variable | Value type | Description | Required? | Default |
|---|---|---|---|---|---|
| -c or --config | CONFIG_PATH | file path | Name of configuration file. Must be located in the current working directory, or else relative or absolute file path must be provided. | Yes | NA |
| -i or --first_sim_index | FIRST_SIM_INDEX | | The index of the first simulation | No | 1 |
| -j or --jobs | FLEPI_NJOBS | | Number of parallel processors used to run the simulation. If there are more slots than jobs, slots will be divided up between processors and run in series on each. | No | Number of processors on the computer used to run the simulation |
| --interactive or --batch | NA | Choose either option | Run simulation in interactive or batch mode | No | batch |
| --write-csv or --no-write-csv | NA | Choose either option | Whether model output will be saved as .csv files | No | no_write_csv |
| --write-parquet or --no-write-parquet | NA | Choose either option | Whether model output will be saved as .parquet files (a compressed representation that can be opened and manipulated with minimal memory. May be required for large simulations). Read more about parquet files. | No | write_parquet |
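
To make the relationship between slots, jobs, and the first simulation index concrete, here is a minimal sketch of one way slots could be divided among processors. The round-robin assignment and the function name assign_slots are assumptions made for illustration; the scheduling flepiMoP actually uses may differ.

```python
def assign_slots(nslots, njobs, first_sim_index=1):
    """Split simulation indices into one batch per worker, round-robin.

    Each worker would then run its batch in series. Illustrative only.
    """
    indices = range(first_sim_index, first_sim_index + nslots)
    workers = min(njobs, nslots)  # never create more workers than slots
    return [list(indices[w::workers]) for w in range(workers)]


# 10 slots spread over 4 parallel jobs, starting from simulation index 1.
batches = assign_slots(nslots=10, njobs=4)
for worker, batch in enumerate(batches):
    print(worker, batch)
```

With more slots than jobs, some workers receive one more slot than others, and every index from the first simulation index onward is assigned exactly once.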

Command-line versions of configuration file inputs

| Argument | Config item | Env. Variable | Value type | Description | Required? | Default |
|---|---|---|---|---|---|---|
| -s or --npi_scenario | interventions: scenarios | FLEPI_NPI_SCENARIOS | list of strings | Names of the intervention scenarios described in the config file that will be run. Must be a subset of scenarios defined. | No | All scenarios described in config |
| -n or --nslots | nslots | FLEPI_NUM_SLOTS | | Number of independent simulations of the model to be run | No | Config value |
| --stochastic or --non-stochastic | seir: integration: method | FLEPI_STOCHASTIC_RUN | Choose either option | Whether the model will be run stochastically or non-stochastically (deterministic numerical integration of equations using the RK4 algorithm) | No | Config value |
| --in-id | | FLEPI_RUN_INDEX | string | Unique ID given to the model runs. If the same config is run multiple times, you can avoid the output being overwritten by using unique model run IDs. | No | Constructed from current date and time as YYYY.MM.DD.HH/MM/SS |
| --out-id | | FLEPI_RUN_INDEX | string | Unique ID given to the model runs. If the same config is run multiple times, you can avoid the output being overwritten by using unique model run IDs. | No | Constructed from current date and time as YYYY.MM.DD.HH/MM/SS |
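
The rule that requested scenarios must be a subset of those defined in the config, with all scenarios run by default, could be checked along the following lines. This is a sketch under stated assumptions: select_scenarios is a hypothetical helper, not part of gempyor, and the scenario "Curfew" is an invented name used only to trigger the error.

```python
def select_scenarios(defined, requested=None):
    """Return the scenarios to run, defaulting to all scenarios in the config.

    Raises ValueError if any requested scenario is not defined.
    """
    if not requested:
        return list(defined)
    unknown = [name for name in requested if name not in defined]
    if unknown:
        raise ValueError(f"Scenarios not defined in config: {unknown}")
    return list(requested)


defined = ["None", "Lockdown"]  # as listed under interventions: scenarios
run_all = select_scenarios(defined)
run_one = select_scenarios(defined, ["None"])

try:
    select_scenarios(defined, ["Curfew"])  # hypothetical undefined scenario
except ValueError as err:
    error_message = str(err)

print(run_all, run_one, error_message)
```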

Example

As an example, consider running the following configuration file:

name: sir
setup_name: minimal
start_date: 2020-01-31
end_date: 2020-05-31
data_path: data
nslots: 1

subpop_setup:
  geodata: geodata_sample_1pop.csv
  mobility: mobility_sample_1pop.csv
  popnodes: population
  nodenames: name

seeding:
  method: FromFile
  seeding_file: data/seeding_1pop.csv

compartments:
  infection_stage: ["S", "I", "R"]

seir:
  integration:
    method: stochastic
    dt: 1 / 10
  parameters:
    gamma:
      value:
        distribution: fixed
        value: 1 / 5
    Ro:
      value:
        distribution: uniform
        low: 2
        high: 3
  transitions:
    - source: ["S"]
      destination: ["I"]
      rate: ["Ro * gamma"]
      proportional_to: [["S"],["I"]]
      proportion_exponent: ["1","1"]
    - source: ["I"]
      destination: ["R"]
      rate: ["gamma"]
      proportional_to: ["I"]
      proportion_exponent: ["1"]

interventions:
  scenarios:
    - None
    - Lockdown
  modifiers:
    None:
      method: SinglePeriodModifier
      parameter: r0
      period_start_date: 2020-04-01
      period_end_date: 2020-05-15
      value:
        distribution: fixed
        value: 0
        settings:
    Lockdown:
      method: SinglePeriodModifier
      parameter: r0
      period_start_date: 2020-04-01
      period_end_date: 2020-05-15
      value:
        distribution: fixed
        value: 0.7

To run this model directly in Python (it can alternatively be run from R; for details, see the How to Run section), we could use the command line entry:

> gempyor-seir -c sir_control.yml

Alternatively, to run 100 simulations using only 4 of the available processors on our computer, running only the "None" scenario with a deterministic model, and saving the output as .csv files (since the model is relatively simple), we could call the model using the command line entry:

> gempyor-seir -c sir_control.yml -n 100 -j 4 --npi_scenario None --non-stochastic --write-csv
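
To show how a command line with these arguments breaks down into individual settings, the following sketch parses it with Python's standard argparse module. This is a stand-in parser written for illustration, covering only the flags used in this example and spelled as in the tables above; it is not gempyor's actual command line implementation, and the program name gempyor-seir-sketch is made up.

```python
import argparse
import shlex


def build_parser():
    """Build an illustrative parser for a subset of the documented flags."""
    parser = argparse.ArgumentParser(prog="gempyor-seir-sketch")
    parser.add_argument("-c", "--config", required=True)
    parser.add_argument("-n", "--nslots", type=int)
    parser.add_argument("-j", "--jobs", type=int)
    parser.add_argument("-s", "--npi_scenario", action="append")

    # Paired on/off flags write to a single destination each.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--stochastic", dest="stochastic", action="store_true")
    mode.add_argument("--non-stochastic", dest="stochastic", action="store_false")

    csv_group = parser.add_mutually_exclusive_group()
    csv_group.add_argument("--write-csv", dest="write_csv", action="store_true")
    csv_group.add_argument("--no-write-csv", dest="write_csv", action="store_false")

    # None means "not given on the command line", so the config value would apply.
    parser.set_defaults(stochastic=None, write_csv=False)
    return parser


command = "-c sir_control.yml -n 100 -j 4 --npi_scenario None --non-stochastic --write-csv"
args = build_parser().parse_args(shlex.split(command))
print(vars(args))
```

Leaving the stochastic setting as None when neither flag is given is one way to let the config value apply when the command line is silent.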

Environmental variables

TBA

US-specific configuration file options

The material below is very out of date. It is kept here as a placeholder and has not been updated recently.

global: smh_round, setup_name, disease

spatial_setup: census_year, modeled_states, state_level

For US-specific population structures

For creating US-based population structures using the helper script build_US_setup.R, which is run before the main model simulation script, the following extra parameters can be specified:

| Config Item | Required? | Type/Format | Description |
|---|---|---|---|
| census_year | optional | integer (year) | Determines the year for which census population size data is pulled. |
| state_level | optional | boolean | Determines whether county-level population-size data is instead grouped into state-level data (TRUE). Default FALSE |
| modeled_states | optional | list of location codes | A vector of locations that will be modeled; others will be ignored |

Example 2

To simulate an epidemic across all 50 states of the US or a subset of them, users can take advantage of built-in machinery to create geodata and mobility files for the US based on the population size and number of daily commuting trips reported in the US Census.

Before running the simulation, the script build_US_setup.R can be run to get the required population data files from online census data and filter out only states/territories of interest for the model. More details are provided in the How to Run section.

This example simulates COVID-19 in the New England states, assuming no transmission from other states, using 2019 census data for the population sizes and a pre-created file for estimated interstate commutes during the 2011-2015 period.

subpop_setup:
  census_year: 2019
  state_level: TRUE
  geodata: geodata_2019_statelevel.csv
  mobility: mobility_2011-2015_statelevel.csv
  modeled_states:
    - CT
    - MA
    - ME
    - NH
    - RI
    - VT
  

geodata_2019_statelevel.csv contains

USPS	subpop	population
AL	01000	4876250
AK	02000	737068
AZ	04000	7050299
AR	05000	2999370
CA	06000	39283497
.....

mobility_2011-2015_statelevel.csv contains

ori	dest	amount
01000	02000	198
01000	04000	292
01000	05000	570
01000	06000	1030
01000	08000	328
.....
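
As a rough illustration of what restricting the model to a set of states implies for these two files, the sketch below filters the sample rows shown above. It is not the build_US_setup.R script: the function names are hypothetical, the files are assumed to be tab-separated as displayed, and because the excerpt does not include the New England rows, the states AL, AZ, and CA are used purely as a stand-in for modeled_states.

```python
import csv
import io

# Sample rows copied from the excerpts above, joined as tab-separated text.
geodata_text = "\n".join([
    "USPS\tsubpop\tpopulation",
    "AL\t01000\t4876250",
    "AK\t02000\t737068",
    "AZ\t04000\t7050299",
    "AR\t05000\t2999370",
    "CA\t06000\t39283497",
])
mobility_text = "\n".join([
    "ori\tdest\tamount",
    "01000\t02000\t198",
    "01000\t04000\t292",
    "01000\t05000\t570",
    "01000\t06000\t1030",
    "01000\t08000\t328",
])


def read_tsv(text):
    """Parse tab-separated text into a list of dictionaries keyed by header."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def filter_to_states(geodata, mobility, modeled_states):
    """Keep only modeled states, and only flows between two modeled states."""
    kept = [row for row in geodata if row["USPS"] in modeled_states]
    kept_ids = {row["subpop"] for row in kept}
    flows = [
        row for row in mobility
        if row["ori"] in kept_ids and row["dest"] in kept_ids
    ]
    return kept, flows


kept, flows = filter_to_states(
    read_tsv(geodata_text), read_tsv(mobility_text), ["AL", "AZ", "CA"]
)
print([row["USPS"] for row in kept])
print([(row["ori"], row["dest"], row["amount"]) for row in flows])
```

Dropping every flow with an endpoint outside the modeled states corresponds to the assumption of no transmission from other states.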

importation section (optional)

This section is optional. It is used by the covidImportation package to import global air importation data for seeding infections into the United States.

If you wish to include it, here are the options.

| Config Item | Required? | Type/Format | Description |
|---|---|---|---|
| census_api_key | required | string | |
| travel_dispersion | required | number | How dispersed daily travel data is; default = 3. |
| maximum_destinations | required | integer | number of airports to limit importation to |
| dest_type | required | categorical | location type |
| dest_country | required | string (Country) | ISO3 code for country of importation. Currently only USA is supported |
| aggregate_to | required | categorical | location type to aggregate to |
| cache_work | required | boolean | whether to save case data |
| update_case_data | required | boolean | deprecated; whether to update the case data or use saved |
| draw_travel_from_distribution | required | boolean | whether to add additional stochasticity to travel data; default is FALSE |
| print_progress | required | boolean | whether to print progress of importation model simulations |
| travelers_threshold | required | integer | include airports with at least the travelers_threshold mean daily number of travelers |
| airport_cluster_distance | required | numeric | cluster airports within airport_cluster_distance km |
| param_list | required | See section below | see below |

importation::param_list

| Config Item | Required? | Type/Format | Description |
|---|---|---|---|
| incub_mean_log | required | numeric | incubation period, log mean |
| incub_sd_log | required | numeric | incubation period, log standard deviation |
| inf_period_nohosp_mean | required | numeric | infectious period, non-hospitalized, mean |
| inf_period_nohosp_sd | required | numeric | infectious period, non-hospitalized, sd |
| inf_period_hosp_mean_log | required | numeric | infectious period, hospitalized, log-normal mean |
| inf_period_hosp_sd_log | required | numeric | infectious period, hospitalized, log-normal sd |
| p_report_source | required | numeric | reporting probability, Hubei and elsewhere |
| shift_incid_days | required | numeric | mean delay from infection to reporting of cases; default = -10 |
| delta | required | numeric | days per estimations period |

importation:
  census_api_key: "fakeapikey00000"
  travel_dispersion: 3
  maximum_destinations: Inf
  dest_type: state
  dest_country: USA
  aggregate_to: airport
  cache_work: TRUE
  update_case_data: TRUE
  draw_travel_from_distribution: FALSE
  print_progress: FALSE
  travelers_threshold: 10000
  airport_cluster_distance: 80
  param_list:
    incub_mean_log: log(5.89)
    incub_sd_log: log(1.74)
    inf_period_nohosp_mean: 15
    inf_period_nohosp_sd: 5
    inf_period_hosp_mean_log: 1.23
    inf_period_hosp_sd_log: 0.79
    p_report_source: [0.05, 0.25]
    shift_incid_days: -10
    delta: 1
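
The log-scale incubation parameters can be hard to read at a glance, so here is a small sketch of what they would imply. It assumes the incubation period is log-normally distributed and that log(5.89) and log(1.74) denote natural logarithms. Both are assumptions made for illustration, not statements about how the covidImportation package treats these values.

```python
import math
import random

# Values from param_list above, assuming natural logarithms.
incub_mean_log = math.log(5.89)
incub_sd_log = math.log(1.74)

# For a log-normal distribution, exp(log mean) is the median on the natural scale,
# and the mean is larger because of the right skew.
median_days = math.exp(incub_mean_log)
mean_days = math.exp(incub_mean_log + incub_sd_log ** 2 / 2)

# Draw a few example incubation periods (in days) with a fixed seed.
rng = random.Random(1)
draws = [rng.lognormvariate(incub_mean_log, incub_sd_log) for _ in range(5)]

print(round(median_days, 2), round(mean_days, 2))
print([round(d, 1) for d in draws])
```

Under these assumptions, 5.89 is the median incubation period in days rather than the mean.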

report section

The report section is completely optional and provides settings for making an R Markdown report. For an example of a report, see the Supplementary Material of our preprint.

If you wish to include it, here are the options.

| Config Item | Required? | Type/Format | Description |
|---|---|---|---|
| data_settings::pop_year | | integer | |
| plot_settings::plot_intervention | | boolean | |
| formatting::scenario_labels_short | | list of strings; one for each scenario in interventions::scenarios | |
| formatting::scenario_labels | | list of strings; one for each scenario in interventions::scenarios | |
| formatting::scenario_colors | | list of strings; one for each scenario in interventions::scenarios | |
| formatting::pdeath_labels | | list of strings | |
| formatting::display_dates | | list of dates | |
| formatting::display_dates2 | optional | list of dates | a 2nd string of display dates that can optionally be supplied to specific report functions |

report:
  data_settings:
    pop_year: 2018
  plot_settings:
    plot_intervention: TRUE
  formatting:
    scenario_labels_short: ["UC", "S1"]
    scenario_labels:
      - Uncontrolled
      - Scenario 1
    scenario_colors: ["#D95F02", "#1B9E77"]
    pdeath_labels: ["0.25% IFR", "0.5% IFR", "1% IFR"]
    display_dates: ["2020-04-15", "2020-05-01", "2020-05-15", "2020-06-01", "2020-06-15"]
    display_dates2: ["2020-04-15", "2020-05-15", "2020-06-15"]
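
Since several formatting entries need one value per scenario, a quick consistency check along these lines may help catch mismatches before a report is built. This is a hypothetical helper, not part of the report code; the two-scenario list is taken from the earlier example config.

```python
def check_report_labels(scenarios, formatting):
    """Return a list of problems where a per-scenario list has the wrong length."""
    problems = []
    for key in ("scenario_labels_short", "scenario_labels", "scenario_colors"):
        values = formatting.get(key, [])
        if len(values) != len(scenarios):
            problems.append(
                f"{key}: expected {len(scenarios)} entries, found {len(values)}"
            )
    return problems


scenarios = ["None", "Lockdown"]
formatting = {
    "scenario_labels_short": ["UC", "S1"],
    "scenario_labels": ["Uncontrolled", "Scenario 1"],
    "scenario_colors": ["#D95F02", "#1B9E77"],
}

ok_problems = check_report_labels(scenarios, formatting)
bad_problems = check_report_labels(scenarios, {**formatting, "scenario_colors": ["#D95F02"]})
print(ok_problems)
print(bad_problems)
```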
