(OLD) Configuration options

filtering section

The filtering section configures the inference algorithm. The example below shows typical default settings, in which the model is calibrated to weekly incident deaths and weekly incident confirmed cases for each subpop. Each entry under statistics, hierarchical_stats_geo, and priors has a user-defined scenario name (e.g., sum_deaths, local_var_hierarchy, and local_var_prior, respectively).

filtering:
  simulations_per_slot: 350
  do_filtering: TRUE
  data_path: data/observed_data.csv
  likelihood_directory: importation/likelihood/
  statistics:
    sum_deaths:
      name: sum_deaths
      aggregator: sum ## function applied over the period
      period: "1 weeks"
      sim_var: incidD
      data_var: death_incid
      remove_na: TRUE
      add_one: FALSE
      likelihood:
        dist: sqrtnorm
        param: [.1]
    sum_confirmed:
      name: sum_confirmed
      aggregator: sum
      period: "1 weeks"
      sim_var: incidC
      data_var: confirmed_incid
      remove_na: TRUE
      add_one: FALSE
      likelihood:
        dist: sqrtnorm
        param: [.2]
  hierarchical_stats_geo:
    local_var_hierarchy:
      name: local_variance
      module: seir
      geo_group_col: USPS
      transform: none
    local_conf:
      name: probability_incidI_incidC
      module: hospitalization
      geo_group_col: USPS
      transform: logit
  priors:
    local_var_prior:
      name: local_variance
      module: seir
      likelihood:
        dist: normal
        param:
        - 0
        - 1

filtering settings

In inference model runs, nsimulations is the number of final model simulations that will be produced. The filtering$simulations_per_slot setting is the number of iterative simulations that are run to produce a single final simulation (i.e., the number of iterations in a single MCMC chain).
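The relationship between the two settings can be sketched with a little arithmetic. This is a minimal illustration, assuming each final simulation comes from exactly one chain; the value of nsimulations below is hypothetical and the helper name total_model_runs is not part of the package.

```python
# Hypothetical value: nsimulations is set elsewhere in the configuration.
nsimulations = 10
# From the example configuration above.
simulations_per_slot = 350


def total_model_runs(nsimulations: int, simulations_per_slot: int) -> int:
    """Total iterations, assuming one chain of simulations_per_slot
    iterations is run for each final simulation."""
    return nsimulations * simulations_per_slot


print(total_model_runs(nsimulations, simulations_per_slot))  # 3500
```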

| Item | Required? | Type/Format |
|---|---|---|
| simulations_per_slot | required | number of iterations in a single MCMC inference chain |
| do_filtering | required | TRUE if inference should be performed |
| data_path | required | file path where observed data are saved |
| likelihood_directory | required | folder path where likelihood evaluations will be stored as the inference algorithm runs |
| statistics | required | specifies which data will be used to calibrate the model. See filtering::statistics for details. |
| hierarchical_stats_geo | optional | specifies whether a hierarchical structure should be applied to any inferred parameters. See filtering::hierarchical_stats_geo for details. |
| priors | optional | specifies prior distributions on inferred parameters. See filtering::priors for details. |
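The required and optional items listed above can be checked before a run. The following is a minimal sketch, assuming the filtering section has already been loaded into a Python dictionary; check_filtering_section is a hypothetical helper for illustration, not a function provided by the package.

```python
# Keys taken from the filtering settings listed above.
REQUIRED_KEYS = {
    "simulations_per_slot",
    "do_filtering",
    "data_path",
    "likelihood_directory",
    "statistics",
}
OPTIONAL_KEYS = {"hierarchical_stats_geo", "priors"}


def check_filtering_section(filtering: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    present = set(filtering)
    problems = []
    for key in sorted(REQUIRED_KEYS - present):
        problems.append(f"missing required key: {key}")
    for key in sorted(present - REQUIRED_KEYS - OPTIONAL_KEYS):
        problems.append(f"unrecognized key: {key}")
    return problems


example = {
    "simulations_per_slot": 350,
    "do_filtering": True,
    "data_path": "data/observed_data.csv",
    "likelihood_directory": "importation/likelihood/",
    "statistics": {},
    "priors": {},
}
print(check_filtering_section(example))  # []
```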

filtering::statistics

The statistics specified here are used to calibrate the model to empirical data. If multiple statistics are specified, inference is performed jointly, and the statistics are weighted in the likelihood according to the number of data points and the variance of the proposal distribution.

| Item | Required? | Type/Format |
|---|---|---|
| name | required | name of statistic, user defined |
| aggregator | required | function used to aggregate data over the period, usually sum or mean |
| period | required | duration over which data should be aggregated prior to use in the likelihood; may be specified in any number of days, weeks, or months |
| sim_var | required | column name where model data can be found, from the hospitalization outcomes files |
| data_var | required | column where data can be found in the data_path file |
| remove_na | required | logical |
| add_one | required | logical, TRUE if evaluating the log likelihood |
| likelihood::dist | required | distribution of the likelihood |
| likelihood::param | required | parameter value(s) for the likelihood distribution. These differ by distribution, so check the logLikStat function in inference/R/functions.R. |
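To make the roles of aggregator, period, and likelihood more concrete, here is a minimal sketch of how a single statistic might be evaluated. It rests on two assumptions that are not confirmed by this page: that periods are fixed windows counted from the first date, and that sqrtnorm means a normal density on the square-root scale with standard deviation param * sqrt(sim). The authoritative definition is the logLikStat function; all function names below are hypothetical.

```python
import math
from collections import defaultdict
from datetime import date, timedelta


def aggregate_by_period(dates, values, period_days=7, aggregator=sum):
    """Apply aggregator to values grouped into consecutive windows of
    period_days, counted from the earliest date (an assumption)."""
    start = min(dates)
    buckets = defaultdict(list)
    for d, v in zip(dates, values):
        buckets[(d - start).days // period_days].append(v)
    return [aggregator(buckets[k]) for k in sorted(buckets)]


def normal_logpdf(x, mean, sd):
    """Log density of Normal(mean, sd) evaluated at x."""
    return -0.5 * math.log(2 * math.pi * sd ** 2) - (x - mean) ** 2 / (2 * sd ** 2)


def sqrtnorm_loglik(obs, sim, param):
    """ASSUMED form of 'sqrtnorm': sqrt(obs) ~ Normal(sqrt(sim), param * sqrt(sim)).
    Requires strictly positive simulated values."""
    return sum(
        normal_logpdf(math.sqrt(o), math.sqrt(s), param * math.sqrt(s))
        for o, s in zip(obs, sim)
    )


# Two weeks of made-up daily data.
days = [date(2020, 3, 1) + timedelta(days=i) for i in range(14)]
weekly_obs = aggregate_by_period(days, [1] * 14)  # [7, 7]
weekly_sim = aggregate_by_period(days, [2] * 14)  # [14, 14]
print(weekly_obs, weekly_sim)
print(sqrtnorm_loglik(weekly_obs, weekly_sim, param=0.1))
```

Passing a different aggregator (for example statistics.mean) changes only the function applied within each window.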

filtering::hierarchical_stats_geo

The hierarchical settings specified here are used to group the inference of certain parameters together (similar to inference in "hierarchical" or "fixed/group effects" models). For example, users may want to group all counties in a given state because they are geographically proximate and affected by the same statewide policies. The intended effect is for the grouped parameters to follow a normal distribution and for the variance among the grouped estimates to shrink.

| Item | Required? | Type/Format |
|---|---|---|
| scenario name | required | name of hierarchical scenario, user defined |
| name | required | name of the estimated parameter that will be grouped (e.g., the NPI scenario name or a standardized, combined health outcome name like probability_incidI_incidC) |
| module | required | name of the module where this parameter is estimated (important for finding the appropriate files) |
| geo_group_col | required | geodata column name that should be used to group parameter estimation |
| transform | required | type of transform that should be applied to the likelihood: "none" or "logit" |
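One way to picture the grouping is as an extra likelihood term that rewards parameter values for sitting close to their group mean. The sketch below is an illustration under that assumption and is not taken from the package: values are optionally logit-transformed, grouped by a geodata column such as USPS, and scored with a normal density around each group's mean. The function hierarchical_loglik and the min_sd floor are hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def logit(p):
    """Map a probability in (0, 1) to the real line."""
    return math.log(p / (1 - p))


def normal_logpdf(x, mu, sd):
    """Log density of Normal(mu, sd) evaluated at x."""
    return -0.5 * math.log(2 * math.pi * sd ** 2) - (x - mu) ** 2 / (2 * sd ** 2)


def hierarchical_loglik(values, groups, transform="none", min_sd=1e-3):
    """Sum of normal log densities of each (transformed) value around the
    mean of its group. ASSUMED form, for illustration only."""
    if transform == "logit":
        values = [logit(v) for v in values]
    elif transform != "none":
        raise ValueError(f"unsupported transform: {transform}")
    by_group = defaultdict(list)
    for g, v in zip(groups, values):
        by_group[g].append(v)
    total = 0.0
    for vals in by_group.values():
        if len(vals) < 2:
            continue  # spread cannot be estimated from a single value
        mu = mean(vals)
        sd = max(stdev(vals), min_sd)  # floor avoids division by zero
        total += sum(normal_logpdf(v, mu, sd) for v in vals)
    return total


# Made-up county-level probabilities grouped by state (USPS code).
tight = hierarchical_loglik([0.10, 0.11, 0.12], ["MD", "MD", "MD"], "logit")
spread = hierarchical_loglik([0.05, 0.20, 0.50], ["MD", "MD", "MD"], "logit")
print(tight > spread)  # True: tightly clustered estimates score higher
```

Under this reading, estimates that cluster within a group are favored, which is the shrinkage described above.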

filtering::priors

Prior distributions can be specified for inferred parameters. Doing so can speed up model convergence.

| Item | Required? | Type/Format |
|---|---|---|
| scenario name | required | name of prior scenario, user defined |
| name | required | name of NPI scenario or parameter that will have the prior |
| module | required | name of the module where this parameter is estimated |
| likelihood | required | specifies the distribution of the prior |
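As a rough illustration of how a prior enters the calculation, the sketch below evaluates a normal prior with the parameters from the example configuration (mean 0, standard deviation 1) and adds it to a log likelihood. It assumes that param for dist: normal is [mean, sd]; the function names and the log likelihood value are hypothetical.

```python
import math


def normal_logpdf(x, mu, sd):
    """Log density of Normal(mu, sd) evaluated at x."""
    return -0.5 * math.log(2 * math.pi * sd ** 2) - (x - mu) ** 2 / (2 * sd ** 2)


def log_prior(value, dist, param):
    """Log prior density. ASSUMES param = [mean, sd] when dist is 'normal'."""
    if dist == "normal":
        mu, sd = param
        return normal_logpdf(value, mu, sd)
    raise ValueError(f"unsupported prior distribution: {dist}")


def log_posterior(log_likelihood, value, dist, param):
    """Unnormalized log posterior: log likelihood plus log prior."""
    return log_likelihood + log_prior(value, dist, param)


# Prior from the example configuration: dist normal, param [0, 1].
made_up_loglik = -12.5  # placeholder value, not a real model result
for candidate in (0.0, 2.0):
    print(candidate, log_posterior(made_up_loglik, candidate, "normal", [0, 1]))
```

Values far from the prior mean are penalized, which narrows the region the chain needs to explore.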
