This page, along with the other AWS run guides, is not deprecated, in case we need to run flepiMoP
on AWS again in the future, but it is also not maintained, as other platforms (such as longleaf and rockfish) are preferred for running production jobs.
See Building a configuration file.
Spin up an Ubuntu submission box if one is not already running. To do this, log in to the AWS Console and start the EC2 instance.
Update the IP address in your .ssh/config file. To do this, open a terminal and type the command below. This will open your config file, where you can change the IP to the IPv4 address assigned to the AWS EC2 instance (see the AWS Console for this):
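If you would rather script this edit than change the file by hand, a minimal sketch follows. It assumes the instance is listed in the config file under a Host alias (such as staging) with a HostName line written in exactly that capitalization. The function name update_ssh_host_ip and the IP address in the usage comment are hypothetical.

```shell
# Hypothetical helper: set the HostName of one Host block in an ssh config file.
# Usage: update_ssh_host_ip <config-file> <host-alias> <new-ip>
# Example (placeholder IP): update_ssh_host_ip ~/.ssh/config staging 192.0.2.10
update_ssh_host_ip() {
    config_file=$1
    host_alias=$2
    new_ip=$3
    tmp_file=$(mktemp) || return 1
    awk -v target="$host_alias" -v ip="$new_ip" '
        # Track whether we are inside the Host block for the requested alias.
        $1 == "Host" { in_block = ($2 == target) }
        # Replace only the HostName line inside that block.
        in_block && $1 == "HostName" { print "    HostName " ip; next }
        { print }
    ' "$config_file" > "$tmp_file" || { rm -f "$tmp_file"; return 1; }
    # Write back in place so the original file keeps its permissions.
    cat "$tmp_file" > "$config_file"
    rm -f "$tmp_file"
}
```

Only the HostName line inside the matching Host block is rewritten; every other entry in the file is left as it was.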
SSH into the box. In the terminal, SSH into your box. Typically we name these instances "staging", so usually the command is:
Now you should be logged onto the AWS submission box.
Update the GitHub repositories. The example below assumes you are running the main branch in Flu_USA and the main branch in COVIDScenarioPipeline, and that you have already cloned the appropriate repositories on your EC2 instance. Have your GitHub SSH key passphrase handy so you can paste it when prompted (possibly multiple times) by the git pull command. Alternatively, you can add your GitHub key to your batch box so you do not have to log in repeatedly (see X).
Start the Docker container. Start up and log into the Docker container, pull the repos from GitHub, and run the setup scripts to set up the environment. This setup code links the Docker directories to the existing directories on your box. Because of this, do not run more than one job submission at a time with this setup, as one job submission might modify the data used by another.
To run via AWS, we first do a setup run locally (in Docker on the submission EC2 box).
Set up environment variables. Modify the code chunk below and submit it in the terminal. We also clear certain files and model output that get generated in the submission process. If these files exist in the repo, they may not get cleared and could cause issues. You need to modify the variable values in the first 4 lines below: SCENARIO, VALIDATION_DATE, COVID_MAX_STACK_SIZE, and COMPUTE_QUEUE. If submitting multiple jobs, it is recommended to split the jobs between 2 queues: Compartment-JQ-1588569569 and Compartment-JQ-1588569574.
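As a rough illustration of the four variables named above, the sketch below sets them and then checks that none was left empty. Every value is a placeholder to replace (including the date format, which is an assumption), and check_required_vars is a hypothetical helper, not part of flepiMoP. The file-clearing commands mentioned above are not reproduced here.

```shell
# Placeholder values: replace every value below before submitting.
export SCENARIO="example_scenario"                 # hypothetical scenario label
export VALIDATION_DATE="2023-01-01"                # hypothetical date; format is an assumption
export COVID_MAX_STACK_SIZE=1000                   # hypothetical value
export COMPUTE_QUEUE="Compartment-JQ-1588569574"   # one of the two queues named above

# Hypothetical helper: report each listed variable that is unset or empty.
# Returns 0 when all are set and 1 otherwise.
check_required_vars() {
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}

check_required_vars SCENARIO VALIDATION_DATE COVID_MAX_STACK_SIZE COMPUTE_QUEUE
```

Running the check right after the exports catches a forgotten variable before the setup run, rather than partway through a submission.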
If not resuming off previous run:
If resuming from a previous run, there are a couple of additional variables to set. This is the same for a regular resume or a continuation resume. Specifically:
RESUME_ID - the COVID_RUN_INDEX of the run being resumed
RESUME_S3 - the S3 bucket where the previous run is stored
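A minimal sketch of setting these two variables is shown below. Both values are hypothetical placeholders, and the exact form expected for the bucket (for example, with or without a prefix) is not stated here, so check it against a previous submission.

```shell
# Placeholder values: both are hypothetical and must be replaced.
export RESUME_ID="previous-run-index"    # the COVID_RUN_INDEX of the run being resumed
export RESUME_S3="example-bucket-name"   # the S3 bucket holding that run's output

# Abort early with a message if either variable is unset or empty.
: "${RESUME_ID:?RESUME_ID must be set for a resume run}"
: "${RESUME_S3:?RESUME_S3 must be set for a resume run}"
```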
Preliminary model run. We do a setup run with 1 to 2 iterations to make sure the model runs and to set up the input data. This takes several minutes to complete, depending on how complex the simulation is. To do this, run the following code chunk, with no modification of the code required:
Configure AWS. Assuming that the simulations finish successfully, you will now enter credentials and submit your job to AWS Batch. Enter the following command into the terminal:
You will be prompted to enter the following items. These can be found in a file called new_user_credentials.csv (the Access Key ID and Secret Access Key will be given to you once, in a file):
Access key ID when prompted
Secret access key when prompted
Default region name: us-west-2
Default output format: leave blank and press Enter when prompted
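The four prompts listed above match those of the AWS CLI's aws configure command, so this step presumably looks like the sketch below. The wrapper function configure_aws is hypothetical; it only adds a check that the AWS CLI is installed before starting the interactive prompts.

```shell
# Hypothetical wrapper: run the interactive AWS CLI setup if the CLI is available.
configure_aws() {
    if ! command -v aws >/dev/null 2>&1; then
        echo "aws CLI not found on PATH" >&2
        return 1
    fi
    # Prompts for the access key ID, secret access key, region, and output format.
    aws configure
}

# Call it from the terminal when ready to enter credentials:
#   configure_aws
```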
Launch the job. To launch the job, use the appropriate setup based on the type of job you are doing. No modification of these code chunks should be required.
NOTE: Resume and Continuation Resume runs are currently submitted the same way, resuming from an S3 that was generated manually. Typically we will also submit any Continuation Resume run specifying --resume-carry-seeding, as starting seeding conditions will be manually constructed and put in the S3.
Carrying seeding (do this to use seeding fits from resumed run):
Discarding seeding (do this to refit seeding again):
Single Iteration + Carry seeding (do this to produce additional scenarios where no fitting is required):
NOTE: A Resume and a Continuation Resume are currently submitted the same way, but with --resume-carry-seeding specified and resuming from an S3 that was generated manually.
Commit files to GitHub. After the job is successfully submitted, you will be on a new branch of the population repo. Commit the ground truth data files to the branch on GitHub and then return to the main branch:
Save submission info to Slack. We use a Slack channel to save the submission information that is printed at the end of the submission. Copy this to Slack so you can identify the job later. Example output: