To capture lessons from Chelsea’s pandemic response to COVID-19, CCI researchers have built an empirical Bayes agent-based policy simulator for the Chelsea Project. The tool models demographics, mobility, transmission, infection, and disease dynamics using a large number of datasets, including anonymized aggregated phone location data for Massachusetts and census data.
The model is calibrated and then verified through posterior predictive checks against ground-truth data, including COVID-19 outcome data collected by the state, wastewater data collected by CCI-TCP, and hospital data. Chelsea city officials can use the simulator to retrospectively simulate the effects of different policy interventions on COVID-19 outcomes. For example, the simulator can estimate what would have happened if mask-wearing at the city and state levels had been absent or lower in 2020.
By mapping out alternative scenarios, the policy simulator can be used as an educational tool for policy makers, and can help the city advocate for resources that may prevent future public health crises.
CCI-TCP wastewater data is in agreement with state testing data and hospital data.
Mask wearing made a huge difference in Chelsea: with masks, the peak number of cases was less than half of what it would otherwise have been, and infections lasted for less time.
Helping Chelsea directly helps the rest of the state: in our simulations, the areas exposed to infections originating in Chelsea have a total population of more than 529,000.
We have built and deployed an empirical Bayes agent-based model that uses Monte Carlo sampling of distributions built from a variety of datasets (see “Data used” section below) to model mobility, infection, and transmission at the individual level for the 6.8 million inhabitants of the state of Massachusetts from February 1, 2020 to March 1, 2021. For now, we end the simulation in March 2021 because this is when variants and vaccination become more prevalent, and we do not model these yet.
The model was first calibrated at the state level, after which we used a variety of posterior predictive checks to verify the accuracy of our models in Chelsea against three “ground-truth” measures of COVID-19 cases in the city: MAVEN hospital data, CCI-TCP wastewater data, and the DPH city-level Public Health Reports.
Anonymized, aggregated mobility data from SafeGraph describing movement between census block groups and points of interest. This allows us to create a transport layer for disease transmission between home and places visited for work, school, business, leisure, etc. Overall, we model the mobility between more than 4,985 census block groups and 77,000+ points of interest (e.g. a Dunkin Donuts or a school) in the state.
Social distancing index from the University of Maryland
Disease dynamics (e.g. median probability of hospitalization) and the Covasim model from IDM
Census data to build prior distributions from which demographics (e.g. age, gender, etc.) of people within a census block group (CBG) are sampled
Relative time spent at home to model transmission at home.
Our model is very parsimonious: only 8 parameters need to be fitted, even though the model predicts infection, transmission, and disease progression at the level of the whole state.
We obtain the calibrated model by performing random search with rejection sampling over the following 8 parameters:
Start date of infection (date of patient zero introduction)
Base probability of transmission between individuals
Date of introduction of face masks
Fractional effect of masking on the probability of infection
Proportion of community contacts within a census block group (e.g. as people walk around in their neighborhood)
Multiplicative factor onto the base probability of infection between people as they commute and travel, e.g. from home to work or to places of entertainment (depends on the square footage of the business and the median time spent at the location)
Multiplicative factor onto the base probability of infection between people at home, e.g. within-family transmission (depends on time spent at home and household size)
Multiplicative factor onto the base probability of infection between people in their community, e.g. as people walk in their neighborhood (depends on the density of the block)
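The random search with rejection sampling over these parameters can be illustrated with a minimal sketch. It is not the project's code: `toy_simulator`, `distance`, the prior ranges, the tolerance, and the contact and recovery constants are all hypothetical. For brevity it fits only two of the eight parameters (base probability of transmission and fractional mask effect), and it uses synthetic "observed" data generated by the toy simulator itself.

```python
import random

def toy_simulator(base_prob, mask_effect, days=60, mask_day=30, seed_cases=5.0):
    """Hypothetical stand-in for the agent-based model: returns daily case counts."""
    cases = [seed_cases]
    for day in range(1, days):
        # Masks scale the transmission probability from mask_day onward.
        p = base_prob * (1.0 - mask_effect) if day >= mask_day else base_prob
        # Illustrative constants only: 4 contacts per day, 10% daily recovery.
        growth = 1.0 + 4.0 * p - 0.1
        cases.append(cases[-1] * growth)
    return cases

def distance(simulated, observed):
    """Mean absolute relative error between two daily case series."""
    errors = [abs(s - o) / max(o, 1.0) for s, o in zip(simulated, observed)]
    return sum(errors) / len(errors)

def rejection_sample(observed, n_draws=5000, tolerance=0.2, seed=0):
    """Draw parameters from uniform priors; keep draws whose simulation fits."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = {
            "base_prob": rng.uniform(0.01, 0.08),   # assumed prior range
            "mask_effect": rng.uniform(0.0, 0.6),   # assumed prior range
        }
        if distance(toy_simulator(**theta), observed) <= tolerance:
            accepted.append(theta)
    return accepted

# Synthetic ground truth, generated with known parameters for the demo.
observed = toy_simulator(base_prob=0.05, mask_effect=0.3)
posterior = rejection_sample(observed)
print(len(posterior), "parameter sets accepted")
```

The accepted parameter sets form an approximate posterior sample. Running the simulator once per accepted set gives the ensemble of trajectories from which a median and a credible band can be drawn.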
As shown in the figure below, the median prediction of the model (thick blue line) and the 10th-90th percentile credible interval (shaded area) agree with the ground truth, both during the first wave (Feb to Mar 2020) and the second wave (Sep 2020 to March 2021).
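As a rough illustration of how such a median line and 10th-90th percentile band can be computed from an ensemble of simulated trajectories, here is a minimal sketch. The function names, the interpolation rule, and the toy numbers are assumptions for illustration and are not taken from the project.

```python
import statistics

def percentile(values, q):
    """Percentile with linear interpolation, for q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def summarize_ensemble(runs):
    """runs: list of equal-length daily case series, one per simulation run."""
    bands = []
    for day_values in zip(*runs):  # iterate day by day across all runs
        bands.append({
            "p10": percentile(day_values, 10),
            "median": statistics.median(day_values),
            "p90": percentile(day_values, 90),
        })
    return bands

def coverage(bands, observed):
    """Fraction of observed days that fall inside the 10th-90th band."""
    inside = sum(1 for b, y in zip(bands, observed) if b["p10"] <= y <= b["p90"])
    return inside / len(observed)

# Toy ensemble of five runs over two days (made-up numbers).
runs = [[1, 2], [2, 4], [3, 6], [4, 8], [5, 10]]
bands = summarize_ensemble(runs)
print(bands[0]["median"], coverage(bands, [3, 20]))
```

A coverage value close to the nominal 80% for held-out ground-truth data is one simple form of posterior predictive check.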
Our calibrated model described earlier represents our best understanding of how infections spread in Chelsea in 2020. Through this model, we can simulate retrospectively what would have happened if policies had been implemented differently both at the local (Chelsea) and state level.
For example, we can infer what effect a reduction in mask wearing starting in early April would have had, and compare this inference to our calibrated model, in which masks reduce the probability of transmission by 30%.
Note that the blue curve below is the same as the blue curve of the calibrated model in the earlier plot. The red curve represents a simulated scenario where mask wearing was reduced or absent, but all else is equal.
As shown in the figure above, a lack of mask wearing leads to more than twice the number of peak daily cases, and infections persist for much longer: both the peak and the duration of the outbreak are higher. This suggests that mask wearing made a huge difference!
This is the power of a retrospective policy simulator: given a calibrated model, it allows us to simulate what would have happened if different policies had been in place.
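The logic of such a counterfactual comparison (same model, same parameters, only the mask effect changed) can be sketched with a deliberately simple compartmental toy. This is not the agent-based simulator, and every number other than the 30% mask effect is a made-up assumption. The toy reports only peak and cumulative infections. It is not intended to reproduce the peak ratio or the duration result reported here, which come from the full model.

```python
def run_sir(base_prob, mask_effect, mask_day, days=300, population=40000.0,
            contacts_per_day=8.0, recovery_rate=0.1, seed_infected=10.0):
    """Toy discrete-time SIR model; returns new infections per day."""
    susceptible = population - seed_infected
    infected = seed_infected
    daily_new = []
    for day in range(days):
        # Masks scale the per-contact transmission probability from mask_day on.
        p = base_prob * (1.0 - mask_effect) if day >= mask_day else base_prob
        new_inf = contacts_per_day * p * infected * susceptible / population
        new_inf = min(susceptible, new_inf)  # cannot infect more than remain
        susceptible -= new_inf
        infected += new_inf - recovery_rate * infected
        daily_new.append(new_inf)
    return daily_new

def summarize(daily_new):
    return {"peak": max(daily_new), "total": sum(daily_new)}

# Baseline: masks reduce transmission probability by 30% from day 30.
baseline_curve = run_sir(base_prob=0.03, mask_effect=0.30, mask_day=30)
# Counterfactual: identical in every respect, except masks have no effect.
no_mask_curve = run_sir(base_prob=0.03, mask_effect=0.0, mask_day=30)

baseline = summarize(baseline_curve)
no_masks = summarize(no_mask_curve)
print("peak ratio (no masks / baseline):", no_masks["peak"] / baseline["peak"])
```

Holding everything else fixed is what makes the comparison a counterfactual: the two curves are identical until the mask date and diverge only because of the one changed parameter.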
Because we model mobility, transmission, and infection at the state level, we can investigate where infections originating in Chelsea travel, as well as where infections arriving in Chelsea come from. Below is a map of how the virus from people infected in Chelsea travelled outside Chelsea as inhabitants travel for work, school, business, leisure, etc. Note how far and wide these exposures travel.
Each red zone is made up of census block groups. The total population of the red areas below is more than 529,000! And this is just for primary exposure, not the secondary exposures that happen as the second infected person infects others. Infection in Chelsea is not just a Chelsea problem!
Similarly, we analyzed the network of who infected whom: e.g. if Chelsea person A infects non-Chelsea persons B and C, and B and C go on to infect many other people, we say that B, C, and the others infected by B and C are connected to A. This is perhaps a more valuable number than R0, as it allows us to understand the disease beyond just primary exposure. We find that the average number of infections that were started by people in Chelsea is 15. Note that this number is not particularly higher than in other cities; we only aim to illustrate the exponential effect of COVID-19 infection in general.
Please note the distinction between “exposure” (e.g. I have the virus and I just sneezed next to you) and “infection” (e.g. the sneeze may not infect you because you have immunity, etc.). Please also note that this map only shows primary exposure: e.g. person A from Chelsea exposed some other person B at a Dunkin Donuts on the Cape. If B then goes on to infect other people, these are secondary exposures, which are not shown on the map.
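Counting everyone connected to a given person in this who-infected-whom network amounts to counting that person's descendants in a transmission tree. A minimal sketch follows, assuming the simulation logs each transmission as an (infector, infectee) pair. The log format, function names, and example data are hypothetical.

```python
from collections import defaultdict

def build_children(transmissions):
    """Map each infector to the list of people they directly infected."""
    children = defaultdict(list)
    for infector, infectee in transmissions:
        children[infector].append(infectee)
    return children

def count_descendants(children, root):
    """Count everyone infected directly or indirectly downstream of root."""
    seen = set()
    stack = list(children.get(root, []))
    while stack:
        person = stack.pop()
        if person in seen:
            continue
        seen.add(person)
        stack.extend(children.get(person, []))
    return len(seen)

# Made-up example: A infects B and C; B infects D and E; C infects F.
transmissions = [("A", "B"), ("A", "C"), ("B", "D"), ("B", "E"), ("C", "F")]
children = build_children(transmissions)

chelsea_residents = ["A", "G"]  # G was infected but infected nobody
counts = [count_descendants(children, p) for p in chelsea_residents]
print(counts, sum(counts) / len(counts))
```

In this example A has two direct infections but five connected infections in total, which is the difference between counting primary transmission only and counting the whole downstream chain.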
Similarly, we can look at how infected people are bringing infection into Chelsea. As expected, the locations from which infection arrives are more concentrated, because more people travel from Chelsea to outside (as shown earlier, people from Chelsea work all over the state) than travel to Chelsea from outside.