Early Stage Prediction of US County Vulnerability to the COVID-19 Pandemic
Mihir Mehta.
Juxihong Julaiti.
Paul Griffin.
Soundar Kumara.
Open Access
Key Points
Question: What are the key factors that define the vulnerability of US counties to cases of COVID-19?
Findings: In this epidemiological study based on publicly available data, we develop a model that predicts each US county's vulnerability to COVID-19, both as the likelihood of going from no documented cases to at least one case within five days and as the number of occurrences of the virus.
Meaning: Predicting county vulnerability to COVID-19 can help health organizations plan better for resource and workforce needs.
Abstract
Importance: The rapid spread of COVID-19 means that governments and health service providers have little time to plan and design effective response policies. It is therefore important to rapidly provide accurate predictions of how vulnerable geographic regions such as counties are to the spread.
Objective: To develop county-level predictions of near-future disease movement for COVID-19 occurrences using publicly available data.
Design: Original Investigation; Decision Analytical Model Study for county-level COVID-19 occurrences using data from March 14-31, 2020.
Setting: Disease spread prediction for US counties.
Participants: All US counties, at county-level granularity, based on data fused from multiple publicly available sources, including health statistics, demographics, and geographical features.
Exposure(s) (for observational studies): Daily county-level reported COVID-19 occurrences from March 14-31, 2020.
Main Outcome(s) and Measure(s): We developed a 3-stage model. The first stage quantifies the probability of COVID-19 occurrence for unaffected counties using an XGBoost classifier. The second stage estimates the number of potential occurrences in a county via XGBoost regression. The third stage combines these results to compute the county-level risk. This risk is then used as an estimate of the county's vulnerability five days ahead.
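The third stage described above can be illustrated with a minimal sketch. The abstract does not state how the classifier and regression outputs are combined, so the product used here (a probability-weighted case count) is an assumption, not the authors' formula. The names `CountyPrediction`, `county_risk`, and `rank_by_risk` are hypothetical, the county identifiers and numbers are made up, and no XGBoost model is fit: the stage 1 and stage 2 outputs are supplied directly as stand-ins.

```python
from dataclasses import dataclass


@dataclass
class CountyPrediction:
    fips: str              # county identifier (hypothetical field name)
    p_occurrence: float    # stand-in for the stage 1 classifier probability
    expected_cases: float  # stand-in for the stage 2 regression output


def county_risk(pred: CountyPrediction) -> float:
    """Stage 3 under an ASSUMED rule: probability times predicted cases."""
    if not 0.0 <= pred.p_occurrence <= 1.0:
        raise ValueError("p_occurrence must lie in [0, 1]")
    # A regression model can return a negative count; clip it to zero.
    return pred.p_occurrence * max(pred.expected_cases, 0.0)


def rank_by_risk(preds):
    """Return the predictions sorted from highest to lowest risk."""
    return sorted(preds, key=county_risk, reverse=True)


# Made-up stage outputs for three hypothetical counties.
example = [
    CountyPrediction("00001", 0.20, 40.0),
    CountyPrediction("00002", 0.90, 12.0),
    CountyPrediction("00003", 0.05, 3.0),
]
ranked = rank_by_risk(example)
```

Under this assumed rule, a county with a high predicted case count but a low probability of a first case can rank below a county with a smaller count and a near-certain first case. Other combination rules are equally plausible given the abstract alone.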
Results: Using data from March 14-31, 2020, the model shows a sensitivity over 71.5% and a specificity over 94%.
Conclusions and Relevance: We found that population, population density, percentage of people aged 70 or older, and prevalence of comorbidities play an important role in predicting COVID-19 occurrences. We found a positive association between affected counties and urban counties, and between less vulnerable counties and rural counties. The developed model can be used to identify vulnerable counties and potential data discrepancies. Limited testing facilities and delayed results introduce significant variation in reported cases and produce a bias in the model.
Trial Registration: Not Applicable
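For readers unfamiliar with the two metrics reported in the Results, the sketch below shows how sensitivity and specificity are computed from binary labels, where 1 marks a county with at least one case and 0 marks a county with none. The function name `sensitivity_specificity` is hypothetical and the labels are toy values chosen for illustration. They are not the study's data and do not reproduce its reported figures.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary 0/1 labels."""
    tp = fn = tn = fp = 0
    for truth, guess in zip(y_true, y_pred):
        if truth == 1 and guess == 1:
            tp += 1  # affected county correctly flagged
        elif truth == 1 and guess == 0:
            fn += 1  # affected county missed
        elif truth == 0 and guess == 0:
            tn += 1  # unaffected county correctly cleared
        else:
            fp += 1  # unaffected county wrongly flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    sensitivity = tp / (tp + fn)  # share of affected counties detected
    specificity = tn / (tn + fp)  # share of unaffected counties cleared
    return sensitivity, specificity


# Toy labels for ten hypothetical counties (not the paper's data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

With these toy labels, three of the four affected counties are detected and five of the six unaffected counties are cleared.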
Appears in Collections: Scientific Articles (Artículos científicos)

Upload archives

File         Size       Format
1104035.pdf  818.95 kB  Adobe PDF