Please use this identifier to cite or link to this item:
http://conacyt.repositorioinstitucional.mx/jspui/handle/1000/8700
Scrutinising the COVID-19 data on 590.000 cases. A retrospective, population-based descriptive study for data quality surveillance and a review at 4.540.000 cases
Oriol Gallemí i Rovira
Open Access
Attribution-NonCommercial-NoDerivatives
https://doi.org/10.1101/2020.05.26.20113316
https://www.medrxiv.org/content/10.1101/2020.05.26.20113316v1
Background Reports of detected COVID-19-positive patients are, as of today, the best available estimate of how far the pandemic has spread in a country. To evaluate early indicators of true lethality and recovery time, the data on which a model is built must be quality-checked. Each country sets different procedures and criteria for counting COVID-19 fatalities, and health systems are stressed by insufficient testing, untracked patients and premature discharge. This paper discusses the dynamics behind such data quality issues throughout the course of the disease, to support better modeling and decision-making in a stressed healthcare system.
Methods Based on data compiled and relayed by Johns Hopkins University, which tracked over 590,000 COVID-19 patients as of March 27th, 2020, the data are clustered and compared using discrete regression. Regression parameters are restricted to a time step of 1 day and must be clinically meaningful (e.g. a fatality cannot occur before the patient displays symptoms). Cumulative infection curves are both taken as reported and built synthetically. The infection baseline is the country's official declaration. Synthetic infection curves are built from the fatality count and the recovered patient count. The adjusted parameters are τ = time to fatality (days), δ = time to discharge of recovered patients (days) and φ = case fatality rate (CFR, per unit, P.U.). The discharge rate (recovery rate) is therefore forced to be (1-φ).
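The construction of the synthetic infection curves can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: it assumes that a synthetic cumulative infection curve is obtained by shifting the cumulative fatality count back by τ days and dividing by φ, and by shifting the cumulative recovered count back by δ days and dividing by (1-φ). The function names and the toy series are hypothetical.

```python
from typing import List, Optional


def synthetic_infections(
    cumulative: List[float], delay_days: int, rate: float
) -> List[Optional[float]]:
    """Back-shift a cumulative outcome series by delay_days and scale by 1/rate.

    Entry t estimates the cumulative detected cases on day t implied by the
    outcomes observed on day t + delay_days. Days whose outcome would lie
    beyond the end of the series are returned as None.
    """
    if delay_days < 0:
        raise ValueError("delay_days must be non-negative")
    if not 0.0 < rate <= 1.0:
        raise ValueError("rate must be in (0, 1]")
    n = len(cumulative)
    return [
        cumulative[t + delay_days] / rate if t + delay_days < n else None
        for t in range(n)
    ]


def synthetic_from_fatalities(
    fatalities: List[float], tau: int, phi: float
) -> List[Optional[float]]:
    """Synthetic infections implied by fatalities (assumed form: F(t + tau) / phi)."""
    return synthetic_infections(fatalities, tau, phi)


def synthetic_from_recoveries(
    recovered: List[float], delta: int, phi: float
) -> List[Optional[float]]:
    """Synthetic infections implied by recoveries (assumed form: R(t + delta) / (1 - phi))."""
    return synthetic_infections(recovered, delta, 1.0 - phi)


if __name__ == "__main__":
    # Toy cumulative series, made up for illustration only.
    toy_fatalities = [0, 0, 1, 2, 4, 8]
    toy_recovered = [0, 0, 0, 3, 9, 18]
    print(synthetic_from_fatalities(toy_fatalities, tau=2, phi=0.05))
    print(synthetic_from_recoveries(toy_recovered, delta=3, phi=0.05))
```

Comparing both synthetic curves with the officially declared cumulative infections is one way to spot the kind of inconsistency the study is concerned with: if the data are coherent, the three curves should roughly overlap for plausible values of τ, δ and φ.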
Using forward or backward formulas has no influence other than on the time reference. In both cases, the time from onset and symptoms is neglected and must be added if those dates are to be plotted. There is a gap of two weeks from exposure, through hospital admission, to detection, and the earlier the diagnosis is made, the better the outcome.
Cumulative figures are used to smooth out deviations and to provide the best possible estimator at the present time. The delay factor makes it possible to compare figures belonging to the same date of detection.
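As an illustration of how a one-day-step delay and a rate might be adjusted on cumulative figures, the sketch below runs a plain grid search. It is an assumption-laden example rather than the study's procedure: the least-squares objective, the candidate rate grid and the name `fit_delay_and_rate` are all hypothetical, and the toy series is made up.

```python
from typing import List, Sequence, Tuple


def fit_delay_and_rate(
    confirmed: List[float],
    outcomes: List[float],
    delays: Sequence[int],
    rates: Sequence[float],
) -> Tuple[int, float, float]:
    """Grid-search an integer delay (days) and a rate.

    Looks for the pair (delay, rate) such that confirmed[t] * rate best
    matches outcomes[t + delay] in the least-squares sense, so that both
    figures refer to the same date of detection.
    Returns (best_delay, best_rate, mean_squared_error).
    """
    if len(confirmed) != len(outcomes):
        raise ValueError("series must have the same length")
    n = len(confirmed)
    best = None
    for d in delays:
        if d < 0:
            raise ValueError("delays must be non-negative")
        # Pair each detection day with the outcome observed d days later.
        pairs = [(confirmed[t], outcomes[t + d]) for t in range(n - d)]
        if not pairs:
            continue
        for r in rates:
            mse = sum((c * r - o) ** 2 for c, o in pairs) / len(pairs)
            if best is None or mse < best[2]:
                best = (d, r, mse)
    if best is None:
        raise ValueError("no delay leaves overlapping data points")
    return best


if __name__ == "__main__":
    # Toy cumulative series: outcomes lag detections by 2 days at a rate of 0.25.
    toy_confirmed = [10, 30, 60, 100, 150, 210]
    toy_outcomes = [0, 0, 2.5, 7.5, 15, 25]
    print(fit_delay_and_rate(toy_confirmed, toy_outcomes, range(0, 4), [0.125, 0.25, 0.5]))
```

Note that on a purely exponential series a longer delay can be traded exactly against a larger rate, so the two parameters are not separately identifiable from such data; the toy series above is deliberately non-exponential.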
Fast, daily models that could be integrated into a filtering stage on the parameter estimator, as part of a more complex approach, are out of scope. Continuous models can also be used, but interpolation between data points is another source of noise to consider, especially when counting methods change suddenly, as is the case with COVID-19. Countries were grouped as found representative, for the purpose of illustrating the methodology. Results are discussed and compared across the groups, and potential indicators of this behavior are drawn for further study.
Findings Across the 593,291 cases in the sample and its 7 representative groups, recovery time and local CFR are negatively correlated: the countries with the highest fatality rates (21%, Spain) have the shortest recovery times (11 days, Spain). CFR can also be an indicator of infection inconsistencies (e.g. South Korea: CFR 1%, time to recovery 25 days). The review part focuses on the inconsistencies detected in the Germany and South Korea datasets, as well as potential misfits for China and Spain. Overall, the time to fatality ranges between 4 and 8 days, with a mean of 6 days (South Korea, 7 days; Japan, 6 days). Only Germany and France detect cases earlier than other countries, admitting patients 10 days before fatality occurs. To date, shortening hospital discharge times seems to lead to patient reinfections (COVID-19 positive), and studies are under way along this line.
Interpretation One simple explanation for the correlation between local CFR and recovery time is to treat this rate as a measure of healthcare system overload. Anomalous CFR values point to a stressed healthcare system: the higher the overload, the greater the focus on critical cases and hence the higher the local CFR. The intrinsic CFR of COVID-19 is unlikely to vary by a factor of 10 between countries with similar lifestyle, GDP per capita and health services (e.g. the Mediterranean Basin, Northern Europe, etc.).
Because of this, early CFR values measured before the healthcare system is overwhelmed (COVID-19 free flow) are considered more accurate than CFR measured while the outbreak is still ongoing. Finally, the synthetic infection indexes may be a helpful indirect measure of the real population infection rate and can also be used for data quality audits. Any model built upon inconsistent data will be complex to explain and justify.
bioRxiv
27-05-2020
Preprint
English
General public
RESPIRATORY VIRUSES
Appears in collections: Materiales de Consulta y Comunicados Técnicos