Please use this identifier to cite or link to this item:
http://conacyt.repositorioinstitucional.mx/jspui/handle/1000/8152
Towards Pandemic-Scale Ancestral Recombination Graphs of SARS-CoV-2
Shing Zhan, Anastasia Ignatieva, Yan Wong, Katherine Eaton, Benjamin Jeffery, Duncan Palmer, Carmen Lia Murall, Sarah Otto, Jerome Kelleher
Open Access
Attribution-NonCommercial-NoDerivatives
https://doi.org/10.1101/2023.06.08.544212
https://www.biorxiv.org/content/10.1101/2023.06.08.544212v1
Recombination is an ongoing and increasingly important feature of circulating lineages of SARS-CoV-2, challenging how we represent the evolutionary history of this virus and giving rise to new variants of potential public health concern by combining transmission and immune evasion properties of different lineages. Detection of new recombinant strains is challenging, with most methods looking for breaks between sets of mutations that characterise distinct lineages. In addition, many basic approaches fundamental to the study of viral evolution assume that recombination is negligible, in that a single phylogenetic tree can represent the genetic ancestry of the circulating strains. Here we present an initial version of sc2ts, a method to automatically detect recombinants in real time and to cohesively integrate them into a genealogy in the form of an ancestral recombination graph (ARG), which jointly records mutation, recombination and genetic inheritance. We infer two ARGs under different sampling strategies, and study their properties. One contains 1.27 million sequences sampled up to June 30, 2021, and the second is more sparsely sampled, consisting of 657K sequences sampled up to June 30, 2022. We find that both ARGs are highly consistent with known features of SARS-CoV-2 evolution, recovering the basic backbone phylogeny, mutational spectra, and recapitulating details on the majority of known recombinant lineages. Using the well-established and feature-rich tskit library, the ARGs can also be stored concisely and processed efficiently using standard Python tools. For example, the ARG for 1.27 million sequences—encoding the inferred reticulate ancestry, genetic variation, and extensive metadata—requires 58MB of storage, and loads in less than a second. 
The ability to fully integrate the effects of recombination into downstream analyses, to quickly and automatically detect new recombinants, and to utilise an efficient and convenient platform for computation based on well-engineered technologies makes sc2ts a promising approach.
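The abstract notes that most existing detection methods look for breaks between sets of mutations that characterise distinct lineages. A toy sketch of that general idea follows. It is not the sc2ts algorithm, and every name, site position, and allele in it is hypothetical, chosen only for illustration. It scans all single breakpoints and keeps the one where copying the left part from one lineage and the right part from another gives the fewest mismatches.

```python
from typing import Dict, Tuple

# Hypothetical alleles at lineage-defining sites (position -> base).
# These positions and bases are made up for illustration only.
LINEAGE_A: Dict[int, str] = {100: "T", 200: "G", 300: "A", 400: "C", 500: "C", 600: "T"}
LINEAGE_B: Dict[int, str] = {100: "C", 200: "A", 300: "G", 400: "T", 500: "T", 600: "C"}


def best_single_breakpoint(
    sample: Dict[int, str],
    left_parent: Dict[int, str],
    right_parent: Dict[int, str],
) -> Tuple[int, int]:
    """Return (n_left_sites, mismatches) for the best single-break mosaic.

    The first n_left_sites sites (in genome order) are compared with
    left_parent and the remaining sites with right_parent. Ties are
    resolved in favour of the smallest n_left_sites.
    """
    positions = sorted(sample)
    best_split, best_cost = 0, None
    for split in range(len(positions) + 1):
        # Count disagreements with each parent on its side of the break.
        cost = sum(sample[p] != left_parent[p] for p in positions[:split])
        cost += sum(sample[p] != right_parent[p] for p in positions[split:])
        if best_cost is None or cost < best_cost:
            best_split, best_cost = split, cost
    return best_split, best_cost


# A hypothetical sample matching lineage A on its first three sites
# and lineage B on its last three.
mosaic = {100: "T", 200: "G", 300: "A", 400: "T", 500: "T", 600: "C"}
print(best_single_breakpoint(mosaic, LINEAGE_A, LINEAGE_B))
```

A split at either end (0 or all sites) with zero mismatches means the sample is explained by one lineage alone, so only an interior split suggests a recombinant under this toy model.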
bioRxiv
2023-06-08
Preprint
English
General public
RESPIRATORY VIRUSES
Appears in collections: | Materiales de Consulta y Comunicados Técnicos |
Download files:
File | Size | Format | |
---|---|---|---|
Towards Pandemic-Scale Ancestral Recombination Graphs of SARS-CoV-2.pdf | 1.95 MB | Adobe PDF | View/Open |