Please use this identifier to cite or link to this item: http://conacyt.repositorioinstitucional.mx/jspui/handle/1000/8105
Clinical Phenotype Prediction From Single-cell RNA-seq Data using Attention-Based Neural Networks
Yuzhen Mao
Yen-Yi Lin
Nelson Wong
Stanislav Volik
Funda SAR
Colin C Collins
Martin Ester
Open Access
Attribution-NonCommercial-NoDerivatives
https://doi.org/10.1101/2023.03.31.532253
https://www.biorxiv.org/content/10.1101/2023.03.31.532253v1
Abstract
Motivation: A patient’s disease phenotype can be driven and determined by specific groups of cells whose marker genes are either unknown or can only be detected at a late stage using conventional bulk assays such as RNA-seq. Recent advances in single-cell RNA sequencing (scRNA-seq) enable gene expression profiling at single-cell resolution, and therefore have the potential to identify the cells driving the disease phenotype even when the number of these cells is small. However, most existing methods rely heavily on accurate cell type detection, and the number of available annotated samples is usually too small for training deep learning predictive models.
Results: Here we propose ScRAT, a method for clinical phenotype prediction using scRNA-seq data. To train with a limited number of samples of different phenotypes, such as COVID and non-COVID, ScRAT first applies a mixup module to increase the number of training samples. A multi-head attention mechanism is then employed to learn the most informative cells for each phenotype without relying on a given cell type annotation. Using three public COVID datasets, we show that ScRAT outperforms other phenotype prediction methods. The performance edge of ScRAT over its competitors increases as the number of training samples decreases, indicating the efficacy of our sample mixup. Critical cell types detected based on high-attention cells also support novel findings in the original papers and the recent literature. This suggests that ScRAT overcomes the challenges of missing marker genes and limited sample numbers, with great potential for revealing novel molecular mechanisms and/or therapies.
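The two ideas named in the abstract, sample mixup and attention over cells, can be illustrated with a minimal sketch. This is not the ScRAT implementation: it assumes the generic mixup recipe (a Beta-distributed mixing coefficient applied to paired cells and to one-hot labels) and uses a single attention head with one query vector instead of the multi-head mechanism the abstract describes. The function names `mixup` and `attention_pool`, the `alpha` parameter, and the toy expression values are all hypothetical.

```python
import math
import random


def mixup(sample_a, sample_b, label_a, label_b, alpha=0.5, rng=random):
    """Blend two samples (equal-length lists of cell vectors) and their one-hot labels.

    Assumption: generic mixup, not necessarily how ScRAT pairs or samples cells.
    """
    lam = rng.betavariate(alpha, alpha)  # mixing coefficient in [0, 1]
    mixed_cells = [
        [lam * x + (1 - lam) * y for x, y in zip(cell_a, cell_b)]
        for cell_a, cell_b in zip(sample_a, sample_b)
    ]
    mixed_label = [lam * p + (1 - lam) * q for p, q in zip(label_a, label_b)]
    return mixed_cells, mixed_label, lam


def attention_pool(cells, query):
    """Single-head scaled dot-product attention of one query vector over cells.

    Returns the attention-weighted average cell and the per-cell weights;
    cells with larger weights are the ones the query finds most informative.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * x for q, x in zip(query, cell)) / scale for cell in cells]
    top = max(scores)  # subtract the maximum for a numerically stable softmax
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = [
        sum(w * cell[j] for w, cell in zip(weights, cells))
        for j in range(len(query))
    ]
    return pooled, weights


# Toy data: two cells per sample, two genes per cell (hypothetical values).
covid = [[1.0, 0.0], [0.8, 0.2]]
healthy = [[0.0, 1.0], [0.1, 0.9]]
cells, label, lam = mixup(covid, healthy, [1.0, 0.0], [0.0, 1.0], rng=random.Random(0))
pooled, weights = attention_pool(cells, query=[1.0, -1.0])
```

In a trained model the query (or the full set of attention parameters) would be learned, and the per-cell weights would be inspected afterwards to find high-attention cells; here the query is fixed by hand only to show the mechanics.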
bioRxiv
02-04-2023
Preprint
English
General public
RESPIRATORY VIRUSES
Appears in collections: Materiales de Consulta y Comunicados Técnicos
