Please use this identifier to cite or link to this item: http://conacyt.repositorioinstitucional.mx/jspui/handle/1000/4313
Comprehensive Named Entity Recognition on CORD-19 with Distant or Weak Supervision
Xuan Wang.
Xiangchen Song.
Bangzheng Li.
Yingjun Guan.
Jiawei Han.
Open Access
Attribution-NonCommercial-NoDerivatives
https://arxiv.org/pdf/2003.12218v5.pdf
We created this CORD-NER dataset with comprehensive named entity recognition (NER) on the COVID-19 Open Research Dataset Challenge (CORD-19) corpus (2020-03-13). This CORD-NER dataset covers 75 fine-grained entity types: in addition to the common biomedical entity types (e.g., genes, chemicals, and diseases), it covers many new entity types explicitly related to COVID-19 studies (e.g., coronaviruses, viral proteins, evolution, materials, substrates, and immune responses), which may benefit research on COVID-19-related viruses, spreading mechanisms, and potential vaccines. The CORD-NER annotation combines four sources that use different NER methods. Its quality surpasses that of SciSpacy, a fully supervised BioNER tool (over 10% higher F1 score on a sample set of documents). Moreover, CORD-NER supports incrementally adding new documents, as well as adding new entity types when needed by supplying dozens of seeds as input examples. We will continue to update CORD-NER as the CORD-19 corpus is incrementally updated and our system improves.
arxiv.org
2020
Article
English
RESPIRATORY VIRUSES
Appears in collections: Scientific articles
