Source author record

Yingjun Guan

Yingjun Guan appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Information Retrieval, Artificial Intelligence, Computation and Language

Catalog footprint

What is connected

2 works

3 topics

4 close collaborators

Preprint · 2020 · arXiv

Automatic Textual Evidence Mining in COVID-19 Literature

We created the EVIDENCEMINER system for automatic textual evidence mining in COVID-19 literature. EVIDENCEMINER is a web-based system that lets users query with a natural language statement and automatically retrieves textual evidence from a background corpus for life sciences. It is constructed in a completely automated way, without any human effort for training data annotation. EVIDENCEMINER is supported by novel data-driven methods for distantly supervised named entity recognition and open information extraction. The named entities and meta-patterns are pre-computed and indexed offline to support fast online evidence retrieval. The annotation results are also highlighted in the original document for better visualization. EVIDENCEMINER also includes analytic functionalities such as summarization of the most frequent entities and relations.

Preprint · 2020 · arXiv

Comprehensive Named Entity Recognition on CORD-19 with Distant or Weak Supervision

We created the CORD-NER dataset with comprehensive named entity recognition (NER) on the COVID-19 Open Research Dataset Challenge (CORD-19) corpus (2020-03-13). The CORD-NER dataset covers 75 fine-grained entity types: in addition to the common biomedical entity types (e.g., genes, chemicals and diseases), it covers many new entity types related explicitly to COVID-19 studies (e.g., coronaviruses, viral proteins, evolution, materials, substrates and immune responses), which may benefit research on COVID-19-related viruses, spreading mechanisms, and potential vaccines. CORD-NER annotation is a combination of four sources with different NER methods. The quality of CORD-NER annotation surpasses that of SciSpacy, a fully supervised BioNER tool (over 10% higher F1 score on a sample set of documents). Moreover, CORD-NER supports incrementally adding new documents, as well as adding new entity types when needed by supplying dozens of seeds as input examples. We will continually update CORD-NER based on the incremental updates of the CORD-19 corpus and the improvement of our system.
