Article, 2023

S1000: a better taxonomic name corpus for biomedical information extraction

Bioinformatics, ISSN 1367-4803, 1367-4811, Volume 39, Issue 6, DOI 10.1093/bioinformatics/btad369

Contributors

Luoma J. [1]
Nastou K.C. (ORCID 0000-0003-3611-5726) [2]
Ohta T.
Toivonen H. [1]
Pafilis E. [3]
Jensen L.J. (ORCID 0000-0001-7885-715X) (Corresponding author) [2]
Pyysalo S. (Corresponding author) [1]

Affiliations

[1] University of Turku

[2] University of Copenhagen

[3] Hellenic Centre for Marine Research

Abstract

Motivation: The recognition of mentions of species names in text is a critically important task for biomedical text mining. While deep learning-based methods have made great advances in many named entity recognition tasks, results for species name recognition remain poor. We hypothesize that this is primarily due to the lack of appropriate corpora.

Results: We introduce the S1000 corpus, a comprehensive manual re-annotation and extension of the S800 corpus. We demonstrate that S1000 makes highly accurate recognition of species names possible (F-score = 93.1%), both for deep learning and dictionary-based methods.

Funders

Horizon 2020 Framework Programme
Novo Nordisk Fonden
H2020 Marie Skłodowska-Curie Actions
Suomen Akatemia

Data Provider: Elsevier