BioBERT: a pre-trained biomedical language representation model for biomedical text mining

Keywords: Biomedical text mining, Named Entity Recognition, Relationship extraction, Text corpus, Representation, F1 score
DOI: 10.1093/bioinformatics/btz682 | Publication date: 2019-09-05
ABSTRACT
Biomedical text mining is becoming increasingly important as the number of biomedical documents rapidly grows. With the progress in natural language processing (NLP), extracting valuable information from biomedical literature has gained popularity among researchers, and deep learning has boosted the development of effective biomedical text mining models. However, directly applying the advancements in NLP to biomedical text mining often yields unsatisfactory results due to a word distribution shift from general domain corpora to biomedical corpora. In this article, we investigate how the recently introduced pre-trained language model BERT can be adapted for biomedical corpora. We introduce BioBERT (Bidirectional Encoder Representations from Transformers for Biomedical Text Mining), which is a domain-specific language representation model pre-trained on large-scale biomedical corpora. With almost the same architecture across tasks, BioBERT largely outperforms BERT and previous state-of-the-art models in a variety of biomedical text mining tasks when pre-trained on biomedical corpora. While BERT obtains performance comparable to that of previous state-of-the-art models, BioBERT significantly outperforms them on the following three representative biomedical text mining tasks: biomedical named entity recognition (0.62% F1 score improvement), biomedical relation extraction (2.80% F1 score improvement) and biomedical question answering (12.24% MRR improvement). Our analysis results show that pre-training BERT on biomedical corpora helps it to understand complex biomedical texts. We make the pre-trained weights of BioBERT freely available at https://github.com/naver/biobert-pretrained, and the source code for fine-tuning BioBERT at https://github.com/dmis-lab/biobert.
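
The abstract notes that BioBERT keeps almost the same architecture as BERT across tasks and only swaps in a task-specific output layer during fine-tuning. As a minimal sketch of that idea (not code from the paper, which distributes TensorFlow weights via the GitHub links above), the snippet below loads a BioBERT checkpoint for token classification, i.e. biomedical NER. The Hugging Face Transformers library and the checkpoint id "dmis-lab/biobert-base-cased-v1.1" are assumptions, not part of the original release described here.

# Minimal sketch: fine-tuning setup for biomedical NER with a BioBERT checkpoint.
# Assumptions: Hugging Face Transformers + PyTorch are installed, and the
# community-hosted checkpoint "dmis-lab/biobert-base-cased-v1.1" is used in
# place of the original TensorFlow weights from github.com/naver/biobert-pretrained.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

MODEL_NAME = "dmis-lab/biobert-base-cased-v1.1"  # assumed checkpoint id

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
# num_labels depends on the NER tag set of the target corpus (e.g. B/I/O tags);
# 3 is just an illustrative value.
model = AutoModelForTokenClassification.from_pretrained(MODEL_NAME, num_labels=3)

text = "Mutations in the BRCA1 gene increase the risk of breast cancer."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits   # shape: (1, sequence_length, num_labels)
predicted_tags = logits.argmax(dim=-1)  # one predicted tag id per subword token
print(predicted_tags)

The untrained classification head here would of course need task-specific fine-tuning on an annotated NER corpus before the predictions are meaningful; the point is only that the BERT encoder itself is reused unchanged across tasks.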