
Table 2 Overview of pre-trained biomedical language models

From: Transformer models in biomedicine

| Study | Data sources | Model architecture | Biomedical tasks |
| --- | --- | --- | --- |
| BioBERT [3] | PubMed, PMC | BERT | Biomedical named entity recognition (NER), relation extraction (RE), and question answering (QA) |
| PubMedBERT [21] | PubMed | BERT | Biomedical NER, RE, QA, evidence-based medicine information extraction, document classification, and sentence similarity |
| BioMegatron [22] | PubMed | Megatron | Biomedical NER, RE, and QA |
| BioELECTRA [23] | PubMed | ELECTRA | Biomedical NER, RE, QA, evidence-based medicine information extraction, document classification, medical natural language inference, and sentence similarity |
| BioALBERT [24] | PubMed, PMC | ALBERT | Biomedical NER, RE, QA, evidence-based medicine information extraction, document classification, medical natural language inference, and sentence similarity |
| BioMed-RoBERTa [25] | PubMed, ChemProt | RoBERTa | Chemical relation classification |
| BioGPT [26] | PubMed | GPT-2 | Biomedical RE, QA, document classification, and text generation |
| ClinicalBERT [27, 28] | MIMIC-III, i2b2 datasets | BERT | Identification of clinical entities, natural language inferencing |
| ClinicalXLNet [29] | MIMIC-III | XLNet | Identifying patient reports with prolonged mechanical ventilation and 90-day mortality |
| RoBERTa-MIMIC, ALBERT-MIMIC [30] | MIMIC-III, i2b2 datasets | RoBERTa, ALBERT | Identification of clinical entities |
| Clinical-Longformer, Clinical-BigBird [31] | MIMIC-III | Longformer, BigBird | Document classification, question answering, named entity recognition, natural language inference |
| GatorTron [32] | University of Florida Health, MIMIC-III, PubMed, Wikipedia | BERT, BioMegatron | Clinical concept extraction, medical relation extraction, semantic textual similarity, medical natural language inference, medical QA |
| Bioreddit-BERT [33] | Reddit health-related articles | BERT | Biomedical named entity recognition, adverse reaction mention detection |
| Bio-GottBERT [18] | German medical text | BERT | Identification of procedures, diagnoses, and medications |
| CamemBERT-bio [19] | French clinical documents | RoBERTa | Detection of clinical entities |
| KM-BERT [20] | Korean medical literature | BERT | Identification of diseases and treatment entities |