Table 1 Recent advances in state-of-the-art NER applied to EHRs

From: Hierarchical embedding attention for overall survival prediction in lung cancer from unstructured EHRs

| Model | Ref. | NER Dataset | Application | Entities | NER Performance | Entity Usage |
|---|---|---|---|---|---|---|
| MC-BERT + BiLSTM + CNN + MHA + CRF | [4] | CCKS17 [5], CCKS19 [6], cEHRNER [7] | NER in clinical notes | 9 entities: Body, Treatment, Signs, Check, Disease, Lab, Medicine, Operation, Symptom | F1: 94.2%, 86.5%, 92.3% on CCKS17, CCKS19, cEHRNER | None |
| BiLSTM-CNN-Char | [8] | 2010 i2b2/VA [9], 2014 n2c2 [10], 2018 n2c2 [11] | NER in clinical notes | 4 entities: Medical Problem, Treatment, Test, Drug | F1: 87.6%, 96.1%, 89.9% on i2b2/VA, 2014 n2c2, 2018 n2c2 | None |
| MUSA-BiLSTM-CRF | [12] | CCKS17 [5], CCKS18 [13] | NER in clinical notes | 5 entities: Disease, Symptom, Examination, Treatment, Body part | F1: 92.0%, 91.8% on CCKS17, CCKS18 | None |
| BERT | [14] | 2018 n2c2 [11], 2009 n2c2 [15], 2010 n2c2 [9], 2012 n2c2 [16], ShARe13 [17] | NER in clinical notes | 4 entities: Drugs, Dosages, Reasons, Adverse drug events | F1: 90.0%, 80.9%, 88.4%, 87.5%, 82.6% on 2018 n2c2, 2009 n2c2, 2010 n2c2, 2012 n2c2, ShARe13 | None |
| BERT-BiLSTM-CRF | [18] | ShARe13 [17], ShARe14 [19] | NER in clinical notes | 1 entity: Disorder | F1: 79.9%, 80.7% on ShARe13, ShARe14 | None |
| BERT | [20] | i2b2-2010 [9], VietBioNER [21] | NER in clinical notes | 3 entities: Medical Problem, Treatment, Tests | F1: 87.7%, 80.9% on i2b2-2010, VietBioNER | None |
| CancerBERT | [22] | Proprietary dataset (EHRs) | Breast cancer phenotypes | 8 entities: Hormone receptor type, Hormone receptor status, Tumor size, Tumor site, Cancer grade, Histological type, Tumor laterality, Cancer stage | F1: 87.6% | None |
| scispaCy | [23] | MIMIC-III [24] | NER in clinical notes | 2 entities: Disease, Chemical | None | Mortality prediction |
| med7 | [25] | MIMIC-III [24] | NER in clinical notes | 7 entities: Dosage, Drug, Duration, Form, Frequency, Route, Strength | None | Mortality prediction |
| Rule-based | [26] | CCKS20 [27], gastroscopy text dataset, mixed dataset | Breast cancer phenotypes | 6 entities: Disease, Anatomy, Imaging, Lab, Drug, Operation | F1: 87.9%, 99.8%, 96.2% on CCKS20, gastroscopy text, mixed dataset | None |
| Ensemble of CRF, multilingual Transformers (BERT, XLM RoBERTa) and LSTM | [28] | Proprietary dataset (hospital EHRs) | NER in clinical notes | 11 entities: Clinical Dept, Date, Duration, Evidential, Frequency, Occurrence, Problem, Test, Time, Treatment, Value | F1: 89.2% | None |
| RoBERTa | [29] | Proprietary dataset (hospital EHRs) | Breast cancer information | 23 entities related to Breast Cancer domain | F1: 95.0% | None |