
Table 5 Results of KoBERT-NER by training level

From: De-identification of clinical notes with pseudo-labeling using regular expression rules and pre-trained BERT

| Training level | Base KoBERT-NER (0) | | Small (8,000) | | Medium (16,000) | | Large1 (32,000) | | Large2 (31,884) | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | Precision | Recall | Precision | Recall | Precision | Recall | Precision | Recall | Precision | Recall |
| DAT | 0.49 | 0.43 | 0.87 | 0.74 | 0.81 | 0.79 | 0.99 | 0.98 | 0.98 | 0.99 |
| PER | 0.46 | 0.40 | 0.95 | 0.70 | 0.83 | 0.81 | 0.89 | 0.93 | 0.96 | 0.93 |
| ORG | 0.02 | 0.47 | 0.31 | 0.40 | 0.33 | 0.80 | 0.69 | 0.96 | 0.84 | 0.96 |
| NUM | 0.02 | 0.77 | 0.75 | 0.38 | 1.00 | 0.43 | 1.00 | 0.57 | 1.00 | 0.57 |
| LOC | 0.50 | 0.11 | 0.40 | 0.22 | 0.67 | 0.44 | 0.81 | 0.78 | 0.86 | 0.89 |
| ETC | 0.00 | 0.00 | 0.25 | 0.20 | 1.00 | 0.60 | 1.00 | 1.00 | 1.00 | 1.00 |
| Total | 0.46 | 0.43 | 0.83 | 0.71 | 0.79 | 0.70 | 0.96 | 0.97 | 0.98 | 0.98 |

  1. Precision and recall at each training level. The number in parentheses for each level indicates the number of notes in the train dataset used for fine-tuning. As the number of training notes increased, accuracy increased across all categories.
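The precision and recall values reported per category above are standard NER evaluation metrics. As a minimal sketch of how such per-category scores can be computed — assuming token-level scoring with a hypothetical `"O"` label for non-entity tokens (the paper may instead score at the entity-span level):

```python
from collections import Counter

def category_scores(gold, pred):
    """Per-category precision and recall for token-level NER labels.

    gold, pred: equal-length lists of category labels (e.g. "DAT",
    "PER"), with "O" marking non-entity tokens. This convention is
    illustrative, not necessarily the paper's exact labeling scheme.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if p != "O":
            if p == g:
                tp[p] += 1   # predicted the correct category
            else:
                fp[p] += 1   # predicted this category wrongly
        if g != "O" and p != g:
            fn[g] += 1       # missed a token of this category
    scores = {}
    for cat in set(tp) | set(fp) | set(fn):
        prec = tp[cat] / (tp[cat] + fp[cat]) if tp[cat] + fp[cat] else 0.0
        rec = tp[cat] / (tp[cat] + fn[cat]) if tp[cat] + fn[cat] else 0.0
        scores[cat] = (round(prec, 2), round(rec, 2))
    return scores

# Toy example: one missed PER token, one spurious LOC prediction.
gold = ["DAT", "O", "PER", "PER", "O",   "LOC"]
pred = ["DAT", "O", "PER", "O",   "LOC", "LOC"]
print(category_scores(gold, pred))
```

Here PER recall drops to 0.5 (one of two PER tokens missed) while LOC precision drops to 0.5 (one spurious prediction), mirroring how the table separates the two error types per category.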