
Table 1 Performance of proposed de-identification method on Yellow Card test data

From: Automated redaction of names in adverse event reports using transformer-based neural networks

 

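Performance columns (Precision, Recall, F1, False positive rate) and token counts (NAMES, NON-NAMES) both refer to the YC test data.

| Token length | Precision | Recall | F1 | False positive rate | NAMES tokens | NON-NAMES tokens |
|---|---|---|---|---|---|---|
| All | 55% | 87% | 67% | 0.05% | 179 | 263,272 |
| Long (> 3) | 58% | 94% | 72% | 0.04% | 108 | 162,582 |
| Short (≤ 3) | 50% | 75% | 60% | 0.05% | 71 | 100,690 |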

bold: best score; YC: Yellow Card
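
The F1 column is consistent with the usual definition of F1 as the harmonic mean of precision and recall. The sketch below recomputes it from the rounded precision and recall values in the table. It assumes that standard definition and is not taken from the authors' evaluation code. The function name `f1_score` and the dictionary `table_1` are illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded precision and recall as reported in Table 1, as fractions.
# Keys are the table's row labels (ASCII "<=" stands in for the "≤" sign).
table_1 = {
    "All": (0.55, 0.87),
    "Long (> 3)": (0.58, 0.94),
    "Short (<= 3)": (0.50, 0.75),
}

# Recompute F1 per row and round to whole percent, as in the table.
f1_percent = {
    row: round(100 * f1_score(p, r)) for row, (p, r) in table_1.items()
}

for row, value in f1_percent.items():
    print(f"{row}: F1 = {value}%")
```

Because the inputs are already rounded to whole percentages, agreement with the published F1 values is only expected to the nearest percentage point.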