BMC Medical Informatics and Decision Making

Table 2 Performance of the different components of the de-identification method when applied separately to Yellow Card test data

From: Automated redaction of names in adverse event reports using transformer-based neural networks

		Performance in YC test data				Tokens in YC test data	
Token length	Component	Precision	Recall	F1	False positive rate	Names	Non-names
All	BERT	55%	87%	67%	0.05%	179	263,272
All	BERT + rules	26%	88%	40%	0.17%	179	263,272
All	Rules alone	13%	26%	17%	0.12%	179	263,272
Long (> 3)	BERT	58%	94%	72%	0.04%	108	162,582
Long (> 3)	BERT + rules	32%	96%	48%	0.14%	108	162,582
Long (> 3)	Rules alone	20%	31%	24%	0.08%	108	162,582
Short (≤ 3)	BERT	50%	75%	60%	0.05%	71	100,690
Short (≤ 3)	BERT + rules	19%	75%	30%	0.23%	71	100,690
Short (≤ 3)	Rules alone	6%	17%	9%	0.17%	71	100,690

bold: best score, YC: Yellow Card
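
As a quick check on how the F1 column relates to the precision and recall columns, the sketch below recomputes F1 for the "All" rows. It assumes the standard definition of F1 as the harmonic mean of precision and recall, which the table itself does not state. The function name `f1_score` and the `rows` dictionary are illustrative only, and the inputs are the rounded percentages from the table, so agreement should only be expected to within rounding.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing is predicted or found
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) for the "All" token-length rows of Table 2,
# copied from the table as fractions.
rows = {
    "BERT": (0.55, 0.87, 0.67),
    "BERT + rules": (0.26, 0.88, 0.40),
    "Rules alone": (0.13, 0.26, 0.17),
}

for component, (p, r, reported) in rows.items():
    computed = f1_score(p, r)
    print(f"{component}: computed F1 = {computed:.2f}, reported F1 = {reported:.2f}")
```

The same function can be applied to the Long and Short rows. Under this assumed definition, a small difference between computed and reported values would most likely reflect rounding of the published precision and recall.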


ISSN: 1472-6947
