Model | BLEU | CER | WER | METEOR | BERTScore | Evaluators (%) |
---|---|---|---|---|---|---|
mBART-50 | 15.29 | 0.541 | 0.719 | 0.481 | 0.855 | 42 |
M2M-100 (418M) | 17.29 | 0.639 | 0.731 | 0.416 | 0.825 | 34 |
M2M-100 (1.2B) | 21.75 | 0.679 | 0.673 | 0.460 | 0.831 | 42 |
Google Translate | 27.72 | 0.543 | 0.581 | 0.574 | 0.881 | 56 |
mT5 | 25.91 | 0.573 | 0.601 | 0.440 | 0.837 | 34 |
GPT-3 | 20.85 | 0.578 | 0.610 | 0.452 | 0.861 | 54 |
ChatGPT | 24.69 | 0.569 | 0.876 | 0.571 | 0.877 | 72 |
GPT-4 | 35.24 | 0.424 | 0.496 | 0.612 | 0.892 | 72 |
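
In this table, CER and WER are error rates (lower is better), while BLEU, METEOR, and BERTScore are similarity measures (higher is better); the Evaluators column reports a human-evaluation percentage rather than an automatic metric. The snippet below is a minimal sketch of how such automatic metrics could be computed with common open-source packages (sacrebleu, jiwer, nltk, bert-score). The hypothesis and reference strings are hypothetical placeholders, and this is not necessarily the exact pipeline behind the numbers above.

```python
# Sketch of computing the table's automatic metrics, assuming the
# sacrebleu, jiwer, nltk, and bert-score packages are installed.
import sacrebleu                                       # corpus-level BLEU
import jiwer                                           # character/word error rates
from nltk.translate.meteor_score import meteor_score  # needs nltk wordnet data
from bert_score import score as bert_score

# Hypothetical example data: system outputs and one reference per segment.
hypotheses = ["the cat sat on the mat"]
references = ["the cat is sitting on the mat"]

# BLEU: sacrebleu expects a list of reference streams.
bleu = sacrebleu.corpus_bleu(hypotheses, [references]).score

# CER and WER: error rates, so lower is better.
cer = jiwer.cer(references, hypotheses)
wer = jiwer.wer(references, hypotheses)

# METEOR: averaged over segments; recent nltk expects pre-tokenized input.
meteor = sum(
    meteor_score([ref.split()], hyp.split())
    for ref, hyp in zip(references, hypotheses)
) / len(hypotheses)

# BERTScore: report the mean F1 over all segments.
_, _, f1 = bert_score(hypotheses, references, lang="en")
bertscore = f1.mean().item()

print(f"BLEU {bleu:.2f}  CER {cer:.3f}  WER {wer:.3f}  "
      f"METEOR {meteor:.3f}  BERTScore {bertscore:.3f}")
```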