From: Risk-based evaluation of machine learning-based classification methods used for medical devices
General / overarching definitions | |
Number of actual positive cases: \(P = TP + FN\) | Number of actual negative cases: \(N = TN + FP\) |
Number of predicted positive cases: \(PP = TP + FP\) | Number of predicted negative cases: \(PN = TN + FN\) |
Total Population: \(Pop = P + N\) | Prevalence: \(Prev = \frac{P}{P + N} = \frac{P}{Pop}\) |
Metrics documented in the literature review within this study (computed in the first code sketch after the table) | |
Sensitivity / Recall / True Positive Rate: \(TPR = \frac{TP}{P}\) | Specificity / True Negative Rate: \(TNR = \frac{TN}{N}\) |
Accuracy: \(Acc = \frac{TP + TN}{TP + FP + TN + FN}\) with the complementary Error rate: \(Err = 1 - Acc\) | Balanced Accuracy, i.e. accuracy after balancing of positive / negative test samples / class members: \(BA = \frac{TPR + TNR}{2}\) |
Precision / Positive Predictive Value: \(PPV = \frac{TP}{PP}\) | Negative Predictive Value: \(NPV = \frac{TN}{PN}\) |
\(F_1\)-Score: \(F_1 = 2 \cdot \frac{PPV \cdot TPR}{PPV + TPR}\) | other \(F_{\beta}\)-Scores: \(F_{\beta} = \left(1 + \beta^2\right) \cdot \frac{PPV \cdot TPR}{\beta^2 \cdot PPV + TPR}\) |
Matthews Correlation Coefficient: \(MCC = \sqrt{TPR \cdot TNR \cdot PPV \cdot NPV} - \sqrt{\left(1 - TPR\right) \cdot \left(1 - TNR\right) \cdot \left(1 - PPV\right) \cdot \left(1 - NPV\right)}\) | Geometric Mean: \(GM = \sqrt{TPR \cdot TNR}\) |
Measures which consider not a single model (fixed threshold) but multiple threshold variations (see the second code sketch after the table) | |
Receiver Operating Characteristic (ROC) Curve, i.e. plot of \(FPR\) (on \(x\) axis) vs. \(TPR\) (on \(y\) axis). | Precision-Recall Curve (PRC), i.e. plot of recall / \(TPR\) (on \(x\) axis) vs. precision / \(PPV\) (on \(y\) axis). |
Area under the ROC Curve: \(AUROC = \int_0^1 ROC\left(x\right)\,dx\) as the integral over the function \(ROC\left(x\right)\) described by the ROC Curve | Area under the PRC: \(AUPRC = \int_0^1 PRC\left(x\right)\,dx\) as the integral over the function \(PRC\left(x\right)\) described by the PRC |
Measures for comparison of two predictions | |
(Cohen’s) Kappa: \(\kappa = \frac{p_0 - p_c}{1 - p_c}\) where \(p_0\) is the agreement between the predictions and \(p_c\) is the agreement expected for a random prediction | (Cohen’s) Weighted Kappa: (Cohen’s) Kappa \(\kappa\) with additional weights included, e.g. according to risks or costs |
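The fixed-threshold metrics listed in the table can all be computed directly from the four confusion-matrix counts. The following is a minimal Python sketch, not taken from the study: the function name and the example counts are illustrative, and it assumes non-degenerate counts (no zero denominators).

```python
import math

def confusion_metrics(tp, fp, tn, fn, beta=1.0):
    """Fixed-threshold metrics from the table, computed from raw counts.

    Illustrative sketch only: assumes no denominator (P, N, PP, PN, PPV + TPR)
    is zero.
    """
    p, n = tp + fn, tn + fp      # actual positives / negatives
    pp, pn = tp + fp, tn + fn    # predicted positives / negatives

    tpr = tp / p                 # sensitivity / recall / true positive rate
    tnr = tn / n                 # specificity / true negative rate
    ppv = tp / pp                # precision / positive predictive value
    npv = tn / pn                # negative predictive value

    acc = (tp + tn) / (p + n)
    err = 1.0 - acc
    ba = (tpr + tnr) / 2.0

    f1 = 2.0 * ppv * tpr / (ppv + tpr)
    fbeta = (1.0 + beta**2) * ppv * tpr / (beta**2 * ppv + tpr)

    # Matthews Correlation Coefficient in the rate-based form used in the table
    mcc = (math.sqrt(tpr * tnr * ppv * npv)
           - math.sqrt((1 - tpr) * (1 - tnr) * (1 - ppv) * (1 - npv)))
    gm = math.sqrt(tpr * tnr)    # geometric mean of sensitivity and specificity

    return {"TPR": tpr, "TNR": tnr, "PPV": ppv, "NPV": npv,
            "Acc": acc, "Err": err, "BA": ba,
            "F1": f1, "Fbeta": fbeta, "MCC": mcc, "GM": gm}

# Made-up counts, purely for illustration
print(confusion_metrics(tp=80, fp=10, tn=90, fn=20))
```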
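For the threshold-varying measures (ROC, PRC and their areas), one possible implementation uses scikit-learn's curve utilities. A minimal sketch, with made-up labels and scores purely for illustration; note that the trapezoidal AUPRC shown here is one common convention, with `average_precision_score` being a step-wise alternative.

```python
import numpy as np
from sklearn.metrics import roc_curve, precision_recall_curve, auc

# Illustrative ground-truth labels and continuous classifier scores
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.3, 0.9, 0.6, 0.5])

# ROC curve: FPR on the x axis, TPR on the y axis, one point per threshold
fpr, tpr, _ = roc_curve(y_true, y_score)
auroc = auc(fpr, tpr)

# PR curve: recall (TPR) on the x axis, precision (PPV) on the y axis
precision, recall, _ = precision_recall_curve(y_true, y_score)
auprc = auc(recall, precision)  # trapezoidal area under the PR curve

print(f"AUROC = {auroc:.3f}, AUPRC = {auprc:.3f}")
```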
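Cohen's kappa and its weighted variant compare two predictions (or a prediction against a reference rating). The sketch below implements \(\kappa = \frac{p_0 - p_c}{1 - p_c}\) through the equivalent weighted form \(\kappa = 1 - \frac{\sum_{ij} w_{ij} o_{ij}}{\sum_{ij} w_{ij} e_{ij}}\); the function name, the optional weight-matrix handling and the example vectors are illustrative assumptions, not the study's implementation.

```python
import numpy as np

def cohens_kappa(y1, y2, weights=None):
    """Cohen's kappa between two label vectors of equal length.

    weights: optional square disagreement-weight matrix (zero diagonal),
    e.g. encoding risks or costs. With weights=None every disagreement
    counts equally, which reduces to kappa = (p0 - pc) / (1 - pc).
    """
    y1, y2 = np.asarray(y1), np.asarray(y2)
    labels = np.unique(np.concatenate([y1, y2]))
    k = len(labels)
    idx = {lab: i for i, lab in enumerate(labels)}

    # Observed joint distribution of the two predictions
    observed = np.zeros((k, k))
    for a, b in zip(y1, y2):
        observed[idx[a], idx[b]] += 1
    observed /= observed.sum()

    # Expected joint distribution under chance agreement (independent marginals)
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0))

    if weights is None:
        weights = 1.0 - np.eye(k)   # unweighted case
    weights = np.asarray(weights, dtype=float)

    return 1.0 - (weights * observed).sum() / (weights * expected).sum()

# Example: agreement between two illustrative binary predictions
pred_a = [1, 0, 1, 1, 0, 1, 0, 0]
pred_b = [1, 0, 1, 0, 0, 1, 1, 0]
print(cohens_kappa(pred_a, pred_b))  # 0.5 for these made-up vectors
```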