Validation | Classification algorithm | ROC-AUC (95% CI) | Sensitivity (%, 95% CI) | Specificity (%, 95% CI) | PPV (%, 95% CI) | NPV (%, 95% CI) | P-value |
---|---|---|---|---|---|---|---|
Internal validation | Logistic regression | 0.840 (0.818, 0.860) | 75.0 (34.9, 96.8) | 85.6 (83.5, 87.5) | 3.3 (2.1, 4.9) | 99.8 (99.3, 99.9) | Reference |
Internal validation | Decision tree | 0.569 (0.541, 0.597) | 100.0 (63.1, 100.0) | 13.8 (11.9, 15.9) | 0.7 (0.7, 0.8) | 100.0 (97.8, 100.0) | < 0.001 |
Internal validation | Gradient Boosting | 0.699 (0.673, 0.725) | 87.5 (47.3, 99.6) | 58.5 (55.7, 61.2) | 1.3 (1.0, 1.7) | 99.8 (99.1, 99.9) | 0.003 |
Internal validation | Random Forest | 0.763 (0.739, 0.787) | 75.0 (34.9, 96.8) | 71.5 (68.8, 73.9) | 1.7 (1.1, 2.5) | 99.7 (99.2, 99.9) | 0.049 |
Internal validation | Naïve Bayes | 0.847 (0.826, 0.867) | 100.0 (63.1, 100.0) | 67.4 (64.7, 70.1) | 1.9 (1.8, 2.1) | 100.0 (99.5, 100.0) | 0.837 |
Internal validation | ANN | 0.856 (0.835, 0.875) | 75.0 (34.9, 96.8) | 84.5 (82.4, 86.5) | 3.1 (2.0, 4.6) | 99.8 (99.3, 99.9) | 0.601 |
Internal validation | Google Vertex AI | 0.842 (0.820, 0.861) | 100.0 (63.1, 100.0) | 66.7 (63.9, 69.3) | 1.9 (1.7, 2.0) | 100.0 (99.5, 100.0) | 0.848 |
External validation | Logistic regression | 0.738 (0.716, 0.759) | 62.5 (24.5, 91.4) | 82.9 (81.1, 84.7) | 1.7 (0.9, 2.9) | 99.7 (99.4, 99.9) | Reference |
External validation | Decision tree | 0.546 (0.522, 0.570) | 87.5 (47.3, 99.6) | 21.5 (19.5, 23.5) | 0.5 (0.4, 0.7) | 99.7 (98.3, 99.9) | 0.008 |
External validation | Gradient Boosting | 0.693 (0.670, 0.715) | 87.5 (47.3, 99.6) | 48.5 (46.1, 50.9) | 0.8 (0.6, 1.0) | 99.8 (99.2, 99.9) | 0.612 |
External validation | Random Forest | 0.746 (0.724, 0.766) | 75.0 (34.9, 96.8) | 75.6 (73.5, 77.7) | 1.4 (0.9, 2.1) | 99.8 (00.4, 99.9) | 0.898 |
External validation | Naïve Bayes | 0.760 (0.739, 0.780) | 87.5 (47.3, 99.6) | 70.5 (68.3, 72.7) | 1.4 (1.0, 1.8) | 99.9 (99.4, 99.9) | 0.765 |
External validation | ANN | 0.784 (0.763, 0.803) | 62.5 (24.5, 91.4) | 93.7 (92.4, 94.8) | 4.5 (2.6, 7.7) | 99.8 (99.5, 99.9) | 0.044 |
External validation | Google Vertex AI | 0.761 (0.740, 0.781) | 87.5 (47.3, 99.6) | 70.7 (68.5, 72.9) | 1.4 (1.0, 1.8) | 99.9 (99.4, 99.9) | 0.756 |