Table 4 P-values from DeLong’s test for AUC comparisons between XGBoost models (36 vs. 18 features) and scoring systems APACHE II and SOFA

From: Developing a high-performance AI model for spontaneous intracerebral hemorrhage mortality prediction using machine learning in ICU settings

| Model | Accuracy | Sensitivity | Specificity | F1-score | AUC | P-value (vs. XGBoost 36) * | P-value (vs. XGBoost 18) ** |
|---|---|---|---|---|---|---|---|
| XGBoost (36 features) | 0.844 | 0.756 | 0.866 | 0.657 | 0.913 | - | 1 |
| XGBoost (18 features) | 0.828 | 0.826 | 0.829 | 0.654 | 0.913 | 1 | - |
| APACHE II | 0.725 | 0.733 | 0.723 | 0.490 | 0.826 | < 0.001 | < 0.001 |
| SOFA | 0.727 | 0.698 | 0.734 | 0.520 | 0.788 | < 0.001 | < 0.001 |

Note: *P-value (vs. XGBoost 36): DeLong’s test comparing AUC with XGBoost (36 features). **P-value (vs. XGBoost 18): DeLong’s test comparing AUC with XGBoost (18 features).
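
The table reports only the resulting p-values and does not show how DeLong's test was computed. As a rough illustration, the following is a minimal sketch of the standard DeLong procedure for two correlated AUCs measured on the same patients, assuming binary outcome labels and one score per model. The names `delong_test` and `_placement_values`, and the toy data in the usage note, are hypothetical and are not taken from the study.

```python
import math

import numpy as np


def _placement_values(pos, neg):
    """Return (V10, V01): placement values per positive and per negative case."""
    # psi[i, j] = 1 if pos[i] > neg[j], 0.5 if tied, 0 otherwise
    psi = (pos[:, None] > neg[None, :]).astype(float)
    psi += 0.5 * (pos[:, None] == neg[None, :])
    return psi.mean(axis=1), psi.mean(axis=0)


def delong_test(y_true, scores_a, scores_b):
    """Two-sided DeLong test for the difference between two correlated AUCs.

    Returns (auc_a, auc_b, p_value). Both score vectors must refer to the
    same cases, in the same order as y_true (1 = event, 0 = no event).
    """
    y = np.asarray(y_true).astype(bool)
    sa = np.asarray(scores_a, dtype=float)
    sb = np.asarray(scores_b, dtype=float)
    m, n = int(y.sum()), int((~y).sum())
    if m < 2 or n < 2:
        raise ValueError("need at least two positive and two negative cases")

    v10_a, v01_a = _placement_values(sa[y], sa[~y])
    v10_b, v01_b = _placement_values(sb[y], sb[~y])
    auc_a, auc_b = float(v10_a.mean()), float(v10_b.mean())

    # 2x2 sample covariance matrices of the placement values (rows = models)
    s10 = np.cov(np.vstack([v10_a, v10_b]))
    s01 = np.cov(np.vstack([v01_a, v01_b]))
    s = s10 / m + s01 / n

    # Variance of (AUC_a - AUC_b), accounting for the correlation between models
    var_diff = s[0, 0] + s[1, 1] - 2.0 * s[0, 1]
    if var_diff <= 0:
        # Degenerate case (e.g. identical scores); convention chosen for this sketch
        return auc_a, auc_b, 1.0 if auc_a == auc_b else 0.0

    z = (auc_a - auc_b) / math.sqrt(var_diff)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return auc_a, auc_b, p_value
```

A call such as `delong_test([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1], [0.9, 0.3, 0.4, 0.1])` returns the two AUCs and the p-value for their difference. In this formulation, a zero AUC difference gives z = 0 and therefore a two-sided p-value of 1.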