Table 5 Results of the developed machine learning (ML) models on the external validation dataset, with their respective 95% confidence intervals (C.I.)

From: Second opinion machine learning for fast-track pathway assignment in hip and knee replacement surgery: the use of patient-reported outcome measures

 

| Metric | HST | FIGS | LR | SVM | RF | XGB | DT | MLP |
|---|---|---|---|---|---|---|---|---|
| Accuracy | 0.687 ± 0.061 | **0.784 ± 0.054** | **0.734 ± 0.058** | **0.751 ± 0.057** | **0.800 ± 0.052** | **0.798 ± 0.053** | **0.712 ± 0.059** | 0.724 ± 0.022 |
| Sensitivity | 0.623 ± 0.037 | 0.820 ± 0.029 | 0.741 ± 0.033 | 0.756 ± 0.033 | 0.811 ± 0.029 | **0.994 ± 0.006** | 0.695 ± 0.035 | 0.750 ± 0.025 |
| Specificity | **0.866 ± 0.043** | 0.716 ± 0.059 | 0.715 ± 0.057 | 0.736 ± 0.056 | 0.721 ± 0.056 | 0.255 ± 0.055 | 0.757 ± 0.054 | 0.658 ± 0.044 |
| Balanced accuracy | 0.744 ± 0.000* | **0.768 ± 0.001*** | 0.728 ± 0.001* | 0.746 ± 0.001 | **0.766 ± 0.001** | 0.625 ± 0.001 | 0.726 ± 0.001 | 0.704 ± 0.001* |
| PPV | **0.928 ± 0.024** | 0.878 ± 0.026 | 0.878 ± 0.027 | 0.888 ± 0.026 | 0.897 ± 0.024 | 0.787 ± 0.028 | 0.888 ± 0.027 | 0.847 ± 0.022 |
| NPV | 0.454 ± 0.046 | 0.580 ± 0.058 | 0.500 ± 0.053 | 0.522 ± 0.053 | 0.600 ± 0.056 | **0.938 ± 0.058** | 0.474 ± 0.050 | 0.511 ± 0.041 |
| AUC | 0.812 ± 0.001 | **0.813 ± 0.001*** | 0.811 ± 0.001* | 0.809 ± 0.001 | 0.812 ± 0.000* | 0.807 ± 0.000* | 0.794 ± 0.001 | 0.759 ± 0.001* |
| F1 | 0.745 ± 0.029 | **0.848 ± 0.021** | 0.804 ± 0.024 | **0.817 ± 0.024** | **0.858 ± 0.021** | **0.851 ± 0.018** | 0.780 ± 0.026 | 0.796 ± 0.019 |
| Brier | **0.176 ± 0.052** | **0.177 ± 0.026** | **0.184 ± 0.041** | **0.145 ± 0.076** | **0.161 ± 0.027** | **0.170 ± 0.024** | **0.180 ± 0.075** | **0.193 ± 0.12** |
| sNB | 0.574 ± 0.001* | **0.728 ± 0.000*** | 0.638 ± 0.001* | 0.661 ± 0.001 | 0.727 ± 0.001 | 0.724 ± 0.001 | 0.608 ± 0.001 | 0.615 ± 0.001* |
| MCC | **0.352 ± 0.049** | **0.391 ± 0.049** | **0.372 ± 0.049** | **0.365 ± 0.049** | **0.426 ± 0.049** | **0.350 ± 0.049** | 0.328 ± 0.049 | **0.383 ± 0.049** |
| HC accuracy | **0.729 ± 0.058** | **0.772 ± 0.055** | **0.771 ± 0.055** | **0.776 ± 0.055** | **0.820 ± 0.050** | **0.722 ± 0.059** | **0.785 ± 0.054** | 0.740 ± 0.022 |
| HC sensitivity | 0.631 ± 0.037 | 0.716 ± 0.034 | 0.697 ± 0.035 | 0.701 ± 0.035 | 0.742 ± 0.033 | **0.985 ± 0.009** | 0.759 ± 0.033 | 0.684 ± 0.027 |
| HC specificity | **0.944 ± 0.029** | **0.901 ± 0.048** | **0.936 ± 0.031** | **0.937 ± 0.031** | **0.964 ± 0.020** | 0.335 ± 0.060 | 0.854 ± 0.045 | 0.846 ± 0.034* |
| HC PPV | **0.961 ± 0.018** | **0.963 ± 0.018** | **0.960 ± 0.016** | **0.960 ± 0.016** | **0.982 ± 0.011** | 0.686 ± 0.032* | 0.934 ± 0.021 | 0.893 ± 0.019* |
| HC NPV | 0.539 ± 0.046 | 0.580 ± 0.058 | 0.582 ± 0.052 | 0.591 ± 0.052 | 0.658 ± 0.054 | **0.938 ± 0.058** | 0.565 ± 0.050 | 0.587 ± 0.04 |
| Coverage | **0.567 ± 0.065** | **0.669 ± 0.062** | 0.501 ± 0.065 | 0.501 ± 0.065 | 0.501 ± 0.065 | 0.501 ± 0.065 | **0.651 ± 0.062** | 0.501 ± 0.025 |

  1. For each metric and model, an asterisk (*) denotes that the performance of that model on the external validation dataset was significantly worse than on the internal validation dataset
  2. Asterisk denotes a significant difference between the two cohorts, at the 95% confidence level
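
As a reading aid, the balanced accuracy row can be cross-checked against the sensitivity and specificity rows, since balanced accuracy is by definition the arithmetic mean of the two. The following is a minimal sketch, not the authors' code: the names `ROWS`, `balanced_accuracy`, `TOLERANCE`, and `consistent` are hypothetical, the point estimates are copied from Table 5 with the confidence intervals omitted, and the tolerance is an assumption chosen to absorb three-decimal rounding.

```python
# Point estimates copied from the Sensitivity, Specificity, and
# Balanced accuracy rows of Table 5 (confidence intervals omitted).
# Tuple layout: (sensitivity, specificity, reported balanced accuracy).
ROWS = {
    "HST": (0.623, 0.866, 0.744),
    "FIGS": (0.820, 0.716, 0.768),
    "LR": (0.741, 0.715, 0.728),
    "SVM": (0.756, 0.736, 0.746),
    "RF": (0.811, 0.721, 0.766),
    "XGB": (0.994, 0.255, 0.625),
    "DT": (0.695, 0.757, 0.726),
    "MLP": (0.750, 0.658, 0.704),
}

# Assumption: the table rounds to three decimals, so allow a gap of up to 0.001.
TOLERANCE = 1e-3


def balanced_accuracy(sensitivity: float, specificity: float) -> float:
    """Return the arithmetic mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2


# For each model, check whether the recomputed value agrees with the reported one.
consistent = {
    model: abs(balanced_accuracy(sens, spec) - reported) <= TOLERANCE
    for model, (sens, spec, reported) in ROWS.items()
}

for model, ok in consistent.items():
    print(f"{model}: {'consistent' if ok else 'inconsistent'}")
```

The same pattern makes the XGB trade-off easy to see: its very high sensitivity is paired with a low specificity, which is why its balanced accuracy sits well below its plain accuracy.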