Fig. 1

Relative feature importance for selecting between SGLT2-inhibitor and DPP4-inhibitor treatment, for all clinical features included in model development. a Penalized regression. Feature importance is the proportion of chi-squared explained by the drug-by-covariate interaction terms for each clinical feature in multivariable analysis, as these terms represent differential treatment effects between the two therapies. Bars represent bootstrapped 95% confidence intervals. b Causal forest model. Adjusted importance (based on p-values) reflects the covariates selected most often by trees within the causal forest, after controlling for biased variable selection. Permutation-based tests generate a p-value for each covariate, on the premise that spurious splits would continue to occur when the outcome is permuted, whereas splits reflecting a true underlying association would not. For the purpose of comparison, inverse p-values are presented as relative importance measures.
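The permutation logic behind panel b can be illustrated with a minimal sketch. This is not the causal forest procedure used for the figure: as a simplifying assumption, the per-covariate statistic here is the absolute correlation with the outcome rather than split frequency within a forest, and the function names (`permutation_p_values`, `inverse_p_importance`) and the simulated data are hypothetical. The sketch only shows how permuting the outcome yields a p-value for each covariate and how inverse p-values might be rescaled into relative importance.

```python
import numpy as np


def permutation_p_values(X, y, n_perm=999, seed=0):
    """Permutation p-value per covariate (columns of X).

    Assumption: the importance statistic is the absolute Pearson
    correlation with the outcome, a stand-in for forest split frequency.
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)  # centre covariates once
    x_ss = (Xc ** 2).sum(axis=0)

    def stat(y_vec):
        yc = y_vec - y_vec.mean()
        return np.abs(Xc.T @ yc) / np.sqrt(x_ss * (yc ** 2).sum())

    observed = stat(y)
    exceed = np.zeros(X.shape[1])
    for _ in range(n_perm):
        # Permuting the outcome breaks any true association, so the
        # permuted statistic reflects spurious importance only.
        exceed += stat(rng.permutation(y)) >= observed
    # Add-one correction keeps p-values strictly above zero.
    return (1 + exceed) / (1 + n_perm)


def inverse_p_importance(p_values):
    """Rescale inverse p-values so that they sum to one."""
    inv = 1.0 / np.asarray(p_values, dtype=float)
    return inv / inv.sum()


# Simulated data (hypothetical): only the first covariate drives the outcome.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] + rng.normal(size=200)

p = permutation_p_values(X, y)
importance = inverse_p_importance(p)
print("p-values:", p)
print("relative importance:", importance)
```

Because the add-one correction bounds the smallest attainable p-value at 1/(n_perm + 1), the number of permutations also caps how large an inverse p-value can become, which is worth bearing in mind when comparing bars across covariates.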