First author (year) | Outcome variable | Predictors (input variables) | ML technique | Cross-validation method (I = internal, E = external) | Type | Program used | Best algorithm performance |
---|---|---|---|---|---|---|---|
Acion (2017) [36] | Substance abuse treatment success | 28; 10 patient characteristics, 3 treatment factors, referral type, problematic substance characteristics, and mental health problem | LR, RLR, Lasso-LR, EN, RF, DNN, EL | Two-fold cross-validation (I) | Classification | R; H2O R interface and package rROC | AUC: 0.793–0.820; best model: EL |
Augsburger (2017) [31] | Risk-taking behavior as measured using a balloon analog risk task (BART) | Exposure to different types of childhood maltreatment, experiences of war and torture, lifetime traumatic events and symptoms of depression and PTSD, sociodemographic factors | Stochastic GBM | Tenfold cross-validation with three repetitions (I) | Regression | R; gbm & caret | RMSE: 18.70; R-squared: 0.20 |
Baird (2022) [35] | Psychological trauma as measured on the GHQ-12 | 18 digitally coded features in self-portraits and free drawings | Lasso-R (single model) | K-fold cross-validation (I) | Regression | Not reported | R-squared: 0.108 |
Castilla-Puentes (2021) [41] | Tone, topics, and attitude of digital conversations | Digital conversations | NLP and text mining | Not used | Unsupervised (topic modeling) | CulturIntel | Not reported |
Choi (2020) [32] | Psychological distress as measured using the Kessler Psychological Distress Scale (K10) | Demographic characteristics, three types of discrimination characteristics, three types of coping mechanisms | ANN | Not used | Classification | SPSS | AUC: 0.806 |
Drydakis (2021) [33] | Increased level of integration, overall health, and mental health | Number of mobile applications in use that facilitate immigrants’ societal integration | Linear regression | Not used | Regression | Not reported | p < 0.005 |
Erol (2022) [34] | Symptom severity of depression and PTSD | Demographic data, PTSD and depression levels, access to food and education, and changes in family income | Linear regression | Not used | Regression | SPSS | R-squared: 0.123 |
Goldstein (2022) [37] | Suicidal ideation in the past year | Experience of discrimination, demographics | Deep-learning NLP algorithms and LR | Not used | Classification | Not reported | Not reported |
Haroz (2020) [39] | Suicide attempts, measured at 6, 12, and 24 months after an initial suicide-related event | 73; demographic characteristics, educational history, past mental health, substance use, living status, history of domestic violence, participation in tribal activities, knowing anyone who died by suicide in their lifetime, and number of indexed events | RF, SVM, Lasso-R, RLR | Repeated cross-validation with 10 iterations (I) | Classification | Not reported | AUC: 0.87 |
Huber (2020) [38] | Migrant status | 653 variables | LR, DTs, SVM, and naive Bayes | 5-fold cross-validation (I) | Classification | Not reported | DT Accuracy: 74.5%; AUC: 0.75 |
Khatua (2021) [40] | Tweets that fall into 3 themes: generic views, initial struggles, and subsequent settlement | Tweets | Bi-LSTM, CNN, BERT | Training and testing | Classification | Python | F1-Score: 61.61–75.89% |
Liu (2021) [43] | MH diagnosis from EHR | Copy number variation | Multi-layer perceptron | Two-fold random shuffle test validation (I) | Classification | Python; Scikit-learn package | Accuracy: 65.7% |
Liu (2021) [42] | ADHD diagnosis | Copy number variation | Multi-layer perceptron | Two-fold random shuffle test validation (E) | Classification | Python; Scikit-learn package | Accuracy: 75.4% |