Source: Multimodal machine learning for language and speech markers identification in mental health
Modality - Model | # Features | Accuracy | AUC-ROC | F1 (class 0) | F1 (class 1) |
---|---|---|---|---|---|
Text - SVM | 20 | 86.77% | 93.33% | 0.91 | 0.78 |
Audio - SVM | 20 | 68.61% | 61.58% | 0.81 | 0.12 |
Multimodal - SVM | 20 text, 10 audio | 86.71% | 92.74% | 0.92 | 0.80 |
Text - RF | 25 | 85.72% | 91.75% | 0.90 | 0.73 |
Audio - RF | 15 | 71.83% | 75.20% | 0.82 | 0.40 |
Multimodal - RF | 20 text, 10 audio | 80.87% | 86.39% | 0.87 | 0.63 |
Text - LogReg | 20 | 87.82% | 92.44% | 0.91 | 0.79 |
Audio - LogReg | 10 | 68.62% | 59.12% | 0.80 | 0.21 |
Multimodal - LogReg | 20 text, 10 audio | 86.73% | 89.55% | 0.90 | 0.80 |
Text - FCNN | 25 | 84.11% | 91.79% | 0.89 | 0.74 |
Audio - FCNN | 10 | 69.70% | 68.23% | 0.80 | 0.46 |
Multimodal - FCNN | 20 text, 15 audio | 84.59% | 89.55% | 0.90 | 0.76 |
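
For context, below is a minimal sketch of how a multimodal configuration like "20 text, 10 audio" could be scored with the same metrics reported in the table (accuracy, AUC-ROC, and per-class F1). This is not the authors' pipeline: the use of scikit-learn, the synthetic feature matrices, the SVM hyperparameters, and the feature-level fusion by simple concatenation are all assumptions made for illustration.

```python
# Sketch: evaluate an SVM on concatenated text + audio features and report
# accuracy, AUC-ROC, and per-class F1. All data here is synthetic placeholder.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, roc_auc_score, f1_score

rng = np.random.default_rng(0)

# Placeholder feature matrices: 20 text-derived and 10 audio-derived features
# per sample (the "20 text, 10 audio" configuration), with binary labels.
n_samples = 500
X_text = rng.normal(size=(n_samples, 20))
X_audio = rng.normal(size=(n_samples, 10))
y = rng.integers(0, 2, size=n_samples)

# Feature-level fusion: concatenate the modality-specific feature vectors.
X_multi = np.hstack([X_text, X_audio])

X_train, X_test, y_train, y_test = train_test_split(
    X_multi, y, test_size=0.2, stratify=y, random_state=0
)

# SVM with probability estimates enabled so AUC-ROC can be computed.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
y_score = clf.predict_proba(X_test)[:, 1]

print(f"Accuracy : {accuracy_score(y_test, y_pred):.4f}")
print(f"AUC-ROC  : {roc_auc_score(y_test, y_score):.4f}")
f1_per_class = f1_score(y_test, y_pred, average=None)  # index 0 = class 0, 1 = class 1
print(f"F1 class 0: {f1_per_class[0]:.2f}  F1 class 1: {f1_per_class[1]:.2f}")
```

The same script covers the text-only and audio-only rows by passing just that modality's matrix in place of `X_multi`; swapping `SVC` for `RandomForestClassifier`, `LogisticRegression`, or a small fully connected network would mirror the other model rows.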