Table 2 Performance of LSI predictions of SBDH categories

From: Large-scale identification of social and behavioral determinants of health from clinical notes: comparison of Latent Semantic Indexing and Generative Pretrained Transformer (GPT) models

| SBDH category (keyword query) | Predicted N | PPV, top 10 | PPV, median 10 | PPV, bottom 10 | PPV, average |
|---|---|---|---|---|---|
| Tobacco use (Smokes) | 2195 | 100% | 90% | 80% | 90% |
| Alcohol abuse (EtOH) | 1080 | 100% | 100% | 100% | 100% |
| Opiate abuse (Opiate) | 444 | 100% | 60% | 50% | 70% |
| Cocaine abuse (Cocaine) | 1852 | 100% | 70% | 40% | 70% |
| Housing insecurity (Homeless) | 470 | 100% | 80% | 70% | 83% |
| Physical & sexual abuse (Abused) | 121 | 80% | 50% | 30% | 53% |
| Financial insecurity (Unemployed) | 809 | 100% | 90% | 100% | 97% |
| Legal circumstances (Legal) | 1052 | 80% | 50% | 20% | 50% |
| Financial circumstances (Financial) | 402 | 100% | 60% | 90% | 83% |
| Compliance (Noncompliant) | 402 | 100% | 100% | 90% | 97% |
| Mobility issues (Walker) | 3235 | 90% | 100% | 90% | 93% |
| Lack of English proficiency (Interpreter) | 1621 | 100% | 90% | 80% | 90% |
| Caregiver dependency (Caretaker) | 443 | 100% | 90% | 60% | 83% |
| Suicidal ideation (Suicide) | 1090 | 100% | 60% | 40% | 67% |
| Lack of transportation (Transportation) | 452 | 60% | 70% | 70% | 67% |

  1. The terms in parentheses indicate the query word used to rank all patients in the dataset
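
The reported averages are consistent with an unweighted mean of the three sampled PPVs (top 10, median 10, bottom 10), rounded to the nearest whole percent. The source does not state this; it is inferred from the values in the table. The minimal Python sketch below checks that assumption against every row. The helper name `average_ppv` and the `rows` layout are hypothetical, introduced only for illustration.

```python
def average_ppv(top10, median10, bottom10):
    """Unweighted mean of three sampled PPVs (in percent), rounded to a whole percent.

    Assumption: this is how the Average column was derived; the source does not say so.
    """
    return round((top10 + median10 + bottom10) / 3)


# (top 10, median 10, bottom 10, reported average), all in percent, copied from Table 2.
rows = {
    "Tobacco use": (100, 90, 80, 90),
    "Alcohol abuse": (100, 100, 100, 100),
    "Opiate abuse": (100, 60, 50, 70),
    "Cocaine abuse": (100, 70, 40, 70),
    "Housing insecurity": (100, 80, 70, 83),
    "Physical & sexual abuse": (80, 50, 30, 53),
    "Financial insecurity": (100, 90, 100, 97),
    "Legal circumstances": (80, 50, 20, 50),
    "Financial circumstances": (100, 60, 90, 83),
    "Compliance": (100, 100, 90, 97),
    "Mobility issues": (90, 100, 90, 93),
    "Lack of English proficiency": (100, 90, 80, 90),
    "Caregiver dependency": (100, 90, 60, 83),
    "Suicidal ideation": (100, 60, 40, 67),
    "Lack of transportation": (60, 70, 70, 67),
}

# Collect any category whose recomputed average differs from the reported one.
mismatches = [
    name
    for name, (top, med, bot, reported) in rows.items()
    if average_ppv(top, med, bot) != reported
]
print("Rows inconsistent with the assumed formula:", mismatches)
```

Each PPV here rests on a sample of 10 reviewed patients, so a single reclassified chart moves a sampled value by 10 percentage points and the average by about 3.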