
Table 4 Unresponsiveness of GPT-3.5 and GPT-4

From: Large-scale identification of social and behavioral determinants of health from clinical notes: comparison of Latent Semantic Indexing and Generative Pretrained Transformer (GPT) models

  

| SBDH Category | N | GPT-3.5: % Disagreement | GPT-3.5: % No Response | GPT-4: % No Response |
|---|---|---|---|---|
| Housing insecurity | 48 | 6.3% | 0.0% | 0.0% |
| Tobacco use | 52 | 3.8% | 15.4% | 3.8% |
| Opiate abuse | 42 | 7.1% | 0.0% | 0.0% |
| Alcohol abuse | 41 | 2.4% | 0.0% | 0.0% |
| Cocaine abuse | 51 | 0.0% | 2.0% | 0.0% |
| Physical & sexual abuse | 39 | 2.6% | 5.1% | 0.0% |
| Financial insecurity | 30 | 6.7% | 30.0% | 0.0% |
| Legal circumstances | 27 | 22.2% | 0.0% | 0.0% |
| Financial circumstances | 22 | 13.6% | 4.5% | 0.0% |

  1. On a set of shared patient documents (N), GPT-3.5 was prompted five independent times, whereas GPT-4 was prompted only once. % No Response is the percentage of documents for which GPT-3.5 or GPT-4 did not provide a response, reported for each SBDH category. % Disagreement is the percentage of documents for which GPT-3.5 gave conflicting binary responses across its five prompts.
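
The two percentages described in the footnote can be illustrated with a minimal Python sketch. It is not the authors' code. The function name `summarize_runs` and the encoding of each run as `True`, `False`, or `None` (no response) are hypothetical. The footnote does not say how a document with a mix of answered and unanswered GPT-3.5 runs was counted, so the sketch assumes a document counts as "no response" only when every run returned nothing, and as a disagreement when its answered runs contain both `True` and `False`.

```python
from typing import Dict, Optional, Sequence

# One run's output: True/False is a binary answer, None means no response.
Response = Optional[bool]


def summarize_runs(runs_per_document: Sequence[Sequence[Response]]) -> Dict[str, float]:
    """Compute % disagreement and % no response over N documents.

    Assumption (not stated in the source): a document is "no response"
    only if all of its runs are None.
    """
    n = len(runs_per_document)
    if n == 0:
        raise ValueError("need at least one document")

    disagree = 0
    no_response = 0
    for runs in runs_per_document:
        # Distinct binary answers among the runs that responded.
        answers = {r for r in runs if r is not None}
        if len(answers) == 2:
            disagree += 1      # both True and False were given
        if not answers:
            no_response += 1   # no run produced an answer

    return {
        "n": n,
        "pct_disagreement": 100.0 * disagree / n,
        "pct_no_response": 100.0 * no_response / n,
    }


if __name__ == "__main__":
    # Made-up toy input: four documents, five runs each.
    toy = [
        [True, True, True, True, True],
        [True, False, True, True, True],
        [None, None, None, None, None],
        [False, None, False, False, False],
    ]
    print(summarize_runs(toy))
```

For a model prompted once per document, as GPT-4 was here, each inner list has a single entry and the disagreement percentage is zero by construction, which is consistent with the table reporting disagreement for GPT-3.5 only.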