
Table 5 Post-hoc pairwise comparison of scores in AI chatbots

From: Can artificial intelligence models serve as patient information consultants in orthodontics?

 

| Comparison | Modified DISCERN Score (p) | Flesch Reading Ease Score (p) | GQS Score (p) |
| --- | --- | --- | --- |
| ChatGPT-3.5 vs. ChatGPT-4 | 0.597 | 0.371 | 0.165 |
| ChatGPT-3.5 vs. Gemini | 0.023 | 0.001 | 0.036 |
| ChatGPT-3.5 vs. Copilot | 0.0001 | 0.758 | 0.0001 |
| ChatGPT-4 vs. Gemini | 0.348 | 0.017 | 0.483 |
| ChatGPT-4 vs. Copilot | 0.0001 | 0.919 | 0.007 |
| Gemini vs. Copilot | 0.0001 | 0.002 | 0.042 |

Significance threshold: p < 0.05. Tukey’s post hoc test was used for the Modified DISCERN Score and the Flesch Reading Ease Score; Dunn’s post hoc test was used for GQS.
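As a reading aid, the threshold in the table note can be applied to the reported p-values mechanically. The sketch below is only an illustration: the dictionary layout and the function name `significant_pairs` are hypothetical, the numbers are transcribed from the table, and it assumes the reported values are already the adjusted post hoc p-values, so they are compared directly against 0.05.

```python
# Illustrative sketch: values transcribed from Table 5; names are hypothetical.
ALPHA = 0.05  # threshold stated in the table note

P_VALUES = {
    "ChatGPT-3.5 vs. ChatGPT-4": {"Modified DISCERN": 0.597, "Flesch Reading Ease": 0.371, "GQS": 0.165},
    "ChatGPT-3.5 vs. Gemini": {"Modified DISCERN": 0.023, "Flesch Reading Ease": 0.001, "GQS": 0.036},
    "ChatGPT-3.5 vs. Copilot": {"Modified DISCERN": 0.0001, "Flesch Reading Ease": 0.758, "GQS": 0.0001},
    "ChatGPT-4 vs. Gemini": {"Modified DISCERN": 0.348, "Flesch Reading Ease": 0.017, "GQS": 0.483},
    "ChatGPT-4 vs. Copilot": {"Modified DISCERN": 0.0001, "Flesch Reading Ease": 0.919, "GQS": 0.007},
    "Gemini vs. Copilot": {"Modified DISCERN": 0.0001, "Flesch Reading Ease": 0.002, "GQS": 0.042},
}


def significant_pairs(p_values, alpha=ALPHA):
    """Group the comparisons whose p-value is below alpha by metric."""
    result = {}
    for pair, metrics in p_values.items():
        for metric, p in metrics.items():
            if p < alpha:
                result.setdefault(metric, []).append(pair)
    return result


if __name__ == "__main__":
    for metric, pairs in significant_pairs(P_VALUES).items():
        print(f"{metric}: {', '.join(pairs)}")
```

Read this way, ChatGPT-3.5 vs. ChatGPT-4 is the only comparison that does not fall below the threshold on any of the three measures.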