Artificial intelligence large language model ChatGPT: is it a trustworthy and reliable source of information for sarcoma patients?

Quartile Trustworthiness
DOI: 10.3389/fpubh.2024.1303319
Publication Date: 2024-03-22T04:53:03Z
ABSTRACT
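Introduction: Since its introduction in November 2022, the artificial intelligence large language model ChatGPT has taken the world by storm. Among other applications, it can be used by patients as a source of information on diseases and their treatments. However, little is known about the quality of the sarcoma-related information ChatGPT provides. We therefore aimed at analyzing how sarcoma experts evaluate the quality of ChatGPT’s responses to sarcoma-related inquiries and at assessing the bot’s answers on specific evaluation metrics.
Methods: The ChatGPT responses to a sample of 25 sarcoma-related questions (5 definitions, 9 general questions, and 11 treatment-related inquiries) were evaluated by 3 independent sarcoma experts. Each response was compared with authoritative resources and international guidelines and graded on 5 different metrics using a 5-point Likert scale: completeness, misleadingness, accuracy, being up-to-date, and appropriateness. This resulted in a maximum of 25 and a minimum of 5 points per answer, with higher scores indicating a higher response quality. Scores ≥21 points were rated as very good and scores between 16 and 20 as good, while scores ≤15 points were classified as poor (11–15) and very poor (≤10).
Results: The median score that ChatGPT’s answers achieved was 18.3 points (IQR, i.e., Inter-Quartile Range, 12.3–20.3 points). Six answers were classified as very good and 9 as good, while 5 answers each were rated as poor and very poor. The best scores were documented for how appropriate the response was for patients (median, 3.7 points; IQR, 2.5–4.2 points), which were significantly higher than the accuracy scores (median, 3.3 points; IQR, 2.0–4.2 points; p = 0.035). ChatGPT fared considerably worse with treatment-related questions, with only 45% of its responses classified as good or very good, compared to general questions (78% of responses good/very good) and definitions (60% of responses good/very good).
Discussion: The answers ChatGPT provided on a rare disease, such as sarcoma, were found to be of very inconsistent quality, with some answers classified as very good and others as very poor. Sarcoma physicians should be aware of the risks of misinformation that ChatGPT poses and advise their patients accordingly.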
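The scoring scheme described in the Methods can be illustrated with a minimal sketch. It is not the authors' code. The function and variable names are hypothetical, and two points are assumptions the abstract does not state: that an answer's score is the mean of the three experts' totals (which would explain fractional values such as 18.3), and that a fractional score falls into the band whose lower bound it reaches. The sketch also assumes every metric, including misleadingness, is oriented so that a higher rating means better quality.

```python
from statistics import mean

# Metric names follow the abstract; the identifiers themselves are hypothetical.
METRICS = ("completeness", "misleadingness", "accuracy",
           "up_to_date", "appropriateness")


def total_score(ratings):
    """Sum one expert's five Likert ratings (1-5 each) for a single answer."""
    for metric in METRICS:
        value = ratings[metric]
        if not 1 <= value <= 5:
            raise ValueError(f"{metric} must be between 1 and 5, got {value}")
    return sum(ratings[metric] for metric in METRICS)


def answer_score(expert_ratings):
    """Assumption: an answer's score is the mean of the experts' totals."""
    return mean(total_score(r) for r in expert_ratings)


def classify(score):
    """Map a score (5-25) to a quality band using the abstract's cut-offs.

    Assumption: a fractional score belongs to the band whose lower bound
    (21, 16 or 11) it reaches.
    """
    if not 5 <= score <= 25:
        raise ValueError(f"score must be between 5 and 25, got {score}")
    if score >= 21:
        return "very good"
    if score >= 16:
        return "good"
    if score >= 11:
        return "poor"
    return "very poor"


# Made-up ratings from three experts for one answer (illustration only).
experts = [
    dict(zip(METRICS, (4, 4, 3, 4, 4))),
    dict(zip(METRICS, (4, 3, 3, 4, 4))),
    dict(zip(METRICS, (4, 4, 3, 3, 4))),
]
score = answer_score(experts)
print(round(score, 1), classify(score))
```

Under these assumptions the reported median of 18.3 points would fall into the "good" band.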