Performance of large language models (LLMs) in providing prostate cancer information

Keywords: Patient Education; Grade Level
DOI: 10.1186/s12894-024-01570-0 · Publication Date: 2024-08-23
ABSTRACT
The diagnosis and management of prostate cancer (PCa), the second most common cancer in men worldwide, are highly complex. Hence, patients often seek knowledge through additional resources, including AI chatbots such as ChatGPT and Google Bard. This study aimed to evaluate the performance of large language models (LLMs) in providing patient education on PCa. Common patient questions about PCa were collected from reliable educational websites and evaluated for accuracy, comprehensiveness, readability, and stability by two independent board-certified urologists, with a third resolving discrepancies. Accuracy was measured on a 3-point scale, comprehensiveness on a 5-point Likert scale, and readability using the Flesch Reading Ease (FRE) score and the Flesch–Kincaid (FK) Grade Level. A total of 52 questions covering general knowledge, diagnosis, treatment, and prevention were provided to three LLMs. Although there was no significant difference in overall accuracy among the LLMs, ChatGPT-3.5 demonstrated superiority over the other models (p = 0.018). ChatGPT-4 achieved greater comprehensiveness than Bard (p = 0.028). For readability, ChatGPT-3.5 generated simpler sentences, with the highest FRE score (54.7, p < 0.001) and the lowest reading grade level (10.2, p < 0.001). LLMs, particularly ChatGPT-3.5, can generate accurate, comprehensive, and easily readable PCa material. These models might not replace healthcare professionals but can assist in patient guidance.
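The readability metrics used in the study are simple closed-form formulas over sentence, word, and syllable counts. A minimal sketch of both is given below; the syllable counter is a naive vowel-group heuristic (an assumption for illustration — published studies typically use validated tools rather than this approximation):

```python
import re

def count_syllables(word: str) -> int:
    # Naive heuristic: count contiguous vowel groups; at least one per word.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(text: str) -> tuple[float, float]:
    """Return (FRE score, FK grade level) using the standard Flesch formulas."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / len(sentences)           # mean words per sentence
    spw = syllables / len(words)                # mean syllables per word
    fre = 206.835 - 1.015 * wps - 84.6 * spw    # Flesch Reading Ease (higher = easier)
    fk = 0.39 * wps + 11.8 * spw - 15.59        # Flesch-Kincaid Grade Level
    return fre, fk
```

An FRE of 54.7 falls in the "fairly difficult" band, and an FK level of 10.2 corresponds to roughly a tenth-grade reading level, which is how the abstract's figures should be interpreted.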