Human versus artificial intelligence‐generated arthroplasty literature: A single‐blinded analysis of perceived communication, quality, and authorship source
03 medical and health sciences
0302 clinical medicine
Artificial Intelligence
Communication
Humans
Authorship
Language
Arthroplasty
DOI:
10.1002/rcs.2621
Publication Date:
2024-02-13T09:39:50Z
AUTHORS (6)
ABSTRACT
Background: Large language models (LLMs) have unknown implications for medical research. This study assessed whether LLM-generated abstracts are distinguishable from human-written abstracts and compared their perceived quality.

Methods: The LLM ChatGPT was used to generate 20 arthroplasty abstracts (AI-generated) from full-text manuscripts; these were compared with the originally published abstracts (human-written). Six blinded orthopaedic surgeons rated each abstract on overall quality, communication, and confidence in the authorship source. Authorship-confidence scores were compared against a test value representing complete inability to discern authorship.

Results: Raters were modestly more confident of human authorship for human-written abstracts than for AI-generated abstracts (p = 0.028), though authorship-confidence scores for AI-generated abstracts were statistically consistent with an inability to discern authorship (p = 0.999). Overall quality was rated higher for human-written abstracts (p = 0.019).

Conclusions: Absolute authorship-confidence ratings for AI-generated abstracts demonstrated difficulty in discerning authorship, but these abstracts did not achieve the perceived quality of human-written abstracts. Caution is warranted when implementing LLMs in scientific writing.
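The Methods describe comparing authorship-confidence scores against a test value representing complete inability to discern authorship, which suggests a one-sample comparison. A minimal sketch of that kind of analysis is below; the rating scale, the neutral value of 5.5, and all scores are hypothetical illustrations, not data from the study.

```python
# Hypothetical sketch of a one-sample authorship-confidence comparison.
# Assumptions (not from the paper): a 1-10 confidence scale whose midpoint
# 5.5 represents "cannot tell human from AI", and made-up example ratings.
from scipy import stats

NEUTRAL = 5.5  # assumed test value: complete inability to discern authorship

human_scores = [7, 6, 8, 7, 6, 7, 8, 6]  # fabricated illustrative ratings
ai_scores = [5, 6, 5, 6, 5, 5, 6, 6]     # fabricated illustrative ratings

# One-sample t-test: do mean ratings differ from the neutral test value?
t_h, p_h = stats.ttest_1samp(human_scores, NEUTRAL)
t_a, p_a = stats.ttest_1samp(ai_scores, NEUTRAL)

print(f"human-written: t={t_h:.2f}, p={p_h:.3f}")
print(f"AI-generated:  t={t_a:.2f}, p={p_a:.3f}")
```

In this toy example the AI-generated mean equals the neutral value exactly, so its scores are statistically indistinguishable from "unable to discern", mirroring the pattern the abstract reports (p = 0.999), while the human-written scores differ from the neutral value.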
REFERENCES (24)
CITATIONS (11)