LIVIVO - The Search Portal for Life Sciences

Article ; Online: Artificial intelligence large language model ChatGPT: is it a trustworthy and reliable source of information for sarcoma patients?

Valentini, Marisa / Szkandera, Joanna / Smolle, Maria / Scheipl, Susanne / Leithner, Andreas / Andreou, Dimosthenis

Frontiers in public health

2024  Volume 12, Page(s) 1303319

Abstract: Introduction: Since its introduction in November 2022, the artificial intelligence large language model ChatGPT has taken the world by storm. Among other applications, it can be used by patients as a source of information on diseases and their treatments. However, little is known about the quality of the sarcoma-related information ChatGPT provides. We therefore aimed to analyze how sarcoma experts evaluate the quality of ChatGPT's responses to sarcoma-related inquiries and to assess the bot's answers on specific evaluation metrics.
Methods: The ChatGPT responses to a sample of 25 sarcoma-related questions (5 definitions, 9 general questions, and 11 treatment-related inquiries) were evaluated by 3 independent sarcoma experts. Each response was compared with authoritative resources and international guidelines and graded on 5 different metrics using a 5-point Likert scale: completeness, misleadingness, accuracy, being up-to-date, and appropriateness. This resulted in a maximum of 25 and a minimum of 5 points per answer, with higher scores indicating a higher response quality. Scores ≥21 points were rated as very good and scores between 16 and 20 as good, while scores ≤15 points were classified as poor (11-15) or very poor (≤10).
Results: The median score that ChatGPT's answers achieved was 18.3 points (IQR, i.e., Inter-Quartile Range, 12.3-20.3 points). Six answers were classified as very good, 9 as good, while 5 answers each were rated as poor and very poor. The best scores were documented in the evaluation of how appropriate the response was for patients (median, 3.7 points; IQR, 2.5-4.2 points), which were significantly higher compared to the accuracy scores (median, 3.3 points; IQR, 2.0-4.2 points;
Discussion: The answers ChatGPT provided on a rare disease, such as sarcoma, were found to be of very inconsistent quality, with some answers being classified as very good and others as very poor. Sarcoma physicians should be aware of the risks of misinformation that ChatGPT poses and advise their patients accordingly.
MeSH term(s) Humans ; Artificial Intelligence ; Language ; Sarcoma ; Awareness ; Information Sources
Language English
Publishing date 2024-03-22
Publishing country Switzerland
Document type Journal Article
ZDB-ID 2711781-9
ISSN 2296-2565
ISSN (online) 2296-2565
DOI 10.3389/fpubh.2024.1303319
Database MEDical Literature Analysis and Retrieval System OnLINE
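
The grading scheme described in the Methods part of the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names `total_score` and `classify` and the metric keys are hypothetical, and the sketch assumes that every metric (including misleadingness) is oriented so that a higher rating means higher quality, as the abstract's "higher scores indicating a higher response quality" suggests. It handles only one rater's integer totals; how fractional scores averaged over the three experts were assigned to categories is not stated in the abstract.

```python
from typing import Dict

# The five metrics named in the abstract (key spellings are illustrative).
METRICS = (
    "completeness",
    "misleadingness",
    "accuracy",
    "up_to_date",
    "appropriateness",
)


def total_score(ratings: Dict[str, int]) -> int:
    """Sum one rater's five Likert ratings (1-5 each) into a 5-25 total."""
    missing = set(METRICS) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    for metric in METRICS:
        if not 1 <= ratings[metric] <= 5:
            raise ValueError(f"{metric} must be between 1 and 5")
    return sum(ratings[metric] for metric in METRICS)


def classify(total: int) -> str:
    """Map an integer total to the quality categories given in the abstract."""
    if not 5 <= total <= 25:
        raise ValueError("total must be between 5 and 25")
    if total >= 21:
        return "very good"
    if total >= 16:
        return "good"
    if total >= 11:
        return "poor"
    return "very poor"


if __name__ == "__main__":
    # Made-up ratings for illustration only; not data from the study.
    example = {
        "completeness": 4,
        "misleadingness": 3,
        "accuracy": 3,
        "up_to_date": 4,
        "appropriateness": 4,
    }
    score = total_score(example)
    print(score, classify(score))  # prints "18 good"
```

The thresholds are checked from the top down, so each category needs only its lower bound.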
