Assessing the Performance of Chat Generative Pretrained Transformer (ChatGPT) in Answering Andrology-Related Questions


Caglar U., Yildiz O., Fırat Ozervarli M., Aydin R., Sarilar O., Ozgor F., et al.

Urology Research and Practice, vol. 49, no. 6, pp. 365-369, 2023 (Scopus)

  • Publication Type: Article / Full Article
  • Volume: 49, Issue: 6
  • Publication Date: 2023
  • DOI: 10.5152/tud.2023.23171
  • Journal Name: Urology Research and Practice
  • Indexed In: Scopus
  • Pages: pp. 365-369
  • Keywords: Andrology, artificial intelligence, information sources
  • Istanbul University Affiliated: Yes

Abstract

Objective: The internet and social media have become primary sources of health information, with men frequently turning to these platforms before seeking professional help. Chat Generative Pretrained Transformer (ChatGPT), an artificial intelligence model developed by OpenAI, has gained popularity as a natural language processing program. The present study evaluated the accuracy and reproducibility of ChatGPT's responses to andrology-related questions.

Methods: The study compiled frequently asked andrology questions from health forums, hospital websites, and social media platforms such as YouTube and Instagram. Questions were grouped into topics including male hypogonadism and erectile dysfunction, and questions based on European Association of Urology (EAU) guideline recommendations were also included. These questions were input into ChatGPT, and the responses were evaluated by 3 experienced urologists, who scored them on a scale of 1 to 4.

Results: Of the 136 questions evaluated, 108 met the inclusion criteria. Of these, 87.9% received correct and adequate answers, 9.3% were correct but insufficient, and 3 responses contained both correct and incorrect information. No question was answered completely incorrectly. The highest correct-answer rates were for disorders of ejaculation, penile curvature, and male hypogonadism. The EAU guideline-based questions achieved a correctness rate of 86.3%. The reproducibility of the answers was over 90%.

Conclusion: The study found that ChatGPT provided accurate and reliable answers to over 80% of andrology-related questions. While limitations exist, such as potentially outdated training data and an inability to address emotional aspects, ChatGPT's potential in the healthcare sector is promising. Collaborating with healthcare professionals during artificial intelligence model development could enhance its reliability.