- Genel Tıp Dergisi
- Volume: 35, Issue: 4
Cross-Linguistic Evaluation of Artificial Intelligence Chatbots: Performance of ChatGPT-3.5, Copilot and Gemini in Neuro-ophthalmologic Evaluation in English and Turkish
Authors: Eyüpcan Şensoy, Mehmet Çıtırık
Pages: 597-604
DOI: 10.54005/geneltip.1627508
Publication Date: 2025-08-29
Article Type: Research Paper
Abstract:
Background/Aims: To evaluate the performance of the artificial intelligence chatbots ChatGPT-3.5, Copilot, and Gemini on the same neuro-ophthalmology questions posed in English and in Turkish.
Methods: Forty neuro-ophthalmology questions were included in the study. After all English questions were translated into Turkish by a certified native speaker, both versions were put to ChatGPT-3.5, Copilot, and Gemini. The answers were compared with the answer key and classified as correct or incorrect. The chatbots' performance was then compared statistically.
Results: On the English questions, ChatGPT-3.5 answered 47.5% correctly, Copilot 57.5%, and Gemini 32.5%. On the Turkish questions, ChatGPT-3.5 answered 57.5% correctly, Copilot 52.5%, and Gemini 32.5%. Although success rates differed, no statistically significant difference was detected between the chatbots in answering the same questions in English and Turkish (p>0.05).
Conclusions: Although no statistically significant difference was found, chatbots can answer the same questions differently. Beyond improving the chatbots' knowledge level, their language skills also need improvement.
Keywords: ChatGPT-3.5, Copilot, Gemini, English, Neuro-ophthalmology, Turkish, Artificial intelligence applications
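The reported percentages can be read back as raw counts, since every chatbot answered the same 40 questions. The short Python sketch below does only that arithmetic. The variable names are illustrative, and it does not reproduce the study's statistical comparison, because the abstract does not name the test that was used.

```python
# Convert the accuracy percentages reported in the abstract into
# counts of correct answers out of the 40 questions.
N_QUESTIONS = 40

# Percent correct as reported in the abstract.
reported = {
    "ChatGPT-3.5": {"English": 47.5, "Turkish": 57.5},
    "Copilot": {"English": 57.5, "Turkish": 52.5},
    "Gemini": {"English": 32.5, "Turkish": 32.5},
}

# Number of correctly answered questions per chatbot and language.
counts = {
    bot: {lang: round(pct * N_QUESTIONS / 100) for lang, pct in by_lang.items()}
    for bot, by_lang in reported.items()
}

for bot, by_lang in counts.items():
    change = by_lang["Turkish"] - by_lang["English"]
    print(f"{bot}: English {by_lang['English']}/{N_QUESTIONS}, "
          f"Turkish {by_lang['Turkish']}/{N_QUESTIONS}, change {change:+d}")
```

Read this way, the gap between the two languages is at most four questions for any one chatbot.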
