- Aksaray Üniversitesi Tıp Bilimleri Dergisi
- Volume: 6 Issue: 1
Gemini 2.5 Pro and ChatGPT-5 on the ATLS Exam: Accuracy, Consistency, and Comparison with Physicians
Authors : Kamil Kokulu, Ekrem Taha Sert, Hüseyin Mutlu, Emin Hüseyin Akar, Muhammed Ali Topuz, Mustafa Önder Gönen, Oğuz Yürük
Pages : 12-17
Publication Date : 2026-01-19
Article Type : Research Paper
Abstract : Objective: This study aimed to evaluate the accuracy and consistency of two current large language models (LLMs), Gemini 2.5 Pro and ChatGPT-5, on the Advanced Trauma Life Support (ATLS) exam. It also aimed to compare these two artificial intelligence (AI) models with emergency medicine residents and to examine their performance on different question types. Materials and Methods: This observational study used the 2023 ATLS exam, consisting of 40 multiple-choice questions. Questions were categorized as either knowledge-based (directly testing basic knowledge) or scenario-based. Each question was administered six times to Gemini 2.5 Pro and ChatGPT-5 to measure response consistency, and once to each of six emergency medicine residents. The accuracy rates for all examinees were calculated and compared. Results: On the ATLS exam, Gemini 2.5 Pro achieved an overall accuracy rate of 95.8%, ChatGPT-5 achieved 92.9%, and residents achieved 67.1%. The AI models performed significantly better than residents (p < 0.001). No significant difference was found between the exam performances of Gemini and ChatGPT (p = 0.17). Both models showed lower accuracy on scenario-based questions than on knowledge-based questions. The AI models' response consistency across repeated exams was moderate. Conclusion: Both Gemini 2.5 Pro and ChatGPT-5 passed the ATLS exam, with higher success rates than residents and consistent performance across repeated attempts. These findings demonstrate the significant potential of LLMs as a tool for assisting in trauma education, providing rapid access to information, and potentially supporting clinical decision-making.
Keywords : Advanced Trauma Life Support, artificial intelligence, ChatGPT, Gemini, large language model
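The reported accuracy rates and the Gemini versus ChatGPT comparison can be illustrated with a short sketch. It rests on assumptions that are not stated in the abstract: that each rate is pooled over 40 questions × 6 attempts = 240 responses, that the correct-answer counts (230, 223, 161) are the ones implied by the rounded percentages, and that a Pearson chi-square test without continuity correction is used. The abstract does not name the statistical test, so this is an illustration, not the authors' method.

```python
import math

RESPONSES = 40 * 6  # 40 questions, six attempts each (from the abstract)

# ASSUMPTION: correct-answer counts back-calculated from the reported
# percentages; the abstract does not give raw counts.
correct = {"Gemini 2.5 Pro": 230, "ChatGPT-5": 223, "Residents": 161}


def accuracy(n_correct, n_total=RESPONSES):
    """Accuracy as a percentage of all responses."""
    return 100.0 * n_correct / n_total


def chi_square_2x2(a_correct, b_correct, n=RESPONSES):
    """Pearson chi-square (no continuity correction) for two groups of n responses."""
    a_wrong, b_wrong = n - a_correct, n - b_correct
    total = 2 * n
    num = total * (a_correct * b_wrong - a_wrong * b_correct) ** 2
    den = n * n * (a_correct + b_correct) * (a_wrong + b_wrong)
    chi2 = num / den
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


rates = {name: round(accuracy(c), 1) for name, c in correct.items()}
chi2, p_value = chi_square_2x2(correct["Gemini 2.5 Pro"], correct["ChatGPT-5"])

print(rates)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

Under these assumptions the rates reproduce the reported 95.8%, 92.9%, and 67.1%, and the p-value for Gemini versus ChatGPT rounds to 0.17, which is consistent with the abstract but does not confirm which test the authors applied.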
