Gemini 2.5 Pro and ChatGPT-5 on the ATLS Exam: Accuracy, Consistency, and Comparison with Physicians


Aksaray Üniversitesi Tıp Bilimleri Dergisi
Volume: 6 Issue: 1

Authors : Kamil Kokulu, Ekrem Taha Sert, Hüseyin Mutlu, Emin Hüseyin Akar, Muhammed Ali Topuz, Mustafa Önder Gönen, Oğuz Yürük

Pages : 12-17

Publication Date : 2026-01-19

Article Type : Research Paper

Abstract :

Objective: This study aimed to evaluate the accuracy and consistency of two current large language models (LLMs), Gemini 2.5 Pro and ChatGPT-5, on the Advanced Trauma Life Support (ATLS) exam. It also aimed to compare these two artificial intelligence (AI) models with emergency medicine residents and to examine their performance on different question types.

Materials and Methods: This observational study used the 2023 ATLS exam, which consists of 40 multiple-choice questions. Questions were categorized as either knowledge-based (testing basic knowledge directly) or scenario-based. To measure response consistency, each question was administered six times to Gemini 2.5 Pro and to ChatGPT-5. Six emergency medicine residents each answered every question once. Accuracy rates were calculated for all examinees and compared.

Results: On the ATLS exam, Gemini 2.5 Pro achieved an overall accuracy of 95.8%, ChatGPT-5 achieved 92.9%, and the residents achieved 67.1%. Both AI models performed significantly better than the residents (p < 0.001). No significant difference was found between Gemini and ChatGPT (p = 0.17). Both models were less accurate on scenario-based questions than on knowledge-based questions. The AI models' response consistency across repeated exams was moderate.

Conclusion: Both Gemini 2.5 Pro and ChatGPT-5 passed the ATLS exam with higher success rates than the residents and performed consistently. These findings demonstrate the significant potential of LLMs as tools for supporting trauma education, providing rapid access to information, and potentially contributing to clinical decision support.
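The reported accuracy figures can be reproduced with a little arithmetic. The sketch below is illustrative only: it assumes each group produced 240 responses (40 questions × 6 administrations or 6 residents), and the correct-answer counts (230, 223, 161) are back-calculated from the published percentages, not taken from the paper. The abstract does not name the statistical test used, so the pooled two-proportion z-test here is an assumption chosen for illustration.

```python
import math

# Assumption: 40 questions x 6 administrations (or 6 residents) = 240 responses per group.
TOTAL_RESPONSES = 40 * 6

# Assumption: counts back-calculated from the reported percentages, not taken from the paper.
CORRECT = {"Gemini 2.5 Pro": 230, "ChatGPT-5": 223, "Residents": 161}


def accuracy_percent(correct: int, total: int) -> float:
    """Return accuracy as a percentage rounded to one decimal place."""
    return round(100.0 * correct / total, 1)


def two_proportion_z_test(c1: int, n1: int, c2: int, n2: int) -> tuple[float, float]:
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = c1 / n1, c2 / n2
    pooled = (c1 + c2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    for name, correct in CORRECT.items():
        print(f"{name}: {accuracy_percent(correct, TOTAL_RESPONSES)}%")

    z, p = two_proportion_z_test(
        CORRECT["Gemini 2.5 Pro"], TOTAL_RESPONSES,
        CORRECT["ChatGPT-5"], TOTAL_RESPONSES,
    )
    print(f"Gemini vs ChatGPT: z = {z:.2f}, p = {p:.2f}")
```

Under these assumptions the percentages round to the reported values, and the Gemini versus ChatGPT comparison yields a p-value close to the reported 0.17. This shows that the numbers are internally consistent, not that the authors used this test.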
Keywords : Advanced Trauma Life Support, artificial intelligence, ChatGPT, Gemini, large language model
