- Kuramsal Eğitimbilim Dergisi
- Volume: 18, Issue: 4
Can AI Assess Writing Skills Like a Human? A Reliability Analysis
Authors: Hüseyin Ataseven, Ömay Çokluk Bökeoğlu, Fazilet Taşdemir
Pages: 757-775
DOI: 10.30831/akukeg.1718511
Publication Date: 2025-10-28
Article Type: Research Paper
Abstract: This study investigates the reliability and consistency of a custom GPT-based scoring system in comparison with trained human raters, focusing on B1-level opinion paragraphs written by English preparatory students. Addressing the limited evidence on how AI scoring systems align with human evaluations in foreign language contexts, the study offers insights into both the strengths and the limitations of automated writing assessment. A total of 175 student writings were evaluated twice by human raters and twice by the AI system using an analytic rubric. Findings indicate excellent agreement among human raters and high consistency across AI-generated scores, but only moderate alignment between human and AI evaluations, with the AI tending to assign higher scores and to overlook off-topic content. These results suggest that while AI scoring systems offer efficiency and consistency, they still lack the interpretive depth of human judgment. The study highlights the potential of AI as a complementary tool in writing assessment, with practical implications for language testing policy and classroom pedagogy.

Keywords: AI-based scoring automation, writing assessment, inter-rater consistency
