Leveraging SHAP for Interpretable Diabetes Prediction: A Study of Machine Learning Models on the Pima Indians Diabetes Dataset


Balkan Journal of Electrical and Computer Engineering
Volume: 13 Issue: 2

Authors : İsmail Kırbaş, Ahmet Çifci

Pages : 128-139

Doi:10.17694/bajece.1577929

Publication Date : 2025-06-30

Article Type : Research Paper

Abstract: This paper investigates machine learning (ML) models for predicting diabetes on the Pima Indians Diabetes Database, with a focus on improving model interpretability through SHapley Additive exPlanations (SHAP). The study evaluates eight ML models: Adaptive Boosting (AdaBoost), k-Nearest Neighbors (k-NN), Logistic Regression (LR), Multi-layer Perceptron (MLP), Naive Bayes (NB), Random Forest (RF), Support Vector Machine (SVM), and eXtreme Gradient Boosting (XGBoost), each assessed with both a train/test split and 10-fold cross-validation. The RF model performed best, achieving an accuracy of 82% and an F1-score of 0.83 on the train/test split, and an accuracy of 83% and an F1-score of 0.84 under 10-fold cross-validation. SHAP analysis identified glucose, BMI, pregnancies, and insulin levels as the most influential predictors, in line with established clinical markers. In addition, the use of the Synthetic Minority Over-sampling TEchnique (SMOTE) for class balancing, together with data scaling, contributes to robust model performance. The study emphasizes the need for interpretable ML in healthcare and proposes SHAP as a valuable tool for bridging predictive accuracy and clinical transparency in diabetes diagnostics.
Keywords : Diabetes Prediction, Explainable Artificial Intelligence, Machine Learning Models, Model Interpretability, SHapley Additive exPlanation
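
To make the attribution idea in the abstract concrete, the following is a minimal sketch of how Shapley values split one prediction across features. It is not the authors' code and does not use the SHAP library: the logistic model, its coefficients, the background values, and the example patient are all hypothetical, chosen only for illustration. It also replaces "absent" features with a single reference point, which is a simplification of the background-dataset averaging that SHAP-style explainers typically perform.

```python
from itertools import combinations
from math import exp, factorial

# Hypothetical toy model: coefficients are illustrative, NOT taken from the paper.
FEATURES = ["glucose", "bmi", "pregnancies", "insulin"]
WEIGHTS = {"glucose": 0.035, "bmi": 0.08, "pregnancies": 0.12, "insulin": 0.001}
INTERCEPT = -8.0

# Hypothetical reference values used whenever a feature is treated as "absent".
BACKGROUND = {"glucose": 120.0, "bmi": 32.0, "pregnancies": 3.0, "insulin": 80.0}


def predict(x):
    """Toy logistic model returning the probability of the positive class."""
    z = INTERCEPT + sum(WEIGHTS[f] * x[f] for f in FEATURES)
    return 1.0 / (1.0 + exp(-z))


def value(subset, x):
    """Model output when only the features in `subset` take the patient's
    values and all other features are held at the background values."""
    mixed = {f: (x[f] if f in subset else BACKGROUND[f]) for f in FEATURES}
    return predict(mixed)


def shapley_values(x):
    """Exact Shapley values, enumerating every coalition of the other features."""
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = set(coalition)
                # Marginal contribution of f when added to coalition s.
                total += weight * (value(s | {f}, x) - value(s, x))
        phi[f] = total
    return phi


# Hypothetical patient record, for illustration only.
patient = {"glucose": 165.0, "bmi": 36.5, "pregnancies": 5.0, "insulin": 130.0}
phi = shapley_values(patient)
base = predict(BACKGROUND)

# Print features ranked by absolute contribution, then the additivity check.
for f in sorted(phi, key=lambda name: -abs(phi[name])):
    print(f"{f:12s} {phi[f]:+.4f}")
print(f"base {base:.4f} + contributions {sum(phi.values()):.4f} "
      f"= prediction {predict(patient):.4f}")
```

The last line illustrates the additive property that gives the method its name: the baseline output plus the per-feature contributions reproduces the model's prediction for that patient. Exact enumeration grows exponentially with the number of features, which is why practical tools rely on model-specific or sampling-based approximations.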
