- Dicle Üniversitesi Mühendislik Fakültesi Dergisi
- Volume: 16 Issue: 1
Bankruptcy Prediction with Optuna-Enhanced Ensemble Machine Learning Methods: A Comparison of Oversampling and Undersampling Techniques
Authors : Vahid Sinap
Pages : 97-113
DOI : 10.24012/dumf.1597564
Publication Date : 2025-03-26
Article Type : Research Paper
Abstract : Bankruptcy prediction is an essential task in financial risk management, often hindered by challenges such as class imbalance, feature selection, and overfitting. This study investigates the comparative effectiveness of data balancing techniques, specifically focusing on oversampling with SMOTE (Synthetic Minority Over-sampling Technique) and undersampling with Tomek Links, in addressing class imbalance in bankruptcy datasets. A range of machine learning models, including ensemble and boosting algorithms such as Stacking Classifier and XGBoost, were applied to imbalanced, SMOTE-balanced, and Tomek Links-balanced datasets. Dimensionality reduction was performed using Principal Component Analysis (PCA) to enhance computational efficiency and reduce overfitting risks, while hyperparameter optimization was conducted using the Optuna framework to maximize model performance. The findings demonstrate that SMOTE significantly improved classification accuracy and F1 scores, particularly for ensemble-based models, by generating synthetic samples to balance the dataset. In contrast, Tomek Links often reduced model performance due to the removal of potentially informative data points. Among the models tested, the Stacking Classifier performed best on SMOTE-balanced data, achieving a prediction accuracy of 99%. These results support integrating advanced predictive tools into financial decision-making. The Stacking Classifier’s strong performance on SMOTE-balanced data enhances risk management systems, enabling proactive bankruptcy detection.
Keywords : Bankruptcy prediction, data balancing techniques, SMOTE, ensemble machine learning, Optuna
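The two balancing ideas compared in the abstract can be illustrated with a minimal standard-library sketch. This is not the study's code: the function names `smote` and `tomek_links`, the parameter `k`, and the toy two-dimensional points are all illustrative assumptions. SMOTE creates a synthetic minority point by interpolating between a minority sample and one of its nearest minority neighbors; a Tomek link is a pair of mutual nearest neighbors with different class labels, and undersampling typically drops the majority member of each pair.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority points by interpolation (simplified SMOTE)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbors of the base point, excluding the point itself
        neighbors = sorted(
            (p for j, p in enumerate(minority) if j != i),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


def tomek_links(X, y):
    """Return index pairs (i, j), i < j, that are mutual nearest neighbors with different labels."""
    def nearest(i):
        return min((j for j in range(len(X)) if j != i),
                   key=lambda j: math.dist(X[i], X[j]))

    nn = [nearest(i) for i in range(len(X))]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and y[i] != y[j]]


# Hypothetical toy data: label 0 = non-bankrupt (majority), label 1 = bankrupt (minority)
majority = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.3), (0.9, 0.9), (0.4, 0.1)]
minority = [(1.0, 1.0), (1.2, 1.1), (1.1, 1.3)]
X = majority + minority
y = [0] * len(majority) + [1] * len(minority)

# Oversampling: add synthetic minority points until both classes have equal size
synthetic = smote(minority, n_new=len(majority) - len(minority))
print("minority size after SMOTE:", len(minority) + len(synthetic))

# Undersampling: find Tomek links, then drop the majority member of each link
links = tomek_links(X, y)
drop = {i if y[i] == 0 else j for i, j in links}
X_tomek = [p for idx, p in enumerate(X) if idx not in drop]
print("Tomek links:", links, "| samples kept:", len(X_tomek))
```

In the toy data, the majority point (0.9, 0.9) sits next to the minority cluster, so removing it cleans the class boundary but also discards a real observation, which is the trade-off the abstract attributes to Tomek Links.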