The effect of text representation and model selection on classification performance: A comprehensive comparison of TF-IDF, Bow and Transformer-based methods on the Covid19-FNIR dataset

Niğde Ömer Halisdemir Üniversitesi Mühendislik Bilimleri Dergisi
Volume: 14 Issue: 4

Authors : Muhammet Sinan Başarslan, Fatih Bal

Pages : 1447-1461

DOI : 10.28948/ngumuh.1694988

Publication Date : 2025-10-15

Article Type : Research Paper

Abstract : This study evaluates the performance of various machine learning (ML) models on the Covid19-FNIR dataset, split into 80% training and 20% testing, using two text vectorization methods: Term Frequency-Inverse Document Frequency (TF-IDF) and Bag of Words (BoW). Transformer-based models such as DistilBERT, RoBERTa, and ALBERT were integrated with classical ML algorithms and with ensemble methods such as Stacking, Hard Voting, and Soft Voting. Stacking achieved the highest performance with both representations: 92.62% Accuracy (Acc) and 92.51% F1-score (F1) with TF-IDF, and 92.29% Acc and 92.41% F1 with BoW. Hard Voting with BoW yielded the highest Recall (95.23%). Classical models such as Logistic Regression (LR) and Support Vector Machine (SVM) performed better with BoW, reaching 90.98% and 90.51% Acc, respectively. Overall, TF-IDF produced balanced results, while BoW offered higher Recall and Precision in specific cases. These results show that both the choice of model and the choice of text representation matter for classification performance.
Keywords : Fake news, ML, Text representation, Pre-trained
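
The comparison described in the abstract (two text representations, an 80/20 split, and Stacking, Hard Voting, and Soft Voting ensembles) can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy corpus is invented and is not the Covid19-FNIR data, the base learners (Logistic Regression, Multinomial Naive Bayes, Random Forest) and all hyperparameters are placeholders chosen so the example runs quickly, and the Transformer-based models are left out.

```python
from sklearn.ensemble import RandomForestClassifier, StackingClassifier, VotingClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy corpus (NOT the Covid19-FNIR dataset): 1 = fake, 0 = real.
fake = [
    "miracle herb cures the virus overnight",
    "drinking bleach kills the virus instantly",
    "secret cure hidden by doctors revealed",
    "garlic water cures infection in one day",
    "5g towers spread the virus claims post",
    "vaccine contains tracking microchips says blog",
    "hot baths kill the virus instantly",
    "miracle tea cures all infections overnight",
    "doctors hide secret herb cure",
    "microchips hidden in masks claims post",
]
real = [
    "health ministry reports new daily cases",
    "vaccine trial results published in journal",
    "hospital capacity increases after funding",
    "officials recommend masks in crowded places",
    "study reports vaccine reduces severe illness",
    "ministry publishes weekly case statistics",
    "researchers publish trial results for treatment",
    "officials report hospital admissions falling",
    "health agency recommends hand washing",
    "journal publishes study on case statistics",
]
texts = fake + real
labels = [1] * len(fake) + [0] * len(real)

# 80% training / 20% testing, stratified so both classes appear in each part.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=42
)


def base_estimators():
    """Return fresh base learners (placeholders, all support predict_proba)."""
    return [
        ("lr", LogisticRegression(max_iter=1000)),
        ("nb", MultinomialNB()),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=42)),
    ]


def build_ensembles():
    """Return the three ensemble types named in the abstract."""
    return {
        "stacking": StackingClassifier(
            estimators=base_estimators(),
            final_estimator=LogisticRegression(max_iter=1000),
            cv=2,  # small fold count only because the toy corpus is tiny
        ),
        "hard_voting": VotingClassifier(estimators=base_estimators(), voting="hard"),
        "soft_voting": VotingClassifier(estimators=base_estimators(), voting="soft"),
    }


vectorizers = {"tfidf": TfidfVectorizer, "bow": CountVectorizer}

# Fit every (representation, ensemble) pair and record accuracy and F1.
results = {}
for vec_name, vec_cls in vectorizers.items():
    for ens_name, ensemble in build_ensembles().items():
        model = make_pipeline(vec_cls(), ensemble)
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        results[(vec_name, ens_name)] = (
            accuracy_score(y_test, pred),
            f1_score(y_test, pred, zero_division=0),
        )

for (vec_name, ens_name), (acc, f1) in sorted(results.items()):
    print(f"{vec_name:5s} {ens_name:11s} acc={acc:.2f} f1={f1:.2f}")
```

Scores from a corpus this small say nothing about the reported figures; the sketch only shows how each representation is paired with each ensemble behind a single pipeline, so that the vectorizer is fitted on the training split alone.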
