- Balkan Journal of Electrical and Computer Engineering
- Volume: 13 Issue: 2
Phishing E-mail Detection with Machine Learning and Deep Learning: Improving Classification Performance with Proposed New Features
Authors : Hadjer Brioua, Havvanur Siyambaş, Durmuş Özkan Şahin
Pages : 183-193
DOI : 10.17694/bajece.1490596
Publication Date : 2025-06-30
Article Type : Research Paper
Abstract : With the growing use of the internet, email users have become potential targets for fraudsters. These malicious groups send fake or misleading emails to steal sensitive information such as identity details, bank details, and social media credentials. This tactic is known as phishing. This study proposes a machine learning-based system for detecting phishing attacks using the SeFACED dataset, which was adjusted for binary classification with 12,498 normal and 5,142 fraudulent emails. Python was used for programming, with Google Colab and Jupyter Notebook as development platforms. The email data were collected, cleaned, and stemmed. Three feature extraction techniques were used: Bag of Words, TF-IDF, and Word2Vec. Six algorithms were employed for classification: Logistic Regression, Random Forest, Support Vector Machines, Naive Bayes, Convolutional Neural Network, and Long Short-Term Memory. Performance was evaluated using accuracy, precision, recall, and F1-score. New attributes proposed to enhance detection included CSS tags, HTML tags, blacklisted words, link errors, and grammar and spelling errors. The addition of these features generally improved classification results.
Keywords : Phishing, Phishing e-mail, Phishing attacks, Machine learning, Deep learning, Classification, Phishing e-mail classification
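To make the proposed attributes more concrete, the following is a minimal Python sketch of how four of them (HTML tags, CSS usage, blacklisted words, and link errors) might be counted for a single email body. It is not the authors' implementation: the abstract does not define the features precisely, so the `BLACKLIST` contents, the regular expressions, the `extract_features` name, and the reading of "link error" as a visible URL whose host differs from the `href` host are all assumptions made for illustration. The grammar and spelling feature is omitted because it would need an external checker.

```python
import re
from urllib.parse import urlparse

# Hypothetical blacklist: the abstract does not list the words the authors used.
BLACKLIST = {"verify", "urgent", "password", "suspended", "winner"}

# Opening tags only, e.g. <p ...> or <a ...>; closing tags are not counted.
TAG_RE = re.compile(r"<\s*([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>")
# Inline CSS, e.g. style="color:red".
STYLE_ATTR_RE = re.compile(r"\bstyle\s*=", re.IGNORECASE)
# Anchors with a quoted href; captures (href, visible label).
ANCHOR_RE = re.compile(
    r'<a\b[^>]*\bhref\s*=\s*["\']([^"\']+)["\'][^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)
WORD_RE = re.compile(r"[a-z']+")


def _host(url):
    """Return the lowercase host of a URL, tolerating a missing scheme."""
    if "://" not in url:
        url = "http://" + url
    return (urlparse(url).hostname or "").lower()


def extract_features(email_body):
    """Count illustrative phishing indicators in one raw email body."""
    tags = [t.lower() for t in TAG_RE.findall(email_body)]

    # CSS usage: <style> blocks plus inline style attributes (assumed definition).
    css_count = tags.count("style") + len(STYLE_ATTR_RE.findall(email_body))

    # Blacklisted words are counted on the visible text, with markup stripped.
    text = re.sub(r"<[^>]+>", " ", email_body).lower()
    blacklist_hits = sum(1 for w in WORD_RE.findall(text) if w in BLACKLIST)

    # Assumed "link error": the label looks like a URL but points elsewhere.
    link_mismatches = 0
    for href, label in ANCHOR_RE.findall(email_body):
        label = re.sub(r"<[^>]+>", "", label).strip()
        looks_like_url = re.match(r"(https?://|www\.)", label, re.IGNORECASE)
        if looks_like_url and _host(label) != _host(href):
            link_mismatches += 1

    return {
        "html_tag_count": len(tags),
        "css_count": css_count,
        "blacklist_hits": blacklist_hits,
        "link_mismatches": link_mismatches,
    }
```

Under these assumptions, the resulting counts could be appended as extra columns to a Bag of Words or TF-IDF vector before the email is passed to any of the six classifiers.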
