Optimizing Malicious Website Detection with the XGBoost Machine Learning Approach

Fazal Malik; Muhammad Suliman; Muhammad Qasim Khan; Noor Rahman; Khairullah Khan; Muhammad Khan

Authors

Fazal Malik, Department of Computer Science, Iqra National University Peshawar, Khyber Pakhtunkhwa (KPK), Pakistan.
Muhammad Suliman, Department of Computer Science, Iqra National University Peshawar, Khyber Pakhtunkhwa (KPK), Pakistan.
Muhammad Qasim Khan, Department of Computer Science, Iqra National University Peshawar, Khyber Pakhtunkhwa (KPK), Pakistan.
Noor Rahman, Department of Computer Science and Engineering, AL-Fayha College, 6480 Al Fayha, Al Jubayl 31961, Saudi Arabia.
Khairullah Khan, Department of Computer Science, University of Science & Technology, Bannu, Khyber Pakhtunkhwa (KPK), Pakistan.
Muhammad Khan, Department of Computer Science, Iqra National University Peshawar, Khyber Pakhtunkhwa (KPK), Pakistan.

Keywords:

Malicious Websites, Cyber Security, Types of Malicious Entities, XGBoost Algorithm, Prediction

Abstract

The rising threat of malicious websites demands advanced detection methods for robust cybersecurity. Traditional approaches, such as rule-based systems and machine learning models like Random Forest and Support Vector Machine (SVM), often struggle to balance precision and recall. This research applies the XGBoost algorithm to the detection of malicious URLs. The study follows a four-step approach: (1) Dataset Acquisition: using the "Malicious Website URLs" dataset from Kaggle; (2) Data Preprocessing: data cleaning, feature selection, and transformation to prepare the data for model training; (3) Model Implementation: training XGBoost, an ensemble learning algorithm known for its strong performance, on the preprocessed dataset; and (4) Model Evaluation: assessing performance with accuracy, precision, recall, and F1-score. The results show that XGBoost achieves 88.89% precision and 86.6% accuracy, outperforming conventional methods and offering a balanced trade-off between precision and recall. The findings demonstrate XGBoost's effectiveness in reducing both false positives and false negatives, which lowers the need for human intervention and makes the model a valuable addition to existing cybersecurity frameworks. The study underscores the role of careful feature selection and model optimization in strengthening defenses against evolving cyber threats.
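The evaluation metrics named in step (4) can be illustrated with a minimal Python sketch that derives accuracy, precision, recall, and F1-score from confusion-matrix counts. This is not the authors' code: the names `ConfusionCounts`, `confusion_counts`, and `evaluate` are hypothetical, the label convention (1 = malicious, 0 = benign) is an assumption, and the labels in the usage example are made-up toy data, not results from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary classifier (hypothetical helper type)."""
    tp: int  # malicious URLs correctly flagged
    fp: int  # benign URLs wrongly flagged
    tn: int  # benign URLs correctly passed
    fn: int  # malicious URLs missed


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int],
                     positive: int = 1) -> ConfusionCounts:
    """Tally TP/FP/TN/FN, assuming `positive` marks the malicious class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _safe_div(num: float, den: float) -> float:
    """Return 0.0 instead of raising when a denominator is zero."""
    return num / den if den else 0.0


def evaluate(c: ConfusionCounts) -> Dict[str, float]:
    """Compute the four metrics listed in the abstract."""
    precision = _safe_div(c.tp, c.tp + c.fp)
    recall = _safe_div(c.tp, c.tp + c.fn)
    accuracy = _safe_div(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    f1 = _safe_div(2 * precision * recall, precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy labels for illustration only (1 = malicious, 0 = benign).
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(evaluate(confusion_counts(y_true, y_pred)))
```

Precision penalizes false positives (benign sites flagged as malicious) and recall penalizes false negatives (malicious sites missed), which is why the abstract treats the balance between the two as the central trade-off.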
