Optimizing Malicious Website Detection with the XGBoost Machine Learning Approach

Authors

  • Fazal Malik, Department of Computer Science, Iqra National University Peshawar, Khyber Pakhtunkhwa (KPK), Pakistan.
  • Muhammad Suliman, Department of Computer Science, Iqra National University Peshawar, Khyber Pakhtunkhwa (KPK), Pakistan.
  • Muhammad Qasim Khan, Department of Computer Science, Iqra National University Peshawar, Khyber Pakhtunkhwa (KPK), Pakistan.
  • Noor Rahman, Department of Computer Science and Engineering, Al-Fayha College, 6480 Al Fayha, Al Jubayl 31961, Saudi Arabia.
  • Khairullah Khan, Department of Computer Science, University of Science & Technology, Bannu, Khyber Pakhtunkhwa (KPK), Pakistan.
  • Muhammad Khan, Department of Computer Science, Iqra National University Peshawar, Khyber Pakhtunkhwa (KPK), Pakistan.

Keywords

Malicious Websites, Cyber Security, Types of Malicious Entities, XGBoost Algorithm, Prediction

Abstract

Malicious websites are a growing threat, and detecting them reliably is central to cybersecurity. Traditional approaches, such as rule-based systems and machine learning models like Random Forest and Support Vector Machine (SVM), often struggle to balance precision and recall. This research applies the XGBoost algorithm to the detection of malicious URLs. The study follows a four-step approach: (1) Dataset Acquisition: using the "Malicious Website URLs" dataset from Kaggle; (2) Data Preprocessing: data cleaning, feature selection, and transformation to prepare the data for model training; (3) Model Implementation: training XGBoost, a gradient-boosted decision tree ensemble, on the preprocessed dataset; and (4) Model Evaluation: assessing performance with accuracy, precision, recall, and F1-score. The results show that XGBoost achieves 88.89% precision and 86.6% accuracy, outperforming conventional methods and offering a balanced trade-off between precision and recall. The findings demonstrate XGBoost's effectiveness in limiting both false positives and false negatives, making it a valuable addition to existing cybersecurity frameworks. The study also underscores the role of careful feature selection and model optimization in reducing human intervention and strengthening defenses against evolving cyber threats.
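
The evaluation step in the abstract relies on four standard metrics. As a minimal sketch of how they are computed from a binary classifier's predictions, the Python below derives them from the confusion-matrix counts. The function name `evaluate`, the `BinaryMetrics` container, and the toy labels are assumptions made for illustration; they are not the authors' code, and the numbers are not the paper's results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryMetrics:
    """Container for the four metrics named in the abstract."""
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels.

    `positive` is the label treated as "malicious" (an assumption here).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    # Guard every ratio against a zero denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return BinaryMetrics(accuracy, precision, recall, f1)


# Toy labels for illustration only: 1 = malicious, 0 = benign.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

metrics = evaluate(y_true, y_pred)
print(metrics)
```

Precision penalizes false positives (benign sites flagged as malicious), while recall penalizes false negatives (malicious sites missed). Reporting both alongside F1 is what allows the trade-off between them to be judged.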

Published

2024-09-01

How to Cite

Fazal Malik, Muhammad Suliman, Muhammad Qasim Khan, Noor Rahman, Khairullah Khan, & Muhammad Khan. (2024). Optimizing Malicious Website Detection with the XGBoost Machine Learning Approach. Journal of Computing & Biomedical Informatics, 7(02). Retrieved from https://jcbi.org/index.php/Main/article/view/722