Optimizing Malicious Website Detection with the XGBoost Machine Learning Approach
Keywords:
Malicious Websites, Cyber Security, Types of Malicious Entities, XGBoost Algorithm, Prediction
Abstract
The rising threat of malicious websites demands advanced detection methods for robust cybersecurity. Traditional approaches, such as rule-based systems and machine learning models like Random Forest and Support Vector Machine (SVM), often struggle to balance precision and recall. This research introduces a methodology that uses the XGBoost algorithm to detect malicious URLs. The study follows a four-step approach: (1) dataset acquisition, using the "Malicious Website URLs" dataset from Kaggle; (2) data preprocessing, including data cleaning, feature selection, and transformation to prepare the data for model training; (3) model implementation, training XGBoost, an ensemble learning algorithm known for its strong performance, on the preprocessed dataset; and (4) model evaluation, assessing performance through accuracy, precision, recall, and F1-score. The results show that XGBoost achieves 88.89% precision and 86.6% accuracy, outperforming conventional methods and offering a balanced trade-off between precision and recall. The findings highlight the importance of careful feature selection and model optimization, which reduce the need for human intervention, and demonstrate XGBoost's effectiveness in limiting false positives and false negatives, making it a valuable addition to existing cybersecurity frameworks against evolving cyber threats.
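The evaluation step relies on four metrics that all derive from the same confusion-matrix counts. The sketch below shows how they relate to one another. The function name `classification_metrics` and the counts passed to it are hypothetical, chosen only for illustration; they are not the study's data and do not reproduce its reported figures.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts.

    tp: malicious URLs correctly flagged    fp: benign URLs wrongly flagged
    fn: malicious URLs that were missed     tn: benign URLs correctly passed
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: of everything flagged as malicious, how much really was.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything truly malicious, how much was flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for a 100-URL test set (assumption, not from the study).
metrics = classification_metrics(tp=45, fp=5, fn=15, tn=35)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these illustrative counts, precision (0.90) is higher than recall (0.75), which is the kind of imbalance the precision-recall trade-off refers to: a detector can flag few benign sites while still missing a share of the malicious ones.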
License
This is an open access article published by the Research Center of Computing & Biomedical Informatics (RCBI), Lahore, Pakistan, under the CC BY 4.0 International License.