Boosting Early Diabetes Detection: An Ensemble Learning Approach with XGBoost and LightGBM

Authors

  • Faheem Mazhar, Department of Computer Science, NFC-IET, Multan, Pakistan.
  • Wasif Akbar, Department of Computer Science, NFC-IET, Multan, Pakistan.
  • Muhammad Sajid, Department of Computer Science, Air University Islamabad, Multan Campus, Multan 60000, Pakistan.
  • Naeem Aslam, Department of Computer Science, NFC-IET, Multan, Pakistan.
  • Muhammad Imran, Department of Computer Science, NFC-IET, Multan, Pakistan.
  • Haroon Ahmad, Department of Computer Science, Air University Islamabad, Multan Campus, Multan 60000, Pakistan.

Keywords:

Machine Learning, Prediction, Diabetes, XGBoost, Classification, Random Forest (RF)

Abstract

Given the rising prevalence of diabetes, early identification and prognosis of the condition are essential to avoiding long-term health consequences. Diabetes is a chronic illness that may contribute to the global health crisis. The International Diabetes Federation estimates that 382 million people worldwide have diabetes, a number expected to rise to 592 million by 2035. Diabetes is a medical condition caused by an excessively high blood glucose level, and it is a leading cause of renal failure, blindness, amputations, heart failure, and stroke. To develop a computerised approach for diabetes prediction, this work applies machine learning (ML) techniques to the Pima Indians dataset and a private diabetes dataset. The aim is to combine the findings from multiple machine learning techniques to create a system that can more accurately predict a patient's risk of developing diabetes at an early stage. The techniques used include logistic regression, support vector machines (SVM), random forest (RF), k-nearest neighbours (KNN), and decision trees. Accuracy is computed for every algorithm, and the model that predicts diabetes with the best accuracy is then chosen. By applying several machine learning classifiers together with feature removal techniques such as feature permutation and hierarchical clustering, the models achieved strong accuracy, precision, recall, and F1-score on the dataset. This suggests that the selected features and data are not limited to specific models.
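
The evaluation and selection step described in the abstract (compute the metrics for each classifier, then keep the most accurate one) can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function names, the model names used as dictionary keys, and the toy labels are all assumptions made for the example, and the numbers are not results from the paper.

```python
from typing import Dict, List


def classification_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = diabetic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def select_best_model(y_true: List[int],
                      predictions: Dict[str, List[int]]) -> str:
    """Return the name of the model whose predictions are most accurate."""
    scores = {name: classification_metrics(y_true, y_pred)["accuracy"]
              for name, y_pred in predictions.items()}
    return max(scores, key=scores.get)


# Toy ground truth and per-model predictions (hypothetical, for illustration only).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = {
    "logistic_regression": [1, 0, 0, 1, 0, 1, 1, 0],
    "random_forest":       [1, 0, 1, 1, 0, 0, 0, 0],
    "knn":                 [0, 0, 1, 0, 0, 1, 1, 1],
}

best = select_best_model(y_true, predictions)
print(best, classification_metrics(y_true, predictions[best]))
```

In practice the predictions would come from classifiers evaluated on a held-out test split. Selecting on accuracy alone can be misleading when the classes are imbalanced, which is one reason to report precision, recall, and F1-score alongside it.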

Published

2024-03-01

How to Cite

Faheem Mazhar, Wasif Akbar, Muhammad Sajid, Naeem Aslam, Muhammad Imran, & Haroon Ahmad. (2024). Boosting Early Diabetes Detection: An Ensemble Learning Approach with XGBoost and LightGBM. Journal of Computing & Biomedical Informatics, 6(02), 127–138. Retrieved from https://jcbi.org/index.php/Main/article/view/347