An Analysis of Supervised Machine Learning Techniques for Churn Forecasting and Component Identification in the Telecom Sector

Authors

  • Hira Farman Department of Computer Science, Iqra University, Karachi, Pakistan.
  • Abdul Wahab Khan Department of Computer Science, Mohammad Ali Jinnah University, Karachi, Pakistan.
  • Saad Ahmed Iqra University, Karachi, Pakistan.
  • Dodo Khan Department of CS & IT, TIEST, NED University of Engineering & Technology, Karachi, Pakistan.
  • Muhammad Imran Department of Metallurgical Engineering, NED University of Engineering & Technology, Karachi, Pakistan.
  • Priyanka Bajaj Department of Business Administration, Salim Habib University, Karachi, Pakistan.

Keywords:

Business Intelligence (BI), Churn, Data Mining, Telecom

Abstract

The term "business intelligence" (BI) refers to a broad range of tools and software intended to collect, process, and evaluate data so that business users can decide on the best course of action. The main goal of BI is to get the right information to the right decision-makers at the right time. Because of its large customer base, the telecom industry generates massive amounts of data every day. Decision-makers and business professionals emphasize that retaining existing customers is less expensive than acquiring new ones. In addition to detecting behavioral patterns in data on customers who have already churned, business analysts and customer relationship management (CRM) analysts need to understand the reasons for customer attrition. This paper focuses on customer churn, a critical metric that represents the percentage of customers ending their relationship with a company over a specific period. Using detailed datasets, advanced data analysis, and machine learning techniques, this study identifies the key churn predictors and offers practical recommendations for improving customer retention. To build predictive churn models, several important machine learning algorithms are surveyed and compared. The study goes beyond churn versus non-churn classification and also evaluates the accuracy of different data mining techniques. It uses a variety of performance indicators and confusion matrices to assess the effectiveness of three classification models: Random Forest (RF), Decision Tree (DT), and Logistic Regression (LR). With the best AUC (0.985), F1 score (0.934), precision (0.935), and MCC (0.830), the Random Forest model outperformed the others, demonstrating a strong balance between recall and precision. The Decision Tree model also performed well, with notable accuracy. Logistic Regression, while effective, showed comparatively lower metrics, with an AUC of 0.848 and an F1 score of 0.800.
The confusion matrices further validated these results, highlighting the Random Forest model's robustness and superior classification capabilities. The findings show that the proposed churn prediction model achieved its best churn classification with the RF algorithm. Furthermore, this research delves into the fundamentals of BI and presents optimization strategies crucial for making dynamic, optimal decisions in today’s corporate landscape.
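
As a rough illustration of how the threshold-based indicators named in the abstract (precision, recall, F1 score, and MCC) follow from a binary confusion matrix, the sketch below computes them in plain Python. The function name `churn_metrics` and the counts passed to it are hypothetical and chosen only for illustration; they are not taken from the paper's data or results. AUC is omitted because it requires ranked prediction scores rather than a single confusion matrix.

```python
import math


def churn_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F1 and MCC from binary confusion-matrix counts.

    tp: churners correctly flagged, fp: non-churners wrongly flagged,
    fn: churners missed, tn: non-churners correctly left unflagged.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "mcc": mcc}


# Hypothetical counts for an imbalanced churn dataset (NOT the paper's results).
example = churn_metrics(tp=80, fp=20, fn=20, tn=880)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Because MCC uses all four cells of the confusion matrix, it can be noticeably lower than F1 on imbalanced churn data, which is one reason to report both.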

Published

2024-06-01

How to Cite

Hira Farman, Abdul Wahab Khan, Saad Ahmed, Dodo Khan, Muhammad Imran, & Priyanka Bajaj. (2024). An Analysis of Supervised Machine Learning Techniques for Churn Forecasting and Component Identification in the Telecom Sector. Journal of Computing & Biomedical Informatics, 7(01), 264–280. Retrieved from https://jcbi.org/index.php/Main/article/view/478