A Multi-Model Machine Learning Approach for Reliable Breast Cancer Classification
DOI:
https://doi.org/10.56979/1002/2026/1294
Keywords:
Breast Cancer Classification, Machine Learning, Ensemble Learning, XGBoost, Random Forest, Principal Component Analysis, ROC-AUC, Feature Importance, Medical Diagnosis
Abstract
Breast cancer is a critical health issue for which early diagnosis improves clinical outcomes. This paper proposes a machine learning framework for classifying breast tumors from morphological features extracted from fine-needle aspiration images. The workflow combines exploratory data analysis, feature standardization, Principal Component Analysis (PCA), and a comparison of seven supervised classifiers: Logistic Regression, SVM, KNN, Decision Tree, Random Forest, Gradient Boosting, and XGBoost. Exploratory analysis showed strong correlations among geometric features and clear distributional differences between benign and malignant tumors. PCA confirmed that tumor size and concavity-related attributes dominate the variance structure. All models performed well, with the boosting-based ensemble methods achieving the best accuracy, ROC-AUC, and Precision-Recall results. Feature importance analysis revealed consistent key morphological predictors across models. The results show that linear models are competitive, but ensemble models offer better robustness and reliability for breast cancer classification.
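The workflow summarized in the abstract (standardization, PCA, and a cross-validated comparison of classifiers) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the dataset, so scikit-learn's bundled Wisconsin Diagnostic Breast Cancer data (also derived from fine-needle aspiration images) stands in for it; XGBoost is left out so the sketch depends on scikit-learn only; and all hyperparameters, the fold count, and the helper names `compare_models` and `pca_variance` are hypothetical choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: stand-in dataset; the abstract does not say which data were used.
X, y = load_breast_cancer(return_X_y=True)

# Six of the seven classifier families named in the abstract.
# XGBoost is omitted here; hyperparameters are illustrative defaults.
MODELS = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "KNN": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}


def compare_models(X, y, n_splits=5):
    """Return the mean cross-validated ROC-AUC for each model."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    scores = {}
    for name, clf in MODELS.items():
        # Scaling sits inside the pipeline so it is refit on each training fold.
        pipe = make_pipeline(StandardScaler(), clf)
        scores[name] = cross_val_score(pipe, X, y, cv=cv, scoring="roc_auc").mean()
    return scores


def pca_variance(X, n_components=5):
    """Return the explained-variance ratio of the leading principal components."""
    pipe = make_pipeline(StandardScaler(), PCA(n_components=n_components)).fit(X)
    return pipe.named_steps["pca"].explained_variance_ratio_


if __name__ == "__main__":
    for name, auc in sorted(compare_models(X, y).items(), key=lambda kv: -kv[1]):
        print(f"{name:20s} ROC-AUC = {auc:.3f}")
    print("PCA explained variance ratio:", pca_variance(X).round(3))
```

Placing the scaler inside each pipeline keeps test-fold statistics out of the fitted scaler, so the cross-validated scores are not inflated by leakage. The scores this sketch produces are not the paper's reported results.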
License
This is an open access article published by the Research Center of Computing & Biomedical Informatics (RCBI), Lahore, Pakistan, under the CC BY 4.0 International License.



