Breast Cancer Diagnosis by Exploiting the Permutations of Principal Components by Ensemble Classification
Keywords:
Machine learning, Ensemble classification, Principal component analysis, Set theory, Data sampling
Abstract
In many breast cancer computer-aided diagnosis problems, the feature dimensionality is large and the number of instances is small, so the classifier cannot be trained optimally. This is because a decision boundary is represented by a number of parameters directly proportional to the number of feature dimensions, and optimal training on such high-dimensional features requires a large training set. If the training set is not large enough to yield a good n/l ratio, training results in an ineffective and inefficient classification model. Conventional feature reduction techniques address the dimensionality problem and make training efficient, but they degrade classification performance. In this paper, we address the problem of effective and efficient training on high-dimensional datasets when the available dataset is not sufficiently large. For this purpose, we hybridize principal component analysis with ensemble classification. Different combinations of principal dimensions are determined using the concept of power sets from mathematics. A dedicated base learner then exploits each combination of principal dimensions. All of these base learners are then combined to construct a hybrid ensemble principal component analysis-based classifier, Ens-PCA. The proposed Ens-PCA technique is tested on the Wisconsin Diagnostic Breast Cancer (WDBC) data set, and the results show that it outperforms contemporary principal component analysis and ensemble classification techniques.
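The idea of pairing each subset of principal dimensions with its own base learner can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the samples have already been projected onto k principal components by some PCA routine, it excludes the empty subset, and it uses a nearest-centroid base learner with majority voting. The abstract specifies none of these choices, so all function names and the toy data below are hypothetical.

```python
from collections import Counter
from itertools import combinations


def nonempty_subsets(k):
    """Power set of component indices 0..k-1, minus the empty set (assumed)."""
    return [c for r in range(1, k + 1) for c in combinations(range(k), r)]


def fit_centroids(X, y, dims):
    """Hypothetical base learner: class centroids over the components in dims."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(dims))
        for j, d in enumerate(dims):
            acc[j] += row[d]
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_one(centroids, dims, row):
    """Assign the label of the nearest centroid (squared Euclidean distance)."""
    def sq_dist(c):
        return sum((row[d] - c[j]) ** 2 for j, d in enumerate(dims))
    return min(sorted(centroids), key=lambda label: sq_dist(centroids[label]))


def fit_ensemble(X, y, k):
    """Train one base learner per non-empty subset of the k components."""
    return [(dims, fit_centroids(X, y, dims)) for dims in nonempty_subsets(k)]


def predict_ensemble(ensemble, row):
    """Combine base learners by majority vote (an assumed combination rule)."""
    votes = Counter(predict_one(c, dims, row) for dims, c in ensemble)
    return votes.most_common(1)[0][0]


# Toy principal-component scores (made up for illustration, not WDBC data).
X = [(-2.0, 0.1, 0.3), (-1.5, -0.2, -0.1), (1.8, 0.2, -0.3), (2.2, -0.1, 0.2)]
y = ["B", "B", "M", "M"]
ensemble = fit_ensemble(X, y, k=3)
print(len(ensemble), predict_ensemble(ensemble, (-1.9, -0.1, 0.2)))
```

With k components the sketch trains 2^k - 1 base learners, so the count grows exponentially in k. Because 2^k - 1 is always odd, a majority vote between two classes cannot tie.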
License
This is an open access article published by the Research Center of Computing & Biomedical Informatics (RCBI), Lahore, Pakistan, under the CC BY 4.0 International License.