http://chineseinput.net/에서 pinyin(병음)방식으로 중국어를 변환할 수 있습니다.
변환된 중국어를 복사하여 사용하시면 됩니다.
Breast Cancer Diagnosis Using Cluster-based Undersampling and Boosted C5.0 Algorithm
Jue Zhang,Li Chen,Jian-xue Tian,Fazeel Abid,Wusi Yang,Xiao-fen Tang 제어·로봇·시스템학회 2021 International Journal of Control, Automation, and Vol.19 No.5
Learning from imbalanced data set is relatively new challenge for breast cancer diagnosis, where the diseases cases are often quite rare relative to normal population. Although traditional algorithms are all accuracyoriented which result biased towards the majority class. The combinations of sampling methods with ensemble classifiers have shown certainly good performance. In this paper, a hybrid of cluster-based undersampling and boosted C5.0 is proposed. The proposed classification model consists of two phases: cluster analysis and classification. In cluster analysis, affinity propagation algorithm is used to define the number of clusters, and then the k-means clustering is utilized to select the border and informative samples. In the classification phase, C5.0 algorithm is used in conjunction with boosting technical, owing to leverage the strength of the individual classifiers. The proposed algorithm is assessed by 14 benchmark imbalanced data sets taken from UCI dataset repository. The extensive experimental results on different imbalanced datasets demonstrated that the proposed algorithm can achieve better classification performance in terms of Matthews’ Correlation Coefficient (MCC) as compared to other existing imbalanced dataset classification algorithms.