Contribution to Improve Database Classification Algorithms for Multi-Database Mining
Salim Miloudi, Sid Ahmed Rahal, Salim Khiat. Korea Information Processing Society, 2018. Journal of Information Processing Systems, Vol. 14, No. 3
Database classification is an important preprocessing step for multi-database mining (MDM). When a multi-branch company needs to explore its distributed data for decision making, it is imperative to classify these multiple databases into clusters of similar databases before analyzing the data. To search for the best classification of a set of n databases, existing algorithms generate from 1 to (n² − n)/2 candidate classifications. Although each candidate classification is included in the next one (i.e., clusters in the current classification are subsets of clusters in the next classification), existing algorithms generate each classification independently, that is, without reusing the clusters from the previous classification. Consequently, existing algorithms are time consuming, especially as the number of candidate classifications increases. To overcome this problem, we propose an efficient approach that recasts the classification of multiple databases as the problem of identifying the connected components of an undirected weighted graph. Theoretical analysis and experiments on public databases confirm that our algorithm is more efficient than existing works and that it avoids the growth in execution time.
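The idea stated in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, only one common way to compute nested connected components incrementally. It assumes a symmetric pairwise similarity matrix `sim` is already available (the abstract does not say how similarity is measured), treats each database as a vertex and each pair as a weighted edge, and adds edges in decreasing order of similarity using a union-find structure, so each candidate classification is built from the clusters of the previous one. The function name `incremental_classifications` and the sample matrix are hypothetical.

```python
from itertools import combinations


def incremental_classifications(sim):
    """Yield (level, clusters) for each distinct similarity level, highest first.

    `sim` is an assumed n x n symmetric similarity matrix (list of lists).
    Clusters at a level are the connected components of the graph that keeps
    every edge whose similarity is >= that level.
    """
    n = len(sim)
    parent = list(range(n))  # union-find forest, one tree per cluster

    def find(x):
        # Follow parent links to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # At most (n^2 - n)/2 edges, hence at most that many distinct levels.
    edges = sorted(
        ((sim[i][j], i, j) for i, j in combinations(range(n), 2)),
        reverse=True,
    )

    k = 0
    while k < len(edges):
        level = edges[k][0]
        # Merge clusters for every edge at this level; earlier merges are kept.
        while k < len(edges) and edges[k][0] == level:
            _, i, j = edges[k]
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[rj] = ri
            k += 1
        groups = {}
        for v in range(n):
            groups.setdefault(find(v), []).append(v)
        yield level, sorted(groups.values())


if __name__ == "__main__":
    # Hypothetical similarities between four databases.
    sim = [
        [1.0, 0.8, 0.2, 0.1],
        [0.8, 1.0, 0.3, 0.1],
        [0.2, 0.3, 1.0, 0.6],
        [0.1, 0.1, 0.6, 1.0],
    ]
    for level, clusters in incremental_classifications(sim):
        print(level, clusters)
```

Because the union-find state carries over from one level to the next, no candidate classification is recomputed from scratch. Selecting the best candidate would need a goodness measure, which the abstract does not specify and which is therefore left out of this sketch.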