Empirical Comparisons of Clustering Algorithms using Silhouette Information
Sunghae Jun, Seung-Joo Lee Korean Institute of Intelligent Systems 2010 INTERNATIONAL JOURNAL of FUZZY LOGIC and INTELLIGE Vol.10 No.1
Many clustering algorithms are used across diverse fields. When a data set must be grouped into clusters, candidate algorithms based on similarity or distance measures are considered, and most clustering work relies on hierarchical or non-hierarchical algorithms. Researchers generally choose among these algorithms case by case, and they must select a suitable method subjectively from prior knowledge. In this paper, to address this subjectivity, we make empirical comparisons of popular hierarchical and non-hierarchical clustering algorithms using the silhouette measure. We use silhouette information to evaluate clustering results such as the number of clusters and cluster variance. We verify our comparison study with experiments on data sets from the UCI machine learning repository. As a result, clustering algorithms can be chosen efficiently and objectively.
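As a rough illustration of the silhouette measure named in this abstract, the sketch below computes the standard silhouette value s(i) = (b - a) / max(a, b) for each point and averages it over a labeling. The function names, the toy points, and the two labelings are assumptions made for illustration; they are not the authors' code or data.

```python
import math


def silhouette_scores(points, labels):
    """Return the silhouette value s(i) = (b - a) / max(a, b) for each point."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # common convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


def mean_silhouette(points, labels):
    """Average silhouette over all points; higher means better-separated clusters."""
    s = silhouette_scores(points, labels)
    return sum(s) / len(s)


# Hypothetical toy data: two tight, well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
good_labels = [0, 0, 0, 1, 1, 1]   # matches the visible groups
bad_labels = [0, 1, 0, 1, 0, 1]    # mixes the groups

good = mean_silhouette(points, good_labels)
bad = mean_silhouette(points, bad_labels)
print(good > bad)
```

The same average can be computed for the labelings produced by different clustering algorithms, or for different numbers of clusters, and the labeling with the higher value preferred.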
Improvement of SOM using Stratification
Sunghae Jun Korean Institute of Intelligent Systems 2009 INTERNATIONAL JOURNAL of FUZZY LOGIC and INTELLIGE Vol.9 No.1
The self-organizing map (SOM) is an unsupervised method based on competitive learning. Much clustering work has been performed using SOM. It also provides a visualization of the data from its result, and this visualization has been used in the decision process of descriptive data mining as exploratory data analysis. In this paper, we propose an improvement of SOM using stratified sampling from statistics. The stratification improves the performance of SOM. To verify the improvement, we run comparative experiments using data sets from the UCI machine learning repository and simulation data.
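The competitive learning that SOM is based on can be sketched as follows: for each input, the closest unit (the best matching unit) and its neighbors on the map move toward that input. This is a minimal one-dimensional SOM written for illustration only. The decay schedules, parameter defaults, and toy data are assumptions, and the stratified-sampling improvement proposed in the paper is not reproduced here.

```python
import math
import random


def train_som(data, n_units, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a 1-D self-organizing map and return its unit weight vectors."""
    rng = random.Random(seed)
    sigma0 = sigma0 if sigma0 is not None else max(n_units / 2.0, 1.0)
    # Initialise each unit's weight vector to a randomly chosen data point.
    weights = [list(rng.choice(data)) for _ in range(n_units)]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                 # learning rate decays linearly
            sigma = sigma0 * (1.0 - frac) + 1e-3    # neighborhood width shrinks
            # Competition: find the best matching unit (closest weights).
            bmu = min(range(n_units), key=lambda u: math.dist(weights[u], x))
            # Cooperation: the BMU and its map neighbors move toward x.
            for u in range(n_units):
                h = math.exp(-((u - bmu) ** 2) / (2.0 * sigma ** 2))
                for d in range(len(x)):
                    weights[u][d] += lr * h * (x[d] - weights[u][d])
            step += 1
    return weights


def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda u: math.dist(weights[u], x))


# Hypothetical toy data: two small groups of 2-D points.
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
        (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
weights = train_som(data, n_units=4)
assignments = [best_matching_unit(weights, x) for x in data]
```

Each data point is then assigned to its best matching unit, which is what allows the trained map to be read as a clustering and as a low-dimensional visualization.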
Support Vector Machine based on Stratified Sampling
Sunghae Jun Korean Institute of Intelligent Systems 2009 INTERNATIONAL JOURNAL of FUZZY LOGIC and INTELLIGE Vol.9 No.2
The support vector machine is a classification algorithm based on statistical learning theory. It has produced many results with good performance in data mining. However, the algorithm has some problems, one of which is its heavy computing cost. This has made the support vector machine difficult to use in dynamic and online systems. To overcome this problem, we propose using stratified sampling from statistical sampling theory. Stratified sampling helps reduce the size of the training data. In our paper, accuracy is maintained even though the training data are smaller. We verify the improved performance with experiments on data sets from the UCI machine learning repository.
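A minimal sketch of how stratified sampling can shrink a training set while keeping each stratum's share is shown below. It assumes the strata are the class labels and that the same fraction is drawn from every stratum (proportional allocation). The abstract does not state how the paper defines its strata, so both choices, along with the function name and toy data, are illustrative assumptions.

```python
import random
from collections import defaultdict


def stratified_sample(rows, labels, fraction, seed=0):
    """Draw the same fraction from every stratum (here: every class label)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row, lab in zip(rows, labels):
        strata[lab].append(row)

    sample_rows, sample_labels = [], []
    for lab, members in strata.items():
        # Keep at least one row per stratum so no class disappears.
        k = max(1, round(fraction * len(members)))
        for row in rng.sample(members, k):
            sample_rows.append(row)
            sample_labels.append(lab)
    return sample_rows, sample_labels


# Hypothetical imbalanced training set: 80 rows of class "a", 20 of class "b".
rows = list(range(100))
labels = ["a"] * 80 + ["b"] * 20

sub_rows, sub_labels = stratified_sample(rows, labels, fraction=0.1)
print(len(sub_rows), sub_labels.count("a"), sub_labels.count("b"))
```

The reduced set preserves the 80/20 class ratio of the full set, and a classifier such as a support vector machine would then be trained on `sub_rows` and `sub_labels` in place of the full data.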
Patent Statistics for Technology Analysis
Sunghae Jun Security Engineering Research Support Center 2015 International Journal of Software Engineering and Vol.9 No.5
Technology analysis examines technology through qualitative and quantitative approaches. It is an important issue in management of technology (MOT) because the results of technology analysis are used for efficient R&D planning. In this paper, we propose a novel approach to technology analysis based on patent statistics. We retrieve patent documents related to a target technology and analyze the retrieved patent data with statistical methods such as visualization and modeling. To verify the performance of our research, we perform a case study using patent data for a target domain. Our study contributes to R&D planning, new product development, and technological innovation.
Frequentist and Bayesian Learning Approaches to Artificial Intelligence
Jun, Sunghae Korean Institute of Intelligent Systems 2016 INTERNATIONAL JOURNAL of FUZZY LOGIC and INTELLIGE Vol.16 No.2
Artificial intelligence (AI) aims to make computer systems intelligent enough to do the right thing. AI is used today in a variety of fields, such as journalism, medicine, and industry, as well as entertainment. The impact of AI grows day by day. In general, an AI system has to reach the optimal decision under uncertainty, but it is difficult for an AI system to derive the best conclusion. In addition, it is hard to represent the intelligent capacity of AI in numeric values. Statistics can quantify uncertainty through two approaches, frequentist and Bayesian. In this paper, we therefore propose a methodology for connecting statistics and AI efficiently. We compute a fixed value to estimate the population parameter using frequentist learning. We also find a probability distribution to estimate the parameter of a conceptual population using Bayesian learning. To show how the proposed research could be applied to a practical domain, we collect patent big data related to Apple and make the AI more intelligent in understanding Apple's technology.
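The contrast between a fixed-value estimate and a distribution over the parameter can be illustrated with a simple success/failure model. The sketch below assumes Bernoulli data with a Beta prior, a textbook conjugate pair chosen for illustration. The counts are hypothetical, and the paper's actual model and patent data are not reproduced.

```python
def frequentist_estimate(successes, trials):
    """Maximum likelihood estimate of a success probability: one fixed value."""
    return successes / trials


def bayesian_posterior(successes, trials, prior_a=1.0, prior_b=1.0):
    """Beta(prior_a, prior_b) prior updated by the data gives a Beta posterior."""
    return prior_a + successes, prior_b + (trials - successes)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


def beta_variance(a, b):
    """Variance of a Beta(a, b) distribution."""
    return (a * b) / ((a + b) ** 2 * (a + b + 1))


# Hypothetical counts: 7 successes in 10 trials.
successes, trials = 7, 10

mle = frequentist_estimate(successes, trials)      # a single number
post_a, post_b = bayesian_posterior(successes, trials)  # a whole distribution
post_mean = beta_mean(post_a, post_b)
post_var = beta_variance(post_a, post_b)
```

The frequentist result is the single value `mle`, while the Bayesian result is the pair `(post_a, post_b)` that defines a full distribution, from which a mean, a variance, or an interval can be read off to express the remaining uncertainty.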