New Splitting Criteria for Classification Trees
Lee, Yung-Seop The Korean Statistical Society 2001 Communications for statistical applications and me Vol.8 No.3
Decision tree methods are among the data mining techniques, and classification trees are used to predict a class label. As a tree grows, the conventional splitting criteria measure node impurity by the weighted average of the impurities of the left and right child nodes. In this paper, new splitting criteria for classification trees are proposed that improve the interpretability of the trees compared with the conventional methods. The criteria search only for interesting subsets of the data, as opposed to modeling all of the data equally well. As a result, the tree is very unbalanced but extremely interpretable.
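The conventional criterion described in the abstract can be illustrated with a minimal sketch. Gini impurity is assumed here as the impurity measure (the abstract does not name one), and the function names are hypothetical. The paper's proposed criteria are not reproduced, since the abstract does not define them.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_split_impurity(left, right):
    """Conventional criterion: child impurities averaged with weights n_L/n and n_R/n."""
    n = len(left) + len(right)
    return (len(left) / n) * gini(left) + (len(right) / n) * gini(right)


# A split that leaves both children mixed.
balanced = weighted_split_impurity(["a", "a", "b", "b"], ["a", "a", "b", "b"])

# A split that isolates a small pure subset; the larger child stays mixed.
one_pure = weighted_split_impurity(["a", "a"], ["a", "a", "b", "b", "b", "b"])
```

Because the score is an average over both children, a split that isolates one small pure node is still penalized for the impurity of the other child. This is the behavior the abstract contrasts with criteria that look only for interesting subsets.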
Testing and Adjustment for Inhomogeneity Temperature Series Using the SNHT Method
Lee, Yung-Seop,Kim, Hee-Kyung,Lee, Jung-In,Lee, Jae-Won,Kim, Hee-Soo The Korean Statistical Society 2012 The Korean Journal of Applied Statistics Vol.25 No.6
Data quality and climate forecasting performance deteriorate when long climate records are contaminated by non-climatic factors such as station relocation or instrument replacement. For a trustworthy climate forecast, it is necessary to implement data quality control and to test for inhomogeneous data. Before the inhomogeneity test, a reference series was created using the $d$ index, which measures the relationship between the temperature series of the candidate station and those of the surrounding stations. In this study, an inhomogeneity test was performed for each season and climatological station on daily mean, daily minimum, and daily maximum temperatures. After comparing two inhomogeneity tests, the traditional and the adjusted SNHT methods, we found that the adjusted SNHT method was slightly superior to the traditional one.
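For orientation, the traditional single-break SNHT statistic can be sketched as follows. This is the commonly cited textbook form, T(k) = k·z̄₁² + (n−k)·z̄₂² computed on a standardized series, and not necessarily the exact variant used in the paper. The adjusted method and the $d$ index are not reproduced, because the abstract does not define them. The function name and sample data are hypothetical, and standardizing with the sample standard deviation is an assumption.

```python
import statistics


def snht_statistic(series):
    """Return (T0, k): the maximum of T(k) and the number of values before the candidate break.

    T(k) = k * mean(z[:k])**2 + (n - k) * mean(z[k:])**2 on the standardized series z.
    Requires at least two values and a non-constant series.
    """
    n = len(series)
    mean = statistics.fmean(series)
    sd = statistics.stdev(series)  # sample standard deviation (assumption)
    z = [(x - mean) / sd for x in series]

    best_t, best_k = float("-inf"), None
    for k in range(1, n):
        z1 = statistics.fmean(z[:k])  # mean before the candidate break
        z2 = statistics.fmean(z[k:])  # mean after the candidate break
        t = k * z1 ** 2 + (n - k) * z2 ** 2
        if t > best_t:
            best_t, best_k = t, k
    return best_t, best_k


# Hypothetical candidate-minus-reference series with a shift after the fifth value.
shifted = [0.1, -0.2, 0.0, 0.2, -0.1, 1.1, 0.9, 1.0, 1.2, 0.8]
t0, k = snht_statistic(shifted)
```

In practice T0 is compared against a critical value that depends on the series length; those values are omitted here rather than guessed.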
A Comparison of Clustering Algorithm in Data Mining
Lee, Yung-Seop,An, Mi-Young The Korean Data and Information Science Society 2003 Journal of the Korean Data and Information Science Society Vol.14 No.4
To provide the information needed to make a decision, it is important to know the relationships or patterns among variables in a database. Grouping objects that have similar characteristic patterns is called cluster analysis, one of the data mining techniques. In this study, several partitioning clustering algorithms are compared on the basis of the statistical distance or the total variance within each cluster.
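Comparative Analysis of Various Time Series Models for Evaluating Day-Ahead Forecast Performance of Solar Power Generation
Yung-Seop Lee,Daehyun Jin,Donghee Kim,Chang Ki Kim,Hyun-Goo Kim The Korean Society for New and Renewable Energy 2021 Proceedings of the Korean Society for New and Renewable Energy Conference Vol.2021 No.7
Solar power forecasting models based on time series models are a core technology for reducing the annual operating cost of new and renewable energy facilities, and forecast accuracy matters above all. In this study, we built various time series models for evaluating day-ahead forecasts of solar power generation and compared their performance. We used solar power generation data from five regions in Korea and considered forecast lead times of 38 hours for forecasts issued at 10:00 and 31 hours for forecasts issued at 17:00. As time series models, we built an ARIMA model for each observation site, a SARIMA model accounting for seasonality at each site, and a VAR model for each cluster obtained by clustering the sites. RMSE and nRMSE(max) were used as the performance metrics. The results showed that, for both the 10:00 and 17:00 issuance times and in most regions, the ARIMA model had lower prediction error than the other time series models, indicating higher predictive power.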
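The two evaluation metrics can be sketched as below. The abstract does not say what nRMSE(max) is normalized by; dividing the RMSE by the maximum observed value is an assumption, and the function names and sample numbers are hypothetical.

```python
import math


def rmse(observed, forecast):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(forecast) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - f) ** 2 for o, f in zip(observed, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def nrmse_max(observed, forecast):
    """RMSE divided by the maximum observed value (assumed normalization)."""
    peak = max(observed)
    if peak <= 0:
        raise ValueError("maximum observed value must be positive")
    return rmse(observed, forecast) / peak


# Hypothetical hourly generation values and a day-ahead forecast.
observed = [0.0, 2.0, 4.0, 2.0]
forecast = [0.0, 1.0, 5.0, 2.0]
error = rmse(observed, forecast)
normalized_error = nrmse_max(observed, forecast)
```

Normalizing by a site-level maximum makes errors comparable across sites with different output scales, which is presumably why a normalized metric is reported alongside the raw RMSE.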