Diagnostic measures for kernel ridge regression on reproducing kernel Hilbert space
김충락,양호진 The Korean Statistical Society 2019 Journal of the Korean Statistical Society Vol.48 No.3
The aim of this paper is to define and develop diagnostic measures for kernel ridge regression in a reproducing kernel Hilbert space (RKHS). To identify influential observations, we define a particular version of Cook’s distance for the kernel ridge regression model in RKHS, which is conceptually consistent with Cook’s distance in a classical regression model. Because the original definition requires intensive computations, we then use the perturbation formula for the regularized conditional expectation of the outcome in RKHS to develop an approximate version of Cook’s distance in RKHS. This approximate Cook’s distance is expressed in terms of basic building blocks such as residuals and leverages of the kernel ridge regression. The results of the simulation and real application demonstrate that our diagnostic measure successfully detects potentially influential observations on estimators in kernel ridge regression.
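As a rough illustration of how a Cook-type distance can be built from residuals and leverages in kernel ridge regression, the sketch below uses the exact leave-one-out identity for linear smoothers rather than the perturbation-formula approximation described in the abstract. The function names, the RBF kernel, the unscaled penalty `lam`, and the scaling by tr(H) and a residual variance estimate are all assumptions made for illustration, not the authors' definition.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def krr_cook_distances(K, y, lam):
    """Cook-style distances for kernel ridge regression (illustrative).

    Fitted values are yhat = H y with hat matrix H = K (K + lam I)^{-1}.
    Deleting case i shifts the fitted vector by H[:, i] * r_i / (1 - h_ii),
    so each distance depends only on residuals r_i and entries of H.
    """
    n = len(y)
    H = np.linalg.solve(K + lam * np.eye(n), K)   # equals K (K + lam I)^{-1}
    resid = y - H @ y                             # ordinary residuals
    lev = np.diag(H)                              # leverages h_ii
    loo_resid = resid / (1.0 - lev)               # leave-one-out residuals
    shift_sq = (H ** 2).sum(axis=0) * loo_resid ** 2
    df = np.trace(H)                              # effective degrees of freedom
    sigma2 = (resid ** 2).sum() / (n - df)        # assumed variance estimate
    return shift_sq / (df * sigma2)

# Hypothetical usage on simulated data with one contaminated response.
rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, size=(40, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=40)
y[5] += 2.0
D = krr_cook_distances(rbf_kernel(X, X, gamma=0.5), y, lam=0.1)
print(np.argsort(D)[-3:])   # indices of the three largest distances
```

In ordinary least squares the hat matrix is idempotent, so the squared column norm of H reduces to the leverage and the expression collapses to the classical Cook's distance.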
Nonparametric estimation of quantile functions for randomly right censored data
김충락,Soonphill Hong,김진미 The Korean Statistical Society 2013 Journal of the Korean Statistical Society Vol.42 No.2
In this paper we compare four nonparametric quantile function estimators for randomly right censored data: the Kaplan–Meier estimator, the linearly interpolated Kaplan–Meier estimator, the kernel-type survival function estimator, and the Bézier curve smoothing estimator. Also, we compare several kinds of confidence intervals of quantiles for four nonparametric quantile function estimators.
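For orientation, the first of the four estimators, the plain Kaplan–Meier quantile estimator, can be sketched as follows. This is a minimal illustration under the usual convention that the p-th quantile is the smallest observed event time at which the estimated survival function drops to 1 - p or below; the function names are hypothetical, and the interpolated, kernel-type, and Bézier-smoothed variants are not shown.

```python
import numpy as np

def kaplan_meier(times, events):
    """Kaplan-Meier survival estimates at the distinct event times.

    times  : observed times (event or censoring)
    events : 1 if the time is an observed event, 0 if right censored
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    event_times = np.unique(times[events == 1])
    surv = np.empty(len(event_times))
    s = 1.0
    for k, t in enumerate(event_times):
        at_risk = np.sum(times >= t)                  # still under observation
        deaths = np.sum((times == t) & (events == 1))
        s *= 1.0 - deaths / at_risk
        surv[k] = s
    return event_times, surv

def km_quantile(times, events, p):
    """Smallest event time t with S(t) <= 1 - p, or nan if never reached."""
    t, s = kaplan_meier(times, events)
    hit = np.nonzero(s <= 1.0 - p)[0]
    return t[hit[0]] if hit.size else np.nan

# Hypothetical toy data: five subjects, two of them censored.
print(km_quantile([1, 2, 3, 4, 5], [1, 0, 1, 1, 0], p=0.5))
```

With heavy censoring the survival estimate may never fall to 1 - p, in which case this sketch returns `nan` rather than extrapolating.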
A local linear estimation of conditional hazard function in censored data
김충락,Minkyung Oh,양성준,최혜미 The Korean Statistical Society 2010 Journal of the Korean Statistical Society Vol.39 No.3
A local linear estimator of the conditional hazard function in censored data is proposed. The estimator suggested in this paper is motivated by the ideas of Fan, Yao, and Tong (1996) and Kim, Bae, Choi, and Park (2005). The asymptotic distribution of the proposed estimator is derived, and some numerical results are also provided.
Case influence diagnostics in the lasso regression
김충락,이정수,양호진,배화수 The Korean Statistical Society 2015 Journal of the Korean Statistical Society Vol.44 No.2
Using the diagnostic results in the ridge regression model, we propose an approximate version of Cook’s distance in the lasso regression model since the analytic expression of the lasso estimator is not available. Also, we express the proposed Cook’s distance in terms of basic building blocks such as residuals and leverages. We verify that the proposed statistic successfully detects potentially influential observations on estimators of regression coefficients and on the model selection in the lasso regression model. An illustrative example based on a real dataset is given.
Nonparametric estimation of the derivatives of a function using the Bezier curve
김충락(Choong Rak Kim),정미선(Mee Seon Jeong),김형순(Hyoung Soon Kim) The Korean Statistical Society 1998 The Korean Journal of Applied Statistics Vol.11 No.1
It is often necessary to fit data to a regression model and then estimate the derivative of the fitted regression function. The Bezier curve, though rarely known to statisticians, is very popular in computer graphics. In this paper, we introduce a nonparametric estimation method based on the Bezier curve and apply it to a real data set. The method yields derivatives of any desired order and requires no kernel selection, only the choice of a smoothing parameter. It also seems to be very easy to compute and can be easily applied to other smoothing techniques.
Adjustments of Dispersion Statistics in Extended Quasi-likelihood Models
Choong Rak Kim(김충락),Mee Seon Jeong(정미선) The Korean Statistical Society 1993 The Korean Journal of Applied Statistics Vol.6 No.1
In this paper we study the numerical behavior of the adjustments for the variances of the Pearson and deviance type dispersion statistics in two overdispersed mixture models: the negative binomial and beta-binomial distributions. These models are important members of the extended quasi-likelihood family, which is very useful for the joint modelling of mean and dispersion. Through simulation studies, we compare how the adjustments for the two types of dispersion statistics vary with the mean and dispersion parameters.
Basic Statistics in Quantile Regression
김재원,김충락 The Korean Statistical Society 2012 The Korean Journal of Applied Statistics Vol.25 No.2
In this paper we study some basic statistics in quantile regression. In particular, we investigate the residual, a goodness-of-fit statistic, and the effect of one or a few observations on estimates of regression coefficients. In addition, we compare the proposed goodness-of-fit statistic with the statistic considered by Koenker and Machado (1999). An illustrative example based on real data sets is given to show the numerical performance of the proposed basic statistics.
김기풍,김충락 The Korean Statistical Society 2023 The Korean Journal of Applied Statistics Vol.36 No.3
In this paper, we introduce distance metrics between two subspaces. To this end, several matrix norms, such as the spectral norm and the Frobenius norm, are introduced. Further, distances between two matrices based on projections and principal angles are introduced under these norms, and the relationships among them are explained. Finally, an application to matrix perturbation theory is illustrated with the famous Davis-Kahan theorem (1970), a central result of that theory. Distance measures between two subspaces are widely used not only in various fields of statistics but also in important areas of artificial intelligence such as backward search (후방탐색) and face recognition.
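The two kinds of subspace distance mentioned in the abstract can be sketched numerically as follows. This is a minimal illustration, assuming each subspace is given as the column span of a full-column-rank matrix; the function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles (radians, ascending) between span(A) and span(B)."""
    Qa, _ = np.linalg.qr(A)            # orthonormal basis of span(A)
    Qb, _ = np.linalg.qr(B)            # orthonormal basis of span(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.arccos(np.clip(cosines, 0.0, 1.0))

def projection_distance(A, B, ord="fro"):
    """Norm of the difference of the orthogonal projectors onto the spans."""
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    return np.linalg.norm(Qa @ Qa.T - Qb @ Qb.T, ord=ord)

# Two lines in the plane, 45 degrees apart.
A = np.array([[1.0], [0.0]])
B = np.array([[1.0], [1.0]])
theta = principal_angles(A, B)
print(theta)                              # one angle, close to pi/4
print(projection_distance(A, B, "fro"))   # sqrt(2) * sin(theta)
print(projection_distance(A, B, 2))       # sin(theta)
```

For subspaces of equal dimension, the Frobenius projection distance equals sqrt(2) times the Euclidean norm of the sines of the principal angles, and the spectral projection distance equals the sine of the largest angle. These sin-theta quantities are the ones that Davis-Kahan type bounds control.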
Estimation of the survival function and median survival time for interval-censored data
윤은영(Yun, Eun-Young),김충락(Kim, Choong-Rak) The Korean Statistical Society 2010 The Korean Journal of Applied Statistics Vol.23 No.3
Interval censoring is the most general form of censoring, and interval-censored observations are common in medical and epidemiologic studies; however, limited studies exist due to the complexity and special structure of interval censoring. This paper compares the mean imputation method and the self-consistency method for estimating the survival function and the median survival time from interval-censored data, and applies them to real data to estimate the time of immunodeficiency virus infection in hemophilia patients. We also propose a new method of generating random numbers under an interval-censoring set-up. Through simulation studies we compare the two methods under various simulation schemes in terms of the mean squared error for estimating the median survival time and the mean integrated squared error for estimating the survival function. Under a moderate censoring percentage, the mean imputation method performed better than the self-consistency method in estimating the median survival time and the survival function.