RISS (Research Information Sharing Service)

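      • KCI excellent accredited journal

        Comparison of AR and ARMA structures of the covariance matrix in multivariate longitudinal data

        Danbi Yun; Keunbaik Lee. The Korean Data and Information Science Society, 2020. Journal of the Korean Data and Information Science Society (한국데이터정보과학회지), Vol.31 No.5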

        In multivariate longitudinal data, repeated measurements exhibit three types of correlation among the responses: correlation within the same response at different time points, cross-correlation between different responses at different time points, and correlation between responses at the same time point. It is therefore important to model a covariance matrix that accounts for all of these correlations. However, the covariance matrix must be positive definite and may be heterogeneous, and the number of its parameters grows rapidly as the number of repeated measurements increases, which makes estimation difficult. To address these difficulties, covariance matrix models with an autoregressive (AR) structure and an autoregressive-moving average (ARMA) structure have been proposed. Lee et al. (2020) proposed a decomposition method that assumes an AR-structured covariance matrix in multivariate longitudinal data analysis, and Lee et al. (2019) extended the method of Lee et al. (2020) to accommodate long series of multivariate longitudinal data using an ARMA-structured covariance matrix. The covariance matrices estimated by these methods are always positive definite and can be heterogeneous. In this paper, we compare the two methods through simulations.
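
        To make the estimation difficulty concrete, the following minimal sketch counts the free parameters of an unstructured covariance matrix for multivariate longitudinal data. The function name and the choice of three responses are illustrative assumptions, not taken from the paper; the count itself is the standard p(p + 1)/2 for a symmetric p × p matrix with p = (time points) × (responses).

```python
def unstructured_cov_params(n_times: int, n_responses: int) -> int:
    """Free parameters of an unstructured covariance matrix.

    The stacked response vector has n_times * n_responses entries, and a
    symmetric matrix of that dimension has dim * (dim + 1) / 2 free entries.
    """
    dim = n_times * n_responses
    return dim * (dim + 1) // 2


# Hypothetical setting: 3 responses measured at 2, 4, and 8 time points.
counts = {n: unstructured_cov_params(n, 3) for n in (2, 4, 8)}
print(counts)
```

        Doubling the number of time points roughly quadruples the parameter count, which is the motivation for structured (AR or ARMA) parameterizations.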

      • On fused dimension reduction in multivariate regression

        Lee, Keunbaik; Choi, Yuri; Um, Hye Yeon; Yoo, Jae Keun. Elsevier, 2019. Chemometrics and Intelligent Laboratory Systems, Vol.193 No.-

        Abstract: High-dimensional data analysis often suffers from the so-called curse of dimensionality, and various data reduction methods are adopted in practice to avoid it. Consequently, in multivariate regression, high-dimensional predictors should be reduced to lower-dimensional ones without loss of information, following the notion of sufficient dimension reduction. In this paper, a fused clustered seeded reduction approach is proposed for multivariate regression. The proposed method utilizes two types of information: supervised learning between the responses and the predictors, and unsupervised learning of the predictors alone. Fusing all the information has a potential advantage in the accuracy of the reduction of predictors. Numerical studies and a real data analysis confirm the practical usefulness of the proposed approach over existing methods.

        Highlights:
          • A multivariate fused clustered seeded reduction (MFCSR) is proposed for multivariate regression. The fused approach combines all information obtained from supervised and unsupervised statistical methods for the regression, and the kernel matrices for the dimension reduction are fused.
          • The K-means clustering algorithm, which is an unsupervised learning method, clusters either the predictors or their principal components. From the unsupervised learning, unknown structures of the predictors can be discovered. Within each cluster, the covariance matrix of the response and the predictors, which is a supervised learning method, is constructed, and these matrices are fused for the dimension reduction of the predictors. Then, they are successively projected onto the marginal covariance matrix of the predictors. This method is called multivariate fused clustered seeded reduction.
          • Numerical studies confirm the theory for MFCSR, and the proposed MFCSR outperforms the existing method in the dimension reduction of the predictors. The application of MFCSR to near infrared spectroscopy data shows its potential advantage in practice.
          • The fused clustered seeded reduction approach utilizes information acquired by supervised and unsupervised learning methods. Fusing all this information has a potential benefit of more informative predictor reduction over the existing method. Various numerical studies support the proposed approach, and a real data application to near infrared spectroscopy data confirms its practical usefulness.

      • KCI candidate journal

        Autoregressive Cholesky Factor Modeling for Marginalized Random Effects Models

        Lee, Keunbaik; Sung, Sunah. The Korean Statistical Society, 2014. Communications for statistical applications and me, Vol.21 No.2

        Marginalized random effects models (MREMs) are commonly used to analyze longitudinal categorical data when the population-averaged effects are of interest. In these models, random effects are used to explain both subject and time variations. The estimation of the random effects covariance matrix is not simple in MREMs because of its high dimension and the positive definiteness constraint. A relatively simple structure, such as a homogeneous AR(1) structure, is often assumed for the correlation; however, this is too strong an assumption, and as a consequence the estimates of the fixed effects can be biased. To avoid this problem, we introduce an approach that explains a heterogeneous random effects covariance matrix using a modified Cholesky decomposition. The approach results in parameters that can be easily modeled without concern that the resulting estimator will not be positive definite. The interpretation of the parameters is sensible. We analyze metabolic syndrome data from a Korean Genomic Epidemiology Study using this method.
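
        The positive definiteness property mentioned in this abstract can be illustrated with a minimal sketch of the modified Cholesky decomposition, T Σ Tᵀ = D, where T is unit lower triangular and D is diagonal with positive entries. This is not the authors' implementation: the function name, the parameter names `phi` and `log_d`, and the random inputs are assumptions made for illustration only.

```python
import numpy as np


def covariance_from_cholesky_params(phi, log_d):
    """Build Sigma from unconstrained parameters via T Sigma T^T = D.

    phi   : values for the strictly lower triangle of T (entered with a
            minus sign, as generalized autoregressive parameters).
    log_d : log innovation variances, so D = diag(exp(log_d)) > 0.
    """
    n = len(log_d)
    T = np.eye(n)
    rows, cols = np.tril_indices(n, k=-1)
    T[rows, cols] = -np.asarray(phi, dtype=float)
    D = np.diag(np.exp(log_d))
    T_inv = np.linalg.inv(T)  # T is unit lower triangular, hence invertible
    return T_inv @ D @ T_inv.T


# Any real-valued phi and log_d give a valid covariance matrix.
rng = np.random.default_rng(0)
n = 4
phi = rng.normal(size=n * (n - 1) // 2)
log_d = rng.normal(size=n)
sigma = covariance_from_cholesky_params(phi, log_d)
eigvals = np.linalg.eigvalsh(sigma)
print(eigvals.min() > 0)
```

        Because the parameters are unconstrained, they can be linked to covariates through regression models without a separate check for positive definiteness.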

      • Flexible marginalized models for bivariate longitudinal ordinal data

        Lee, Keunbaik; Daniels, Michael J.; Joo, Yongsung. Oxford University Press, 2013. Biostatistics, Vol.14 No.3

        Random effects models are commonly used to analyze longitudinal categorical data. Marginalized random effects models are a class of models that permit direct estimation of marginal mean parameters and characterize serial correlation for longitudinal categorical data via random effects (Heagerty, 1999, Marginally specified logistic-normal models for longitudinal binary data, Biometrics 55, 688–698; Lee and Daniels, 2008, Marginalized models for longitudinal ordinal data with application to quality of life studies, Statistics in Medicine 27, 4359–4380). In this paper, we propose a Kronecker product (KP) covariance structure to capture the correlation between processes at a given time and the correlation within a process over time (serial correlation) for bivariate longitudinal ordinal data. For the latter, we consider a more general class of models than standard (first-order) autoregressive correlation models, by re-parameterizing the correlation matrix using partial autocorrelations (Daniels and Pourahmadi, 2009, Modeling covariance matrices via partial autocorrelations, Journal of Multivariate Analysis 100, 2352–2363). We assess the reasonableness of the KP structure with a score test. A maximum marginal likelihood estimation method is proposed utilizing a quasi-Newton algorithm with quasi-Monte Carlo integration of the random effects. We examine the effects of demographic factors on metabolic syndrome and C-reactive protein using the proposed models.
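
        As a rough illustration of a Kronecker product structure, the sketch below combines a between-process correlation matrix with a within-process serial correlation matrix. It uses a plain AR(1) serial correlation as a stand-in, whereas the paper considers a more general partial autocorrelation parameterization; the numerical values, the variable names, and the process-major ordering of the stacked vector are assumptions for illustration.

```python
import numpy as np


def ar1_correlation(n_times: int, rho: float) -> np.ndarray:
    """AR(1) correlation matrix with entries rho ** |s - t|."""
    idx = np.arange(n_times)
    return rho ** np.abs(idx[:, None] - idx[None, :])


# Hypothetical values: two processes with correlation 0.4 at a given time,
# three time points with serial correlation 0.6 between adjacent times.
between = np.array([[1.0, 0.4],
                    [0.4, 1.0]])
within = ar1_correlation(3, 0.6)

# Process-major ordering: (process 1, times 1..3), then (process 2, times 1..3).
kp = np.kron(between, within)
print(kp.shape)
```

        Under this structure the cross-correlation between the two processes at different times is the product of the two component correlations, which is the restriction a score test of the KP structure would examine.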

      • KCI candidate journal

        Bayesian Modeling of Random Effects Covariance Matrix for Generalized Linear Mixed Models

        Lee, Keunbaik. The Korean Statistical Society, 2013. Communications for statistical applications and me, Vol.20 No.3

        Generalized linear mixed models (GLMMs) are frequently used for the analysis of longitudinal categorical data when the subject-specific effects are of interest. In GLMMs, the structure of the random effects covariance matrix is important for the estimation of fixed effects and for explaining subject and time variations. The estimation of the matrix is not simple because of its high dimension and the positive definiteness constraint; in practice, therefore, a simple covariance structure such as AR(1) is used. However, this strong assumption can result in biased estimates of the fixed effects. In this paper, we introduce Bayesian modeling approaches for the random effects covariance matrix using a modified Cholesky decomposition. The modified Cholesky decomposition approach has been used to explain a heterogeneous random effects covariance matrix, and the resulting estimated covariance matrix is positive definite. We analyze metabolic syndrome data from a Korean Genomic Epidemiology Study using these methods.

      • KCI excellent accredited journal

        Modeling of random effects covariance matrix in marginalized random effects models

        Lee, Keunbaik; Kim, Seolhwa. The Korean Data and Information Science Society, 2016. Journal of the Korean Data and Information Science Society (한국데이터정보과학회지), Vol.27 No.3

        Marginalized random effects models (MREMs) are often used to analyze longitudinal categorical data. The models permit direct estimation of marginal mean parameters and specify the serial correlation of longitudinal categorical data via the random effects. However, it is not easy to estimate the random effects covariance matrix in MREMs because the matrix is high-dimensional and must be positive definite. To address these restrictions, we introduce two approaches to modeling the random effects covariance matrix: partial autocorrelation and the modified Cholesky decomposition. The proposed methods are illustrated with real data from the Korean Genomic Epidemiology Study.

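      • KCI excellent accredited journal

        Analysis of tax and public finance panel data using multivariate t linear models

        Donghyun Koo; Keunbaik Lee. The Korean Data and Information Science Society, 2022. Journal of the Korean Data and Information Science Society (한국데이터정보과학회지), Vol.33 No.1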

        Longitudinal data have been collected extensively in many fields. However, most longitudinal studies have analyzed each response separately as univariate longitudinal data, even though the data were multivariate. As a result, the correlations between different responses could not be estimated properly, and the estimates of the mean parameters could be biased. To solve this problem, models for multivariate longitudinal data have been proposed. In this paper, tax panel data are analyzed using two representative models: the multivariate linear model of Lee et al. (2020a) and the multivariate t linear model of Rhee (2020). The tax panel data are multivariate longitudinal data, and the response variables considered are the retrenched amount of tax liability, annual household expenditure, and annual household savings, which are the main indicators of the financial soundness of households based on disposable income. Fitting the multivariate linear model and the t linear model, which account for the correlations among these responses, gave results that partly differ from those in the research literature. This is because multivariate longitudinal models were used to reflect the longitudinal structure of the panel data faithfully and, further, because the t linear model is robust to outliers by assuming a t distribution for the error term.
        From the perspective of income deduction, the model was interpreted with a focus on the association between the retrenched amount of tax liability and the total income deduction amount, the income deduction ratio, and high-income household status. In terms of tax and fiscal policy, it was concluded that a household tax reduction effect could be induced by operating the income deduction system differently according to the household's income size. When the retrenched amount of tax liability was the response variable, the significant covariates were total income deduction amount, income deduction ratio, high-income household status, interaction, and education level. When annual expenditure was the response variable, all covariates except home ownership were significant, and for annual savings all covariates except the income deduction ratio were significant.

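      • KCI excellent accredited journal

        Analysis of labor panel data using multivariate linear models

        Rebecca Suh; Keunbaik Lee. The Korean Data and Information Science Society, 2020. Journal of the Korean Data and Information Science Society (한국데이터정보과학회지), Vol.31 No.4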

        Multivariate longitudinal data are measured in many areas, such as medicine, health science, social science, and environmental research. Because multiple response variables are measured repeatedly over time, the data have complex correlations: correlation within the same response at different time points, correlation between different responses at the same time point, and cross-correlation between different responses at different time points. Because of these complex correlations, modeling the covariance matrix for multivariate longitudinal data is more difficult than for univariate longitudinal data. In this paper, we survey various approaches to modeling the covariance matrix for multivariate longitudinal data and analyze real multivariate longitudinal data, the labor panel data, using the methods of Kim and Zimmerman (2012) and Lee et al. (2020), which are easy to interpret among these approaches.
