        A Comparative Study of the Eigenvectors of Spatial Autocorrelation Statistics

        이상일 (Sang-Il Lee), 조대헌 (Daeheon Cho), 이민파 (Minpa Lee), Korean Geographical Society (대한지리학회), Journal of the Korean Geographical Society (대한지리학회지), 2017, Vol. 52, No. 5

        The main objective of this study is to establish a general account of the variability of eigenvectors by systematically comparing the spatial patterns of eigenvectors extracted from different spatial autocorrelation statistics (Moran's I, Geary's c, and Lee's S*) and different spatial proximity matrices (binary contiguity and row-standardized), and to examine, with real data, what this variability implies for the eigenvector spatial filtering (ESF) approach. A modified form of the correlation matrix graph is used as a visual analytic tool, and two criteria, diagonality and correspondence, are used to evaluate the degree of agreement between two sets of eigenvectors. Two things are observed: (1) applying different spatial proximity matrices to the same spatial autocorrelation statistic yields considerably heterogeneous sets of eigenvectors; (2) pairs of different spatial autocorrelation statistics differ considerably in the degree and pattern of agreement, and the effect of the spatial proximity matrix is also pronounced. To draw out the implications for spatial regression, six different ESF models are fitted to the Puerto Rico agricultural data. Three things are observed: (1) the models select different numbers of eigenvectors, in different combinations of ranks; (2) depending on the eigenvectors entered, the models differ in their ability to remove spatial autocorrelation from the residuals; (3) the magnitude and significance of the regression coefficients vary considerably among the models. Two implications follow. First, the model that best removes spatial autocorrelation from the residuals can, in principle, be regarded as the best. Second, ESF models can also be evaluated by comparing the magnitude and significance of their regression coefficients with those of the non-spatial basic model.
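        The comparison of eigenvector sets described in the abstract can be illustrated with a minimal sketch for the Moran's I case only. Everything here is an assumption for illustration rather than the authors' procedure: the 4 x 4 rook-contiguity grid, the helper names `rook_adjacency` and `moran_eigenvectors`, the symmetrization (V + V')/2 applied to the row-standardized matrix, and the `diagonal_share` summary, which is a hypothetical stand-in and not the paper's diagonality criterion.

```python
import numpy as np

def rook_adjacency(rows, cols):
    """Binary rook-contiguity matrix C for a rows x cols regular grid."""
    n = rows * cols
    C = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:            # neighbour to the right
                C[i, i + 1] = C[i + 1, i] = 1.0
            if r + 1 < rows:            # neighbour below
                C[i, i + cols] = C[i + cols, i] = 1.0
    return C

def moran_eigenvectors(V):
    """Eigenvectors of M V M, sorted by decreasing eigenvalue.

    M = I - 11'/n is the centering projector. V is symmetrized first
    (an assumption) so that a symmetric eigensolver can be used even
    when V is row-standardized.
    """
    n = V.shape[0]
    M = np.eye(n) - np.ones((n, n)) / n
    V_sym = (V + V.T) / 2.0
    lam, E = np.linalg.eigh(M @ V_sym @ M)
    order = np.argsort(lam)[::-1]
    return lam[order], E[:, order]

C = rook_adjacency(4, 4)                    # binary contiguity matrix
W = C / C.sum(axis=1, keepdims=True)        # row-standardized matrix

lam_C, E_C = moran_eigenvectors(C)
lam_W, E_W = moran_eigenvectors(W)

# Absolute cosine similarity between every pair of unit-length eigenvectors.
# For zero-mean eigenvectors this equals the absolute Pearson correlation.
agreement = np.abs(E_C.T @ E_W)

# Hypothetical summary: share of squared similarity lying on the diagonal
# (1.0 would mean the two sets match rank for rank).
diagonal_share = np.trace(agreement ** 2) / agreement.shape[0]
print(round(diagonal_share, 3))
```

        Plotting `agreement` as a heat map gives something in the spirit of a correlation matrix graph: mass concentrated on the diagonal means eigenvectors of the same rank resemble each other across the two proximity matrices.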

      • KCI Accredited Journal

        A Comparative Study of Univariate Spatial Association Statistics (I): Focusing on the Global S Statistic

        이상일 (Sang-Il Lee), 조대헌 (Daeheon Cho), 이민파 (Minpa Lee), Association of Korean Geographers (한국지리학회), Journal of the Association of Korean Geographers (한국지리학회지), 2015, Vol. 4, No. 2

        The main objective of this paper is to elucidate the characteristics of a new spatial association statistic, S, in comparison with Moran's I and Geary's c. A general statistic, S, is defined, and two derivative statistics, S0 and S*, are then proposed, with a strong recommendation to use only the latter. S* is defined as the ratio of the variance of a variable's spatial moving average vector to the variance of the original variable. Strong positive spatial autocorrelation produces a smaller reduction in variance and therefore a higher S*, with a theoretical maximum of 1. Two methods are used to examine the properties of S*: deriving the first four central moments, and extracting eigenvalues and eigenvectors. The former aims to determine the distributional characteristics of the statistics, and the latter to obtain their feasible ranges. Regular tessellations of triangles, squares, and hexagons with three different sample sizes (64, 256, and 1,024) are generated and used for the investigation. The smallest administrative spatial units of the 7 largest cities in South Korea are also used to examine the practical research implications. The major findings are twofold. First, S*, unlike other spatial association statistics, turns out to have a constant feasible range of zero to one regardless of the shape of the spatial units, the contiguity type, and the spatial proximity matrix. This is the most important merit of the statistic and supports its usability. Second, the skewness and kurtosis of S* deviate considerably from those of the normal distribution even with a large sample size, so the normal approximation based on the first two moments may not be valid. This is the most important defect of the statistic and calls for more advanced significance testing procedures.
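        The verbal definition of S* above (variance of the spatial moving average relative to the original variance) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the moving average is taken over each unit and its neighbours with equal weights (a row-standardized matrix with self included), deviations are measured from the original mean, and the 8-unit chain and the function name `s_star` are hypothetical.

```python
import numpy as np

def s_star(x, C):
    """Ratio of the spatial moving average's variation to the original variation.

    x : 1-D array of values, one per spatial unit (must not be constant)
    C : binary contiguity matrix with a zero diagonal
    """
    x = np.asarray(x, dtype=float)
    V = C + np.eye(len(x))                  # each unit joins its own neighbourhood
    W = V / V.sum(axis=1, keepdims=True)    # row-standardize: rows sum to 1
    z = x - x.mean()                        # deviations from the original mean
    sma_dev = W @ z                         # moving average minus the mean
    return float(sma_dev @ sma_dev / (z @ z))

# Hypothetical test bed: 8 units in a line, each touching its immediate neighbours.
n = 8
C = np.zeros((n, n))
idx = np.arange(n - 1)
C[idx, idx + 1] = C[idx + 1, idx] = 1.0

gradient = np.arange(1.0, n + 1)            # smooth trend: strong positive autocorrelation
alternating = np.array([1.0, -1.0] * 4)     # checkerboard-like: strong negative autocorrelation

s_gradient = s_star(gradient, C)            # about 0.845
s_alternating = s_star(alternating, C)      # about 0.083
print(s_gradient, s_alternating)
```

        Smoothing barely changes the gradient, so most of its variance survives and S* is close to 1; smoothing nearly cancels the alternating pattern, so S* is close to 0.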

      • KCI Accredited Journal

        A Comparative Study of Univariate Spatial Association Statistics (II): Focusing on the Local S_i Statistic

        이상일 (Sang-Il Lee), 조대헌 (Daeheon Cho), 이민파 (Minpa Lee), Association of Korean Geographers (한국지리학회), Journal of the Association of Korean Geographers (한국지리학회지), 2016, Vol. 5, No. 3

        The main objective of this paper is to elucidate the characteristics of a new spatial association statistic, S_i, in comparison with local Moran's I_i, local Geary's c_i, and Getis-Ord's G_i. As a new local statistic, S_i measures how much (and in what direction) a local set contributes to the overall variance reduction that results from the smoothing effect when an original variable is represented by its spatial moving average; strong positive spatial autocorrelation in a local set results in a higher S_i, and vice versa. The four local spatial association statistics are compared on the basis of four main concepts (spatial clusters, spatial outliers, spatial regimes, and local stability) and two additional criteria, and the main findings are twofold. First, the statistics fall into two distinct categories: I_i and c_i are more association-centered, and S_i and G_i are more clustering-centered. Second, S_i can be seen as a substitute for G_i in the sense that it satisfies the two conditions for a LISA and its distributional properties are better known. A regular tessellation analysis is conducted to examine the feasible ranges and distributional properties of the local spatial association statistics. The major findings are as follows. First, correlations among the local statistics are much weaker, and their feasible ranges much narrower, than those of their global counterparts. Second, the feasible ranges and distributional properties vary with the number and shape of spatial units, the number of neighboring spatial units, and the specification of the spatial proximity matrix. Third, both skewness and kurtosis are much more pronounced than for the global spatial association statistics, so the normal approximation proves much less reliable for significance testing. Fourth, differences in the spatial configuration of the 7 largest cities in South Korea lead to differences in the feasible ranges and distributional characteristics. This study can be viewed as one of the most comprehensive studies of the pros and cons of the local statistics and is expected to help researchers choose a statistic suitable for their empirical studies.

      • KCI Excellent Accredited Journal

        Spatializing Pearson's Correlation Coefficient

        이상일 (Sang-Il Lee), 조대헌 (Daeheon Cho), 이민파 (Minpa Lee), Korean Geographical Society (대한지리학회), Journal of the Korean Geographical Society (대한지리학회지), 2018, Vol. 53, No. 5

        This study deals with spatializing Pearson's correlation coefficient, the dominant statistical technique for measuring the relationship between two variables. When bivariate spatial autocorrelation is present in the pair of variables under investigation, both the Pearson's correlation coefficient itself and the result of its significance test lose much of their statistical meaning. This study provides a detailed review of three approaches proposed for the problem of spatial autocorrelation in bivariate correlation (the modified t-test, spatially filtered correlation coefficients, and bivariate spatial autocorrelation statistics) and uses a simulation experiment to examine how consistent the results of these three approaches, which have developed rather independently, are with one another. The main findings are twofold. First, with some exceptions, the results of the three approaches are largely consistent with one another: the higher the degree of bivariate spatial autocorrelation as measured by L*, the lower the spatially filtered correlation coefficient, the smaller the effective sample size (degrees of freedom), and the higher the p-value. Second, the results most consistent with L* come from the spatially filtered correlation coefficients based on the eigenvector spatial filtering (ESF) approach; there is an almost perfect negative relationship between L* and these correlation coefficients. The major contribution of this study lies in making clear that Pearson's correlation coefficient is aspatial in nature, and in showing through a simulation experiment that the three approaches proposed to address this problem yield consistent results despite their individual characteristics.
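        The point that Pearson's coefficient is aspatial can be illustrated with a small sketch of a bivariate statistic in the spirit of L*. The formula used here is an assumption, not taken from the abstract: the cross-product of the two variables' spatial moving averages (deviations from their means), scaled by the same denominator as Pearson's r, with the moving average taken over each unit and its neighbours. The 8-unit chain and the names `moving_average_weights`, `pearson_r`, and `l_star` are hypothetical.

```python
import numpy as np

def moving_average_weights(C):
    """Row-standardized weights with each unit included in its own neighbourhood."""
    V = C + np.eye(C.shape[0])
    return V / V.sum(axis=1, keepdims=True)

def pearson_r(x, y):
    """Ordinary Pearson correlation; the locations of the values play no role."""
    zx, zy = x - x.mean(), y - y.mean()
    return float(zx @ zy / np.sqrt((zx @ zx) * (zy @ zy)))

def l_star(x, y, C):
    """Assumed form of a bivariate spatial association measure:
    covariation of the two spatial moving averages over Pearson's denominator."""
    W = moving_average_weights(C)
    zx, zy = x - x.mean(), y - y.mean()
    return float((W @ zx) @ (W @ zy) / np.sqrt((zx @ zx) * (zy @ zy)))

# Hypothetical test bed: 8 units in a line, each touching its immediate neighbours.
n = 8
C = np.zeros((n, n))
idx = np.arange(n - 1)
C[idx, idx + 1] = C[idx + 1, idx] = 1.0

smooth = np.arange(1.0, n + 1)          # spatially smooth pattern
rough = np.array([1.0, -1.0] * 4)       # spatially alternating pattern

# Each pattern is paired with an identical copy of itself.
r_smooth, r_rough = pearson_r(smooth, smooth), pearson_r(rough, rough)
l_smooth, l_rough = l_star(smooth, smooth, C), l_star(rough, rough, C)
print(r_smooth, r_rough, l_smooth, l_rough)
```

        Both pairs have a Pearson correlation of 1, yet the measure that passes through the spatial moving average is high for the smooth pair and close to 0 for the alternating pair, which is the kind of distinction a non-spatial coefficient cannot make.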

