딥러닝 모델의 PM 2.5 예측성능비교 : LSTM, GRU, 양방향 LSTM (서울, 대전, 부산시를 대상으로) = Comparison of PM 2.5 prediction performance of deep learning model : LSTM, GRU, bidirectional LSTM (A case study of Seoul, Daejeon and Busan cities)

Abstract (Korean)

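This study examined changes in PM 2.5 fine dust concentrations in Seoul, Daejeon, and Busan, three representative cities of Korea, according to whether PM 2.5 data from nearby Chinese cities were included. The analysis period ran from 09:00 on May 16, 2014 to 23:00 on December 31, 2021, using data at 1-hour intervals. Missing values caused by monitoring-station errors and similar problems were handled with the MICE (Multivariate Imputation by Chained Equations) algorithm, and a Granger Causality Test was performed on the data to justify the choice of analysis data. The test identified the causal relationships between PM 2.5 changes in Seoul, Daejeon, and Busan and two groups of candidate factors: the PM 2.5 concentrations of five Chinese cities (Beijing, Tianjin, Shanghai, Qingdao, and Shenyang) and each Korean city's meteorological and air pollution variables. The meteorological variables and air pollutants found to be causally related were then used in the prediction models.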
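For readers unfamiliar with the test, the block below gives a standard textbook form of the bivariate Granger causality test. It is a sketch for orientation only and is not necessarily the specification used in the thesis. The symbols are assumptions introduced here: $y_t$ is the hourly PM 2.5 concentration of a Korean city, $x_t$ is a candidate variable (for example PM 2.5 in one Chinese city), $p$ is the lag order, and $T$ is the number of usable observations.

```latex
\begin{aligned}
\text{restricted:}\quad & y_t = \alpha_0 + \sum_{i=1}^{p} \alpha_i\, y_{t-i} + \varepsilon_t \\
\text{unrestricted:}\quad & y_t = \alpha_0 + \sum_{i=1}^{p} \alpha_i\, y_{t-i} + \sum_{j=1}^{p} \beta_j\, x_{t-j} + u_t \\
H_0:\quad & \beta_1 = \dots = \beta_p = 0 \\
F &= \frac{(\mathrm{RSS}_r - \mathrm{RSS}_u)/p}{\mathrm{RSS}_u/(T - 2p - 1)}
\end{aligned}
```

Here $\mathrm{RSS}_r$ and $\mathrm{RSS}_u$ are the residual sums of squares of the restricted and unrestricted regressions. Rejecting $H_0$ means that past values of $x$ improve the prediction of $y$ beyond what past values of $y$ alone provide, which is the sense in which $x$ is said to Granger-cause $y$.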
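The prediction models were the deep learning models LSTM (Long Short-Term Memory), GRU (Gated Recurrent Unit), and Bi-LSTM (Bidirectional LSTM). Based on a one-month (720-hour) data set, each model predicted PM 2.5 at 1, 3, 6, 12, 24, 48, and 72 hours ahead; their performance was compared, and a suitable model was presented for each city and forecast horizon.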
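One common way to turn an hourly series into supervised samples for such models is a sliding window. The sketch below assumes that "based on a one-month (720-hour) data set" means a 720-hour input window paired with the value observed a fixed number of hours after the window ends; the abstract does not spell this out, and the function name `make_windows` is hypothetical.

```python
from typing import List, Sequence, Tuple

WINDOW_HOURS = 720                    # one month of hourly data, as stated in the abstract
HORIZONS = (1, 3, 6, 12, 24, 48, 72)  # forecast lead times in hours, as stated in the abstract


def make_windows(
    series: Sequence[float], window: int, horizon: int
) -> Tuple[List[List[float]], List[float]]:
    """Pair each `window`-long input slice with the value `horizon` steps after its end.

    Assumed framing (not confirmed by the source): one target value per window.
    """
    if window <= 0 or horizon <= 0:
        raise ValueError("window and horizon must be positive")
    inputs: List[List[float]] = []
    targets: List[float] = []
    last_start = len(series) - window - horizon  # negative when the series is too short
    for start in range(last_start + 1):
        end = start + window                     # exclusive end of the input slice
        inputs.append(list(series[start:end]))
        targets.append(series[end + horizon - 1])
    return inputs, targets
```

Calling `make_windows(pm25, WINDOW_HOURS, h)` once for each `h` in `HORIZONS` would give one training set per lead time, which matches the idea of fitting and comparing models separately for each forecast horizon.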
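The analysis showed that the three deep learning models performed similarly well in short-term prediction within 3 hours, with R² values of about 0.8. For prediction within 24 hours, R² was about 0.7, still a high level of performance. In particular, the Bi-LSTM model, which uses both past and future time information for prediction, achieved R² values of about 0.4 to 0.6 even in long-term prediction beyond 24 hours.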
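The performance figures above are R² values. As a reminder of what that metric measures, here is a minimal sketch of the usual coefficient of determination, one minus the ratio of the residual sum of squares to the total sum of squares. The abstract does not state the exact formula used in the thesis, so this standard definition is an assumption, and the function name `r_squared` is hypothetical.

```python
from typing import Sequence


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot (standard definition)."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Squared error of the model's predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Squared error of always predicting the observed mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the observed values are constant")
    return 1.0 - ss_res / ss_tot
```

Under this definition a value of 1 means perfect prediction and a value of 0 means the model does no better than predicting the observed mean, which gives a scale for reading figures such as 0.8 at short horizons and 0.4 to 0.6 at long horizons.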
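The study also compared the performance of models with and without the PM 2.5 values of the five Chinese cities. For short-term prediction, R² was about 0.901 to 0.919 with the Chinese cities included and 0.871 to 0.886 without them; for long-term prediction, R² was about 0.609 to 0.685 with the Chinese cities included and 0.572 to 0.587 without them. Using the PM 2.5 values of nearby Chinese cities is therefore important for predicting PM 2.5 concentrations in Seoul, Daejeon, and Busan. As prediction performance declined in the order of Seoul, Daejeon, and Busan, the magnitude of China's influence was confirmed to follow the same order.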
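Finally, short-term and long-term prediction performance for Seoul, Daejeon, and Busan was examined for the periods when fine dust reduction measures were in force. Short-term prediction during these periods gave R² values of about 0.801 to 0.885, somewhat lower than the full-period values of about 0.901 to 0.916 but still high. Long-term prediction gave an R² value of about [...], which was reported as somewhat lower than the full-period R² of about 0.901 to 0.916 but still high.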
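By identifying, city by city, the meteorological and air pollutant variables that causally affect PM 2.5 concentrations, and by designing different models for each forecast horizon, this study enables the government to build region-specific models and prepare air pollution countermeasures. Specifically, Bi-LSTM was suitable for Seoul at both short and long horizons; for Daejeon, LSTM was suitable for short-term prediction within 6 hours and Bi-LSTM for prediction beyond 6 hours; and for Busan, GRU was suitable for 1-hour prediction and Bi-LSTM thereafter.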
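The city-by-horizon recommendation stated above can be written as a small lookup. This is only an illustrative encoding of the findings as reported in the abstract; the function name `suggested_model` is hypothetical, and treating "within 6 hours" as including the 6-hour horizon is an assumption.

```python
def suggested_model(city: str, horizon_hours: int) -> str:
    """Return the model the abstract reports as most suitable for a city and lead time."""
    if horizon_hours <= 0:
        raise ValueError("horizon_hours must be positive")
    if city == "Seoul":
        # Bi-LSTM reported as suitable at both short and long horizons.
        return "Bi-LSTM"
    if city == "Daejeon":
        # Assumption: "within 6 hours" includes the 6-hour horizon itself.
        return "LSTM" if horizon_hours <= 6 else "Bi-LSTM"
    if city == "Busan":
        # GRU reported for the 1-hour horizon only, Bi-LSTM afterwards.
        return "GRU" if horizon_hours == 1 else "Bi-LSTM"
    raise ValueError(f"no recommendation reported for city: {city!r}")
```

For example, `suggested_model("Daejeon", 12)` returns `"Bi-LSTM"`, reflecting the statement that Bi-LSTM was suitable for Daejeon beyond 6 hours.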
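The academic and policy contribution of this study is as follows. It showed that the factors causally affecting PM 2.5 concentrations in Seoul, Daejeon, and Busan, representative cities of Korea, are the PM 2.5 concentrations of five major Chinese cities together with each city's meteorological and air pollution variables, and it confirmed through prediction performance that the causal relationship is strongest within 3 hours. It then applied the causally related data to the LSTM, GRU, and Bi-LSTM deep learning models. Consistent with previous studies, the LSTM and GRU models showed high performance at short horizons in predicting Korea's PM 2.5, while the Bi-LSTM model showed high performance at both short and long horizons. The study further confirmed that using PM 2.5 changes in the nearby Chinese cities that influence Korea is important for improving PM 2.5 prediction performance, and that the deep learning models remained highly accurate during the periods when high-concentration fine dust reduction measures were in force. If different models are designed for each city and forecast horizon in Korea, the government will be able to prepare air pollution countermeasures preemptively on the basis of high regional prediction performance.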

Multilingual Abstract

In this study, changes in the concentration of PM 2.5 in Korea's representative cities, Seoul, Daejeon, and Busan, were observed depending on whether the PM 2.5 data of nearby Chinese cities were included. The data analysis period was from 9:00 on May 16, 2014 to 23:00 on December 31, 2021, based on data at 1-hour intervals. Missing data caused by measurement station errors were processed with the MICE (Multivariate Imputation by Chained Equations) algorithm, and the Granger Causality Test was performed to determine the causal relationships among the data and thereby justify the use of the analyzed data. As causal factors affecting changes in PM 2.5 in Seoul, Daejeon, and Busan, the test identified causal relationships with PM 2.5 in five Chinese cities (Beijing, Tianjin, Shanghai, Qingdao, and Shenyang) and with the meteorological and air pollution variables of each Korean city. The meteorological variables and air pollutants found to be in a causal relationship were then used in the prediction model.
The prediction models were the deep learning models LSTM (Long Short-Term Memory), GRU (Gated Recurrent Unit), and Bi-LSTM (Bidirectional LSTM). Based on a data set of 1 month (720 hours), predictions were made for 1, 3, 6, 12, 24, 48, and 72 hours ahead, and by comparing performance, a suitable model was presented for each city and forecast horizon.
As a result of the analysis, the three deep learning models showed similarly high performance, with an R² value of about 0.8, in short-term prediction within 3 hours. In prediction within 24 hours, the R² value was about 0.7, which is still high prediction performance. In particular, the Bi-LSTM model, which uses both past and future time information for model prediction, showed an R² value of about 0.4 to 0.6 even in long-term prediction beyond 24 hours.
In addition, the performance of models with and without the PM 2.5 values of the five Chinese cities was compared. The R² value of short-term prediction was about 0.901 to 0.919 with the Chinese cities included and 0.871 to 0.886 without them, and the R² value of long-term prediction was about 0.609 to 0.685 with the Chinese cities included and 0.572 to 0.587 without them. In predicting PM 2.5 in Seoul, Daejeon, and Busan, it is therefore important to use the PM 2.5 values of nearby Chinese cities. As prediction performance declined in the order of Seoul, Daejeon, and Busan, it was confirmed that the magnitude of China's influence follows the same order.
Lastly, the prediction performance for Seoul, Daejeon, and Busan during the periods when fine dust reduction measures were implemented was examined in both the short and long term. Short-term prediction during the reduction-measure periods showed an R² value of about 0.801 to 0.885, which was slightly lower than the R² value of about 0.901 to 0.916 for the entire period but still showed high predictive performance. Long-term prediction showed an R² value of about [...], which was reported as slightly lower than the R² value of about 0.901 to 0.916 for the entire period but still showed high predictive performance.
Through this study, the causal meteorological and air pollutant variables that affect PM 2.5 changes in each city were identified, and different models were designed for each forecast period, so that the government can build differentiated models for each region and establish air pollution countermeasures. That is, for Seoul, the Bi-LSTM model was suitable for both the short and long term; for Daejeon, the LSTM model was suitable for short-term prediction within 6 hours and the Bi-LSTM model for prediction after 6 hours; and for Busan, the GRU model was suitable for 1-hour prediction and the Bi-LSTM model thereafter.
The academic and policy contributions of this study are as follows. It derived that the causal factors affecting changes in PM 2.5 in Seoul, Daejeon, and Busan, which are representative cities in Korea, are changes in PM 2.5 in five major Chinese cities and the meteorological and air pollution variables of each city, and it also confirmed through predictive performance that the causal relationship is strongest within 3 hours. The causally related data were then applied to the LSTM, GRU, and Bi-LSTM deep learning models. As in previous studies, in Korea's PM 2.5 prediction the LSTM and GRU models showed high performance in the short term, and the Bi-LSTM model showed high performance in both the short and long term. Moreover, it was confirmed that, to improve the prediction performance of PM 2.5 in Korea, it is important to use changes in PM 2.5 in the neighboring Chinese cities that affect it, and that the deep learning models still showed high accuracy during the periods when high-concentration fine dust reduction measures were implemented. If different models are designed for each forecast horizon in each city in Korea, the government will be able to prepare air pollution measures preemptively on the basis of high forecasting performance for each region.

Table of Contents

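Chapter 1 Introduction 1
Section 1 Background and Purpose of the Study 1
Section 2 Review of Previous Studies 6
1. Handling Missing Data 6
2. Selection of Analysis Data 8
3. Design of a Suitable Prediction Model 10
Chapter 2 Scope of the Study and Review of Methodology 16
Section 1 Scope of the Study and Points of Focus of This Study 16
1. Scope of the Study 16
2. Points of Focus of This Study 21
Section 2 Review of Research Methodology 24
1. Handling Missing Data (MICE Methodology) 24
2. Unit Root Test 27
3. Granger Causality Test (Causality Analysis) 29
4. LSTM, GRU, and Bi-LSTM Deep Learning Models 31
1) LSTM (Long Short-Term Memory) Model 33
2) GRU (Gated Recurrent Unit) Model 35
3) Bi-LSTM (Bidirectional LSTM) Model 37
4) Models for Comparing Prediction Performance 38
Chapter 3 Model Analysis and Results by City 39
Section 1 Seoul: Analysis and Results 39
1. Data Preprocessing 39
2. Granger Causality Test 42
3. LSTM, GRU, and Bi-LSTM Model Analysis 46
Section 2 Daejeon: Analysis and Results 55
1. Data Preprocessing 55
2. Granger Causality Test 57
3. LSTM, GRU, and Bi-LSTM Model Analysis 61
Section 3 Busan: Analysis and Results 67
1. Data Preprocessing 67
2. Granger Causality Test 69
3. LSTM, GRU, and Bi-LSTM Model Analysis 73
Chapter 4 Conclusion 80
References 83
English Abstract 98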
