RISS Search - Domestic Journal Articles

1
Development of a Spatial-Information-Based Extreme Climate Event Analysis Tool (EEAT) Using Climate Change Scenarios

한국진, 이명진. Korean Society of Remote Sensing, 2020. Korean Journal of Remote Sensing (大韓遠隔探査學會誌), Vol.36 No.3
Climate change scenarios underpin research on responding to climate change and consist of large-scale spatio-temporal data. A single scenario amounts to about 83 gigabytes or more, and its semi-structured format constrains searching, extraction, storage, and analysis. This study developed a spatial-information-based tool for analyzing extreme climate events to make large-scale, multi-period climate change scenarios easier to use. The tool was then applied to the RCP8.5 climate change scenario in a pilot analysis of when and where heavy-rain thresholds observed in the past could recur in the future. The analysis found that days with a 3-day cumulative rainfall of 587.6 mm or more would occur about 76 times in the 2080s, and that the heavy rain would be localized. The tool implements the entire workflow, from initial setup to deriving results, on a single platform, and it outputs results in several formats without commercial software: web document (HTML), image (PNG), climate change scenario (ESR), and statistics (XLS). The tool is therefore expected to support future climate projections and vulnerability assessments, and to be used in developing analysis tools for the climate change scenarios that accompany future climate change reports.
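The threshold test described in this abstract (days whose 3-day cumulative rainfall reaches 587.6 mm) can be illustrated with a minimal sketch. This is not the EEAT implementation: the function name `count_threshold_days` and the sample values are hypothetical, and the sketch assumes a trailing 3-day window over one grid cell's daily series.

```python
from typing import Sequence


def count_threshold_days(daily_mm: Sequence[float],
                         threshold_mm: float,
                         window: int = 3) -> int:
    """Count days whose trailing `window`-day rainfall total is at least the threshold."""
    count = 0
    running = 0.0
    for i, value in enumerate(daily_mm):
        running += value
        if i >= window:
            # Drop the day that just left the trailing window.
            running -= daily_mm[i - window]
        # Only evaluate once a full window is available.
        if i >= window - 1 and running >= threshold_mm:
            count += 1
    return count


# Hypothetical daily rainfall (mm) for one grid cell; not data from the paper.
sample = [0.0, 210.0, 250.0, 180.0, 5.0, 0.0, 300.0, 290.0, 10.0]
exceed_days = count_threshold_days(sample, threshold_mm=587.6)
print(exceed_days)
```

Applying the same count to every grid cell and decade of a scenario would give the kind of "when and where" summary the abstract reports, though how EEAT defines its windows and periods is not stated in the record.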
2
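Usability Evaluation of Low-Cost Sensors for Measuring Fine Particulate Matter (PM2.5)

한국진 (Kuk-Jin Han). Korean Institute of Information Scientists and Engineers (KIISE), 2018. KIISE Conference Proceedings, Vol.2018 No.12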
3
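A Study on Big Data Collection Methods in the Environmental Field: Focusing on Air Quality Data

한국진 (Kj Han), 강성원, 김도연, 김영인. Korea Environment Institute, 2017. Korea Environment Institute Basic Research Report, Vol.2017 No.-
This study identifies big data that can be used in environmental research, based on an understanding of big data as the foundation of the intelligent information society, and proposes a procedure and draft framework for collecting and storing environmental big data as a collection method for data-driven research innovation. Using big data, which stands at the center of future society and the research paradigm, in environmental research requires a thorough understanding and active application of it. Measures for identifying and responding to big data in the environmental field must also be prepared. As a case study, the study analyzed the air quality big data and services of the Korea Environment Corporation (Airkorea); through this analysis it reviewed the collection-storage procedure and presented a draft framework for the collection method. The main contents can be summarized in three points.
○ Understanding big data: Big data follows a procedure of collection, storage, analysis, (visualization), and prediction, but it is understood and interpreted in many different ways across society, so environmental big data likewise calls for its own approach and understanding. Big data has so far been driven by the government and has achieved quantitative growth, and in Korea the Public Data Portal provides an institutional mechanism for finding a data provider even when the data itself is not available. In the collection-storage stage, the first step of data processing, however, accessibility and usability matter more than growth, so the study reviewed collection methods from the user's perspective, taking researchers' difficulties and needs into account.
○ Big data in the environmental field: It is no exaggeration to say that environmental big data covers data from every field. The study therefore assigned user-centered data priorities and presented a case. A review of application rankings on the Public Data Portal and an online survey on data use among Korea Environment Institute researchers identified meteorological and climate data and air quality data. Among these, the study analyzed the air quality data and data services of the Korea Environment Corporation, which offer high usability and datasets of a uniform scale.
○ Proceduralizing the collection method: The study judged that presenting researchers only with a collection method for one specific big data source would be no different from existing approaches. It therefore used the derived collection method to establish a collection-storage procedure and presented it as a draft framework. The framework can be applied to research using other environmental big data and is easy to apply to computing platforms. It also names concrete software and other computing environments that can be used with it, to ease the transition to, or adoption of, a data-driven research system.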
4
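A Plan for Establishing a Quality Management System for Environmental Policy Research Data

한국진 (Kukjin Han), 진대용 (Daeyong Jin), 조윤랑 (Yoonrang Cho). Korean Institute of Information Scientists and Engineers (KIISE), 2022. KIISE Conference Proceedings, Vol.2022 No.12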
5
Correlation Analysis between Livestock Mortality Caused by Heat Waves and News Big Data

박종철, 한국진, 채여라. Association of Korean Geographers, 2019. Journal of the Association of Korean Geographers, Vol.8 No.3
Social big data can be a source of information for the early detection of disasters and opens new possibilities for understanding their spatial distribution. This first requires understanding the relationship between information collected from news big data and actual events. This study aims to improve that understanding by comparing the results of news big data analysis with livestock mortality caused by heat waves. In the temperature range where livestock mortality increased, news about livestock damage more than doubled compared with other periods. The peak in the number of news items, however, came about six days after the peak in livestock mortality. In the temperature range where livestock mortality increased, the main keyword in the news was ‘mortality’. Before mid-July the main keywords were ‘response’ and ‘prevention’; from mid-July to mid-August the main keyword was ‘mortality’; and after mid-August it was ‘price’. Although social issues can raise the frequency of particular keywords, the keyword ‘mortality’ generally appeared in the temperature ranges and periods in which actual mortality was concentrated.
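The roughly six-day offset between the mortality peak and the news peak is the kind of quantity that can be read off two aligned daily series. The sketch below is only an illustration under stated assumptions: `peak_lag_days` is a hypothetical helper, the two series are invented, and the paper's actual method for locating the peaks is not described in this record.

```python
from typing import Sequence


def peak_lag_days(reference: Sequence[float], other: Sequence[float]) -> int:
    """Return the number of days from the peak of `reference` to the peak of `other`.

    A positive value means `other` peaks later. Both series must cover the
    same days; on ties the earliest peak day is used.
    """
    if not reference or len(reference) != len(other):
        raise ValueError("series must be non-empty and of equal length")
    ref_peak = max(range(len(reference)), key=reference.__getitem__)
    other_peak = max(range(len(other)), key=other.__getitem__)
    return other_peak - ref_peak


# Hypothetical daily counts over the same ten days; not data from the paper.
mortality = [1, 3, 9, 4, 2, 1, 1, 1, 1, 1]
news = [0, 1, 2, 3, 4, 5, 6, 7, 12, 5]
lag = peak_lag_days(mortality, news)
print(lag)
```

A single peak-to-peak offset is sensitive to noise; a cross-correlation over a range of lags would be a more robust alternative if the series are long enough.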
6
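A Study on Better Reflecting Demand for Everyday-Life Environmental Issues: Focusing on Big Data Analysis of Civil Complaints

진대용, 강성원, 한국진, 김진형, 김도연, 강선아. Korea Environment Institute, 2019. Occasional Research Report, Vol.2019 No.-
This study examines how to better reflect demand for everyday-life environmental issues through big data analysis. As public awareness of environmental problems has grown, issues such as fine dust, waste and litter, noise, and odor have come to the fore. Yet a gap exists between the environmental problems citizens actually want solved and the response of environmental policy. This study therefore defines every environmental problem raised in civil complaints, which are closely tied to citizens' daily lives, as an ‘everyday-life environmental issue’ and proposes ways to reflect the demand for them.
First, the overall environmental issues appearing in complaints were analyzed using the Ministry of Environment's similar complaints (public complaints disclosed on the 국민신문고 portal). LDA topic modeling yielded seven topics: ‘living environment’, ‘construction and livestock waste’, ‘environmental impact assessment’, ‘hazardous chemicals’, ‘air pollutants and emission facilities’, ‘wastewater’, and ‘medical and industrial waste’. Overall, complaints on ‘living environment’ issues, which include noise, litter, and fine dust, showed a relative upward trend. Within ‘living environment’, until 2015 most complaints asked for solutions to noise problems such as construction noise, inter-floor noise, traffic noise, and factory noise; from 2016, with the emergence of the fine dust issue, fine dust had the highest frequency. Fine dust complaints in particular often voiced concern for children's health and demanded countermeasures. Complaints under ‘construction and livestock waste’ and ‘medical and industrial waste’ mostly concerned treatment, separate collection, and recycling, and pointed to a need to improve perceptions of products such as ‘recycled aggregate’ made from high-value construction waste. Under ‘environmental impact assessment’, demand for ‘small-scale environmental impact assessment’ rose sharply in 2018. Under ‘wastewater’, complaints about wastewater (discharge facilities) and water quality appeared steadily, and content about ‘groundwater’ affected by livestock wastewater was increasing. Complaints under ‘hazardous chemicals’ largely concerned installation inspections, safety inspections, business permits, handling facilities, and reporting requirements, and those under ‘air pollutants and emission facilities’ concerned air emission facilities, permissible emission standards, prevention facilities, self-measurement, and whether odor emissions were permitted or applicable.
In Sejong Special Self-Governing City, complaints about ‘noise’ and ‘odor’ were frequent. Because it is a new city, many complaints appear to stem from noise and dust from residential and commercial facilities. Policy support is therefore needed to trace noise sources and respond promptly, and to install soundproof walls against roadside noise. Measures against odor are also needed, since odor from fertilizer, garbage, and livestock barns occurs frequently. More constructive measures also appear necessary for the litter problems that often arise within housing complexes, apartments, shopping areas, and especially bus stops, and for issues around installing electric vehicle charging stations and paying subsidies.
Because the public is the final consumer of environmental policy, it is important to identify through multiple channels the environmental issues they want solved. Among environmental texts, complaints reflect views of environmental problems that are closely tied to citizens' actual lives and can therefore provide a sound basis for policymaking. Most citizens are currently most sensitive to fine dust among the many environmental problems. Actual complaints, however, show a high share of demands to resolve other problems such as construction noise, litter, and odor, which calls for an active response. Fine dust cannot be solved in the short term and requires international cooperation alongside domestic action. Noise, litter, and odor, by contrast, could be reduced through regulation, compensation, and stronger enforcement following sufficient discussion.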
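The shift from noise to fine dust as the most frequent complaint keyword can be illustrated with plain per-year keyword counting. This is a simplified stand-in, not the LDA topic modeling used in the report: the function `top_keyword_by_year`, the keyword list, and the sample complaint texts are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Sequence, Tuple


def top_keyword_by_year(records: Iterable[Tuple[int, str]],
                        keywords: Sequence[str]) -> Dict[int, str]:
    """Return, for each year, the keyword that occurs most often in that year's texts."""
    counts: Dict[int, Counter] = defaultdict(Counter)
    for year, text in records:
        for keyword in keywords:
            # str.count counts non-overlapping substring occurrences.
            counts[year][keyword] += text.count(keyword)
    # most_common(1) breaks ties by keyword order in `keywords`.
    return {year: counter.most_common(1)[0][0]
            for year, counter in sorted(counts.items())}


# Hypothetical complaint snippets; not data from the report.
complaints = [
    (2015, "construction noise at night"),
    (2015, "noise between floors and traffic noise"),
    (2015, "fine dust near the school"),
    (2017, "fine dust worries for children"),
    (2017, "fine dust measures requested"),
    (2017, "construction noise"),
]
top = top_keyword_by_year(complaints, ["noise", "fine dust"])
print(top)
```

For Korean complaint text, substring counting would normally be replaced by morphological tokenization before counting or topic modeling.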
7
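Building and Operating an Environmental Data Hub for Proactive Response to Social and Environmental Issues

진대용, 표종철, 한국진, 김도연, 조윤랑. Korea Environment Institute, 2021. Project Report, Vol.2021 No.-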
Ⅰ. Background and Aims of Research
1. Necessity and Purpose of the Research
□ A ‘data dam’, a key element of the great social and economic transformation, needs to be built
  ○ A data hub is required for data collection and utilization
    - Public and private data are key drivers of future industry
    - The data dam must create new value through data maps, data linkage, analysis services, and the like
    ※ Data dam: collecting data, standardizing it, and sharing it again
  ○ Data is difficult to use in responding to large-scale social and environmental issues
    - Large-scale social and environmental issues have occurred, such as COVID-19, fine dust, and humidifier disinfectants
    - Environment-related data needed to respond to these issues is scattered, making it difficult to collect and use
□ Present a mid- to long-term roadmap for building a data hub to respond to social and environmental issues
  ○ Prepare a plan to build a data hub for the digital transformation of environmental policy research
    - Derive the essential elements of an environmental data hub by reviewing major implementation cases
    - Build a repository-centered pilot data hub based on the Institutional Data Repository (IDR)
  ○ Present a mid- to long-term roadmap for building an efficient data hub
    - Discover data to respond to various social and environmental issues and to support data-based decision-making
    - Present a mid- to long-term roadmap that accounts for scattered data and the use of various data analysis platforms
2. Research Scope and Methods
□ Carry out a (pilot) implementation, then present a mid- to long-term roadmap for future improvement
  ○ Derive essential data hub functions by reviewing data hub implementation cases
    - Case analysis of major functions: data and analysis services, data maps, and improved user accessibility
  ○ Define the functions of a data hub for responding to social and environmental issues
    - Accumulate cases of data-based social and environmental issue analysis and review the strengths and limitations of data analysis
  ○ Propose a mid- to long-term roadmap for future expansion after the pilot implementation of the environmental data hub
    - Pilot the environmental data hub around the IDR system, then propose the mid- to long-term roadmap
Ⅱ. Strategies to Build an Environmental Data Hub
1. Overview of Building a Data Hub
□ Data hubs applicable to the environmental field need to be reviewed
  ○ Data analysis platforms and data hubs are weak relative to the underlying data base
    - UK: supports data-based solutions to social problems and the research use of administrative data analysis
    - Singapore: operates a government-wide platform for national-level issue analysis
    - U.S.: builds and uses smart city data hubs based on cyber-physical systems (CPS)
    - Korea: the Ministry of Environment has built a collection-storage data base, but linkage and use are limited
2. Key Data Hub Examples
□ Public Data Portal
  ○ Korea's largest data hub, established and operated under the Public Data Act
    - Holds about 40,000 file datasets, about 7,000 open datasets, and about 10,000 standard datasets
  ○ Provides the National Data Map, a data map from various perspectives
  ○ Provides visualization services such as visualization, public participation maps, and location information visualization
□ National Statistics Portal
  ○ Korea's largest statistical data hub, providing domestic and international statistics under the Statistics Act
  ○ Provides data maps from various perspectives and visualizations such as e-Local Indicators
  ○ Provides specialized services such as the integrated microdata service
□ Hyean (혜안) portal, the common big data platform
  ○ Government-wide big data analysis platform service
  ○ Provides SNS text mining analysis and visualization; generally slow
  ○ Provides a registration and management system for shared-use data
□ Environmental Information Convergence Big Data Platform (Environmental Data Portal)
  ○ Portal specializing in collecting and storing environmental data
  ○ Provides four data analysis platform services, which are slow and inconvenient
  ○ Next-generation upgrade planned from 2022 onward
□ Environmental Business Big Data Platform
  ○ Data distribution platform for the environmental field
  ○ Provides various text mining visualization results and environmental data visualization examples
  ○ 17 public and private organizations participate in total
□ Research data repository
  ○ A system for sharing research data
    - Research data is a core component of Open Science
      ㆍ NASA provides satellite data
      ㆍ CERN provides experimental data from the Large Hadron Collider
      ㆍ Genomic data sharing services in the bio field
      ㆍ Nature, Springer, and Elsevier in publishing
  ○ The concept of open science, which opens and shares research results and processes, has emerged
    - OECD: presents 13 principles, including openness, effectiveness, and sustainability
    - ISC: presents 14 recommendations to promote universal and equal access to public data
    - U.S.: federal agencies manage and collect digital data at the national level, national research institutes lead data management and sharing policies, and infrastructure and data sharing programs are operated
    - Europe: built OpenAIRE, a Europe-wide network alongside national repositories; manages the research results of funded projects and manages publications and literature
  ○ Policies and institutions for systematic national-level management and sharing of research data
    - U.S.: guidelines established for the management and shared use of research data from R&D at federal agencies, such as NSF and NIH, that spend more than 100 million dollars in federal funds
    - UK, Australia: policies established for research data management and use
  ○ Overseas research data platforms are operated in Europe, the U.S., the UK, Japan, Australia, and elsewhere
3. Key Features of a Data Hub
□ Data map
  ○ Used to make effective use of vast amounts of data
  ○ Offers multiple perspectives: by classification, region, keyword, and field
  ○ The environmental field needs a multi-perspective classification system that follows the order of keyword access
□ Data standardization
  ○ Means processing data so that anyone can use it easily
  ○ International standardization is pursued with the vertical and horizontal interoperability of big data in mind
  ○ Domestic standardization is applied only to some elements for big data processing
□ Big data analysis and utilization system
  ○ A system for checking, analyzing, and visualizing data in connection with the data map
  ○ Functions similar to a data analysis platform service
□ Support for public data and data-based administration work
  ○ Responses to data-related laws and the associated plans and evaluations have recently increased
  ○ DMP-research data registration makes it possible to discover data, grasp its status, and document performance
  ○ However, the environmental data hub must be linked with the intranet information system
Ⅲ. Analysis of COVID-19 Issues Centered on the Environmental Data Hub
1. Data Status Review
□ Environmental statistics are highly reliable but take a long time to produce and have temporal and spatial limitations
□ Credit card data provides consumption big data on card use by region and industry, for analyzing social and environmental issues such as COVID-19 and fine dust
  ○ BC Card consumption data related to COVID-19 was secured and analyzed through the Data Voucher program in ’20~’21
□ Text data such as SNS posts and press releases can be collected and used to derive and analyze social and environmental issues
  ○ Text mining analysis derived the environmental issues* that emerged after the COVID-19 outbreak
    * Environmental issues: 1) increase in garbage (waste, etc.), 2) decrease in air pollution (air quality), 3) increase in energy use (electricity, gas, etc.)
2. Analysis of Environmental Issues Arising in (Near) Real Time due to COVID-19
□ Combining card data with environmental data to analyze the environmental issues that emerged with COVID-19 makes it possible to develop timely policies for issues that arise in (near) real time
  ○ Analysis of environmental issues that may emerge (increased waste, decreased air pollution, increased energy use) through card-data-based analysis of changes in consumption patterns
  ○ Results: as confirmed COVID-19 cases increase, both the amount and the number of delivery app transactions increase, while both the amount and the number of public transportation and fuel transactions decrease; district heating appears positively correlated, but this is judged to reflect the seasonal rise in district heating use in winter
3. Analysis Before and After the COVID-19 Social Distancing Policy
□ The effect of government intervention was analyzed through changes in confirmed COVID-19 cases and card use before and after social distancing policies
  ○ Data for the 4 weeks (1 month) before and after each social distancing period were compared
    - Four periods according to social distancing stage (’20.3.22~’20.4.19, ’20.8.30~’20.9.13, ’20.9.14~’20.10.11, ’20.12.8~’20.12.28)
  ○ Differences before and after the policy were checked by analyzing changes in the mean of the variables used in the formula for the change in confirmed cases
  ○ The trends before and after the policy were tested, and comparison based on the verified trends confirmed a trend change in all four periods
4. Additional Requirements for the Environmental Data Hub
□ Detect social and environmental issues and provide status analysis
  ○ Data collection from literature, the press, press releases, portals, and similar sources needs to be automated
  ○ Analysis of associated and related issues, and procedures for it, are needed for early detection of social and environmental issues
□ Secure data for analyzing social and environmental issues and build a basis for sharing
  ○ Functions are needed to provide public and private data efficiently
  ○ Review the scope of data for analyzing social and environmental issues, and build up cases of data provision and analysis
□ Review the nature and scope of the data
  ○ Use data with regard to circumstances such as data reliability and the speed of issue response
  ○ Review data for common usability and use it as shared-use data
  ○ Select research data with regard to data accessibility and sustainability
□ Review how analysis tools can be used to analyze social and environmental issues
  ○ Not all research data is used as analysis data
  ○ Analysis tools and use cases for social and environmental issue analysis need to be identified
□ Establish a data-based policy decision support system that can yield policy implications
  ○ Because big data is analyzed through simplification that carries implicit meaning, additional decision-making steps such as expert interpretation and policy formulation are essential
  ○ A data-based policy decision support system is essential
Ⅳ. Implementation of a Pilot Environmental Data Hub
1. Essentials of Building an Environmental Data Hub
□ Data set
  ○ Measures are needed to secure high-quality data
    - Survey demand for data usable in environmental policy
    - Automate data collection for each collection path
    - Build data networks, for example by joining the Ministry of Environment's data working group
    - Apply to calls for data set construction projects and data support projects
    - Improve researcher access and promote work efficiency
□ Data repository
  ○ A way is needed to combine convenient operation and management of meta information with maintaining integrity
    - Data submission, update, and search functions and a metadata management function are required
    - Use DMP, permission management, and linkage with external data and data analysis platforms
□ Data analysis platform
  ○ A way to build data pipelines for data analysis is needed
    - Data loading, preprocessing, analysis, verification, and visualization must all be possible
    - Consider the convenience of using code, such as programming languages and libraries
    - Data linkage with the data repository and flexible storage of analysis results
    - User convenience of major AI and data analysis modules such as numerical prediction and text/image analysis
2. Building an Environmental Data Hub
□ Preliminary considerations
  ○ Research data collections
    - Provide efficient lookup and search results: whether data is original, its source, its location, etc.
    - Reflect shared-use data and project year in the top-level collections
      ㆍ Shared-use data: climate change, green transition, atmospheric environment, water management, land environment, resource circulation, environmental health, environmental impact assessment, indicator statistics, and other (external), 10 in total
      ㆍ Each project-year collection contains collections by project type, with project-name collections beneath them
      ※ Collection: a cabinet holding research data and its metadata
    - Research data classification system
  ○ Data citation
    - Create a virtuous cycle of data use through efficient research
      ㆍ Recognize the contribution of earlier researchers
      ㆍ Later researchers can reproduce and use the research process and results
      ㆍ Reuse of research results helps spread research outcomes
      ㆍ Enhance trust in and transparency of research results among researchers
    - Display citation text in four formats: KEI format, MLA, APA, and ISO 690
    - Provide a DOI publishing function
  ○ Data map
    - Efficient data search
      ㆍ Usable even by users without clear knowledge of the data they want
      ※ Integrated data map: provides access by classification, region, keyword, and field
      ※ Public Data Portal: provides a treemap together with search, which helps in gauging the share of each data category
  ○ Data management procedure
    - Systematic collection and storage of research data through data construction and management
      ㆍ Data construction: classify data through checking and review, and assign metadata for data standardization
      ㆍ Data management: divide data by priority into important and general data, and perform quality management, disclosure decisions, data supplementation, and life cycle management
      ㆍ DMP-research data synchronization and stage-by-stage life cycle management across planning, execution, and completion are required
  ○ Building the framework
    - The KEI-IDR system serves as the research data repository and uses DMP-research data
    - The research DB uses the intranet system and links research information
    - The big data analysis platform uses the KEI big data analysis platform pilot service
    - External hubs are linked according to purpose: data, analysis, infrastructure, etc.
- External data is linked according to the purpose of public data portal, national statistics portal, AI data hub, Big Kinds, etc. ○ Pilot build - Pilot implementation of an environmental data hub based on preliminary reviews, data management procedures, and ㆍ Build dynamic data capabilities to collect automatically updated data ㆍ Establishment of data sharing function among users and retention period function for data protection ㆍ Build external academic DB search function, data map, and external data function ㆍ Replace with physical storage NAS ○ External data utilization - Separation of data collection for common use: data frequently used for research, data with universal classification criteria ㆍ Data can be used remotely through OpenAPI, WebDAV, FTP, etc. - Data portal and data analysis platform ㆍ Use of environmental big data analysis platform pilot service, environmental data science conversion research service and personal analysis environment ㆍ When the use of data is more important, it is advantageous to use an external data analysis platform ㆍ MLOps: Used by organizations moving their analytics environment online ○ Environmental data hub upgrade plan - Improvement of DMP management function: copy template, change order, export to Excel, etc. - Improvement of personal storage function: upload/download, sharing, use of OpenAPI, interworking with programming code, etc. 3. Roadmap for expanding the environmental data hub □ Presenting a roadmap for the KEI-type environmental data hub ○ Presenting a KEI-type environmental data hub roadmap (simplification) in consideration of constraints - Constraints ㆍ Impossible to build an environmental data hub considering the characteristics of all research data. 
ㆍ Not practical to apply the general information system construction methodology ㆍ Consider changes in task execution period, budget, manpower, and social/environment ㆍ Step by step expansion of consumers such as researchers, policy makers, demanding companies and the general public - Proposals ㆍ Establishment of environmental data hub construction plan: Implemented for 8 months from the time the latest update of the 2021 standard IDR is completed ㆍ Establishment of environmental data hub infrastructure: Considering the linkage between the KEI-IDR system and other systems such as external analysis platform services and external data portals, and reflecting the flexible classification system ㆍ Environmental data hub upgrade: reflect external service changes, reflect results after demand survey, expand data map ○ Roadmap (simplification) Presenting a roadmap for expanding the environmental data hub in consideration of constraints - Data construction ㆍ Stage 1 (2020~2021): Research data registration and internal public pilot operation, environmental data platform status identification and analysis, and external data interlocking function establishment ㆍ Stage 2 (2022~2024): Expand research data registration projects to all government subsidy projects, prepare procedures for external disclosure of research data, and build AI data based on the results of environmental expert demand surveys ㆍ Stage 3 (from 2025): Expand research data registration target projects to consignment projects, expand research data disclosure target - Construction of data repository ㆍ Stage 1 (2020~2021): Introduction of standard IDR and establishment of KEI-IDR, interworking of intranet information system, establishment of basic data statistics, data map and external data search function ㆍ Stage 2 (2022~2024): stabilization of KEI-IDR, expansion of data linkage and utilization functions ㆍ Stage 3 (from 2025): Completion of data storage construction, advancement of data archiving service - 
Introduction of data analysis platform ㆍ Stage 1 (2020~2021): No phase 1 due to the use of the existing analysis platform service, server, and personal analysis environment ㆍ Stage 2 (2022~2024): Function improvement to directly connect research data in the analysis environment and establishment of an expert-oriented dashboard ㆍ Stage 3 (from 2025): Provide data convergence use cases and upgrade dashboard - Success conditions: Operation of a dedicated organization > Securing a budget and improving the system ㆍ Data policy improvement: information security policy improvement to enable safe and flexible access ㆍ Dedicated organization: Establishment of a dedicated organization in accordance with data-related laws, self-supply of data scientists and technicians (using professional training, etc.), and strengthening collaboration between departments and dedicated organizations by environmental media ㆍ Budget Securing: Possible to adjust (negotiate) to a level that is enforceable by KEI, however, the budget must be continuously guaranteed Ⅴ. Conclusion 1. 
Conclusion □ Improvement of researcher awareness and establishment of a collaborative ecosystem ○ Practical measures are needed to identify, analyze, and make policy decisions on various social and environmental issues, and it is necessary to prepare a system to respond in advance - Data-based response cases are increasing due to the continuous occurrence of social and environmental issues - Convergence of environmental statistics and social statistics, weakening the boundaries of environmental policy research ○ Policy reflection through flexible data utilization for rapid data production - Reflects the situation in which all physical elements such as people and objects are connected and interacted - Changes in perspective on data: timely results and determination of the importance of data trust - Constraints in environmental policy research: There is very little data available for timely issue analysis ○ Support for shortening the periodicity of statistical construction and screening data as a substitute - Review of the scope and limitations of various data in analyzing social and environmental issues - Although the amount of medical waste has increased significantly, there are no official statistics on the amount of waste in 2021 □ Establishment of a pilot environment data hub and foundation for environmental data utilization - Derivation of essential elements of building an environmental data hub: data set, data storage, data analysis platform - KEI-type mid- to long-term environmental data hub roadmap presented □ Suggestion of requirements for environmental data hub for social/environmental issue analysis - Necessary to secure data for analysis of social and environmental issues, to establish a foundation for data sharing, and to establish an analysis tool - Necessary to establish a data-based policy decision support system that can draw policy implication
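The collection hierarchy described in this abstract (top-level collections for shared data categories and for the year of task execution, then task type, then task name) can be sketched as a small tree. This is only an illustrative model: the class `Collection`, the helpers `new_hub` and `register_task`, and the sample task are hypothetical names and are not part of the KEI-IDR system.

```python
from dataclasses import dataclass, field

# Shared-data categories named in the abstract, used here as top-level collections.
SHARED_CATEGORIES = [
    "climate change", "green transition", "atmospheric environment",
    "water management", "land environment", "resource circulation",
    "environmental health", "environmental impact assessment",
    "index statistics", "other (external)",
]


@dataclass
class Collection:
    """A 'cabinet' for research data and its metadata (hypothetical model)."""
    name: str
    children: dict = field(default_factory=dict)

    def child(self, name: str) -> "Collection":
        # Create the sub-collection on first use, then reuse it.
        if name not in self.children:
            self.children[name] = Collection(name)
        return self.children[name]


def new_hub() -> Collection:
    """Build a root collection pre-populated with the shared-data categories."""
    root = Collection("hub")
    for category in SHARED_CATEGORIES:
        root.child(category)
    return root


def register_task(root: Collection, year: int, task_type: str, task_name: str) -> str:
    """Place a task under year -> task type -> task name and return its path."""
    root.child(str(year)).child(task_type).child(task_name)
    return "/".join([str(year), task_type, task_name])


hub = new_hub()
print(register_task(hub, 2021, "basic research", "example task"))
```

Keeping the year and the shared categories side by side at the top level mirrors the abstract's statement that both are reflected in the top-level collection.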
8
Building an AI-Based Environmental Monitoring System to Implement the Environmental Digital New Deal

진대용, 표종철, 김도연, 조윤랑, 한국진 Korea Environment Institute 2021 Basic Research Report Vol.2021 No.-
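Ⅰ. Introduction □ Need for the research ㅇ The use of AI technology in the environmental (policy) sector can serve as a leading bridge between the Green New Deal and the Digital New Deal, but it is not yet fulfilling that role sufficiently ㅇ A strategy is needed for using environmental data systematically and comprehensively, centered on AI technology ㅇ Building an ‘AI-based environmental monitoring system’ first requires case studies such as environmental change detection, natural disaster analysis, and analysis of pollution occurrence patterns by environmental medium, from which the necessary elements can be derived and the process designed □ Research objective ㅇ Build major case studies for AI-based automatic environmental monitoring and response through the combined use of AI and XAI, and on that basis present a strategy for building an ‘AI-based environmental monitoring system’ Ⅱ. Literature Review □ Expanding scope of AI use in environmental policy research ㅇ The limitations of existing decision-making methodologies can be improved with AI models composed of many parameters ㅇ The use of AI methodologies is also expanding in environmental research - Data in various forms, such as numerical values, images, and video, can be used as variables, enabling prediction, classification, detection, change detection, and influence analysis - AI achieves high accuracy, but its complex model structure results in low explainability □ With the emergence of explainable AI (XAI: eXplainable AI), factors with a large influence can be identified at the same time as prediction, expanding the potential for use as quantitative evidence in decision making ㅇ XAI research is becoming more active as a way to secure the transparency and reliability of AI algorithms with a black-box structure - Starting with XAI, the explainable AI project announced in 2017 by the U.S. Defense Advanced Research Projects Agency (DARPA), research on explainable AI technology is now fully under way ㅇ XAI analysis is being applied not only to environmental pollution problems such as air, water, and soil pollution but also to various other environmental fields such as ecosystems - Among XAI methods, LIME (Local Interpretable Model-agnostic Explanations), SHAP (SHapley Additive exPlanations), and Grad-CAM (Gradient-weighted Class Activation Mapping) are mainly used □ Data can now be collected through various applications and devices such as IoT, drones, and unmanned vehicles, so environmental big data is accumulating and research applying AI is expanding ㅇ Image and video data generated in the environmental sector relate to various fields such as climate and environmental pollution (air, water quality, soil, noise, etc.) - Research on AI-based prediction, classification, and interpolation of missing data is being actively conducted - Beyond prediction, XAI is used to present the factors with a large influence on predictions, expanding the potential for use as quantitative evidence in decision making Ⅲ. AI-Based Mountain Land Change Detection 1. Overview of research on AI-based mountain land change detection □ Responses to mountain land change, such as status surveys, identification of suspected sites, and follow-up measures, are carried out using GIS and remote sensing, but early detection of change is needed so that affected areas can be addressed early and damage reduced □ This study therefore proposes the feasibility of detecting mountain land change with deep learning 2. Forest maps in Korea and overseas □ Availability of forest maps in Korea and overseas ㅇ National Geographic Information Platform (National Geographic Information Institute), Forest Space Portal Service (Korea Forest Service), AI Hub aerial imagery of forest tree species (National Information Society Agency), etc. ㅇ UCI Machine Learning Repository (U.S.), Skyscape dataset (German Aerospace Center), Semantic Change detection dataset (Wuhan University, China), etc. 3. 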
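AI-based mountain land change detection: input data and model composition □ AI model input data ㅇ Aerial images of forest tree species from the AI Hub national land environment data are used ㅇ Each aerial image is subdivided into 128×128 tiles, giving 16 images per original image, and the RGB values are normalized ㅇ For the label data, binary annotation is performed to distinguish only forest from non-forest, and aerial images containing labels marked as unreadable are excluded ㅇ A total of 16,000 training images and 1,600 validation images from the capital area are used as AI model input data ㅇ To test mountain land change detection performance, a multi-period test image dataset of the same area is assembled from Kakao Map □ Structure of the AI model ㅇ The U-Net deep learning architecture, which is specialized for image segmentation, is applied ㅇ The layer composition and hyperparameters of a pretrained U-Net architecture are fine-tuned to train mountain land change detection 4. Results and application of AI model mountain land change detection □ In training and validation, the U-Net model distinguished forest from non-forest areas well and showed patterns similar to the actual labeled areas □ When multi-period Kakao Map images of the same area were applied to the trained U-Net model, changes in mountain land were distinguished well, confirming the applicability of deep learning models to mountain land change detection Ⅳ. AI-Based Correlation Analysis of Climate/Air Pollution and COVID-19 1. Overview of research on AI-based correlation analysis of climate/air pollution and COVID-19 □ There is no evidence that climate change directly affects the spread of COVID-19, but the discussion is ongoing □ A correlation analysis of climate and air pollution with COVID-19 is conducted for Seoul in 2020, and the feasibility of building an AI model that simulates the relationship between climate and air pollution factors and confirmed COVID-19 cases is reviewed 2. Literature review on the correlation between climate/air pollution and COVID-19 □ An analysis of recent domestic and international studies shows that results differ by country, and it is difficult to conclude that climate and air pollution variables directly affect COVID-19 ㅇ Research on climate and air pollution effects has been active since the COVID-19 outbreak - Infectious diseases such as MERS, SARS, and COVID-19 show seasonal patterns, and their predictability has been examined using temperature and humidity data - In Europe, nitrogen dioxide (NO2) was estimated to be an important factor in COVID-19 deaths, and in India aerosol optical depth (AOD) fell to its lowest level in 20 years because of COVID-19 3. Correlation analysis of climate/air pollution and COVID-19 and results □ A pilot case study of the correlation between climate/air pollution and COVID-19, focused on Seoul in 2020 ㅇ A training dataset is built by collecting COVID-19 confirmed case and death counts together with climate and air pollution data ㅇ Spearman and Kendall correlation analyses are performed by period (interval) to exclude seasonal factors - Over the whole period, the temperature variable shows a high correlation with the number of confirmed COVID-19 cases - However, the sign and magnitude of the temperature correlation coefficient change substantially from one COVID-19 period to another, indicating a consistency problem in the results ㅇ The analysis revealed limitations, and policy and social activity variables need to be added in future analyses - Input variables directly related to estimating the number of confirmed cases (policy, population mobility, etc.) need to be added - The analysis covers only a single year, 2020, and the period needs to be extended as data accumulate Ⅴ. AI-Based Inundation Trace Detection 1. 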
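Overview of research on AI-based inundation trace detection □ Research is conducted on building an AI-based urban inundation trace detection system using open data □ GIS-based preprocessing of spatial data, Python-based construction of AI model input data from the preprocessed data, training of inundation trace detection with a machine learning model, and estimation of which input variables are important for inundation detection □ Production of a flood susceptibility map, identification and analysis of key factors, and prediction and analysis of future flood-susceptible areas using climate change scenario data 2. AI-based inundation trace detection: input data and model composition □ AI model input data ㅇ Hydrological analysis maps, topographic analysis maps, climate change scenario data, and GIS data obtained from the Environment Big Data Platform, the Open MET Data Portal, and the Environmental Space Information Service are used ㅇ The spatial data are unified to a common extent covering the capital area, rasterized, and stacked to form the input data ㅇ To train the random forest model, 150 points within the inundation extent are used as training data and 50 points as validation data □ Structure of the AI model ㅇ A random forest model, a representative machine learning model based on ensemble learning, is built and trained, and its inundation trace detection performance in the capital area is evaluated ㅇ The variable importance of the random forest model is estimated to analyze the sensitivity of the detection results to the input data 3. AI model inundation trace detection performance and validation □ Performance evaluation of the random forest model ㅇ The inundation trace extent learned by the random forest is similar to the measured inundation trace extent ㅇ A flood susceptibility map produced by applying the trained model to the entire capital area shows high susceptibility along the Han River waterfront 4. Inundation trace prediction using a climate change scenario □ Predicting change in inundation traces by applying the RCP 8.5 scenario ㅇ The future RCP scenario is applied to the trained random forest model to examine how the inundation trace extent in the capital area changes with precipitation ㅇ The approach is expected to be useful for AI-based prediction of urban inundation damage under climate change scenarios Ⅵ. AI-Based Particulate Matter (PM) Occurrence Pattern Analysis: Focusing on High-Concentration Cases 1. Overview of research on AI-based PM occurrence pattern analysis □ Need for research on AI-based analysis of high-concentration PM occurrence patterns ㅇ PM concentrations in Korea are declining overall thanks to the establishment and active implementation of related policies ㅇ However, high-concentration PM episodes continue to occur and some still last a long time; public anxiety about PM has not been resolved, and related policies are increasing as environmental awareness and interest grow ㅇ An AI model for PM occurrence pattern analysis is built and ways to use it are presented 2. AI-based PM occurrence pattern analysis: input data and model composition □ AI model input data ㅇ Air quality data, weather and climate data, and external-factor data (air quality in China) from Air Korea, the Open MET Data Portal, and other sources are used ㅇ The study covers the Chungnam region for 2017-2019, with the data restructured around the air quality monitoring network □ AI model composition ㅇ An XGBoost model, a representative boosting-based machine learning model, is built and trained to construct the PM estimation model 3. 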
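Review of the performance and applicability of the AI-based high-concentration PM occurrence pattern analysis model □ PM estimation performance test ㅇ Comparing the model's estimates with measured values on the test data confirmed that the model tracks the trend in most cases ㅇ However, estimates were poor for some high-concentration PM cases, which can likely be improved by adding training data and selecting additional related variables □ PM occurrence pattern analysis results ㅇ Applying PDP and SHAP to the model confirmed that the basis for the model's PM concentration estimates can be derived ㅇ Key factors in PM occurrence patterns are identified, and case-by-case analyses of the contribution of input variables to the model output are presented □ Review of the applicability of the AI-based high-concentration PM occurrence pattern model ㅇ An AI model that estimates PM2.5 can be built using air pollutants, weather and climate factors, and air quality data from China ㅇ SHAP values depend on the output of the AI model and are therefore bound to the characteristics of that model, which is a limitation ㅇ The output is closer to a systematic description of correlation, obtained through pattern analysis of input and output variables, and does not guarantee causation ㅇ Nevertheless, the influence of each input variable on the PM2.5 estimate can be presented at the level of individual samples ㅇ In the future, the consistency of the contributions to PM concentration estimates should be reviewed with domain experts so that the model can be improved into a reliable quantitative evaluation model Ⅶ. Conclusions and Policy Suggestions (Academic Outcomes) □ Case studies of AI-based environmental research for the environmental Digital New Deal ㅇ Centered on AI technology, the study presents environmental use cases: environmental change detection (mountain land change detection), natural disaster analysis (inundation detection and prediction), infectious disease analysis (correlation analysis of climate and air factors with COVID-19), and environmental pollution analysis by medium (PM occurrence pattern analysis) ㅇ Various data such as numerical values, images, and geographic information can be used as input variables, and depending on the research purpose they can be used to estimate and predict variables of interest, analyze (image) changes, and analyze variable influence ㅇ XAI models are used to present the factors with a large influence on the model output, offering a way to use them as quantitative evidence for decision making □ Essential elements and uses for building an AI-based monitoring system ㅇ By applying AI in practice across several environmental fields, the essential elements and the basic model building and analysis process for an AI-based monitoring system are established ㅇ The essential elements form a sequence: data construction (data collection or production) ⇒ AI model building ⇒ AI model-based analysis and monitoring ⇒ deriving results and securing policy evidence, through which an AI-based monitoring system can be built ㅇ Real-time or periodic automatic data collection is essential for an environmental monitoring system that can be used continuously ㅇ A virtuous cycle is needed in which, after the AI model is built, its outputs are put to use and the model is updated for aspects not previously considered ㅇ Once consistency with expert knowledge is secured in model building and result interpretation, the system is expected to act as a monitoring system by producing results continuously (automatically) and providing scientific policy evidence when responses to environmental issues are drawn up □ Proposed follow-up tasks ㅇ Precise and practical analysis requires high-resolution spatiotemporal data, and because results and their range of use depend on data quality, the study proposes reviewing the areas where data need to be built and conducting research on producing high-resolution data fit for purpose ㅇ Research is needed that builds AI and XAI models for pollution by medium, natural disaster analysis, and so on, and then reviews the results with experts for consistency and compares them with physical modeling and simulation results so that they can be reflected appropriately Ⅰ. 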
Introduction □ Research background ㅇ The use of AI technology in the environmental (policy) sector can serve as a leading bridge between the Green New Deal and the Digital New Deal, but it is not yet fulfilling that role sufficiently ㅇ Strategies are needed to use environmental data systematically and comprehensively, with a focus on AI technology ㅇ Building an ‘AI-based environmental monitoring system’ first requires case studies such as environmental change detection, natural disaster analysis, and analysis of pollution occurrence patterns by media type, from which the necessary elements can be derived and the processes designed □ Research objective ㅇ To develop major case studies for automatic AI-based environmental monitoring and response through the combined use of AI and XAI, and on that basis to present strategies for building an ‘AI-based environmental monitoring system’ Ⅱ. Literature Review □ Expanding the application scope of AI studies in environmental policy research ㅇ Limitations of existing decision-making methodologies can be addressed with AI models composed of many parameters ㅇ The use of AI methodologies is also expanding in environmental research - Various forms of data such as numbers, images, and videos can be used as variables, allowing prediction, classification, detection, change detection, and impact analysis - AI achieves high accuracy, but its complicated model structure results in low explanatory power □ With the emergence of explainable AI (XAI), high-impact factors can be identified at the same time as predictions are made, which expands their potential use as quantitative data for decision making ㅇ XAI studies are being actively conducted to ensure the transparency and reliability of AI algorithms with a black-box structure - Starting with the explainable AI project XAI announced by the Defense Advanced Research Projects Agency (DARPA) in the U.S. 
in 2017, research on explainable AI technology has been developing in earnest ㅇ XAI analysis is applied not only to environmental pollution problems such as air pollution, water pollution, and soil pollution but also to various other environmental fields such as ecosystems - Commonly used XAI methods include Local Interpretable Model-agnostic Explanations (LIME), SHapley Additive exPlanations (SHAP), and Gradient-weighted Class Activation Mapping (Grad-CAM) □ Data can now be collected through various applications and devices such as IoT, drones, and unmanned vehicles, so environmental big data is accumulating and studies applying AI are expanding ㅇ Image and video data created in the environmental sector relate to various fields such as climate and environmental pollution (air, water quality, soil, noise, etc.) - Studies are actively conducted on AI-based prediction, classification, and interpolation of missing values - In addition to prediction, XAI is used to present the factors with a large impact on predictions, which expands their potential use as quantitative data for decision making Ⅲ. AI-based Mountain Land Change Detection 1. Overview of research on AI-based mountain land change detection □ Responses to mountain land change, such as status surveys, identification of suspected sites, and follow-up measures, are carried out using GIS and remote sensing technology, but early detection of mountain land changes is needed to enable early response in affected areas and to reduce damage □ Therefore, this study proposes the feasibility of mountain land change detection using deep learning technology 2. Forest maps in Korea and overseas □ Supply of forest maps in Korea and overseas ㅇ National Geographic Information Platform (National Geographic Information Institute), Forest Space Portal Service (Korea Forest Service), AI Hub aerial photographs of forest tree species (National Information Society Agency), etc. 
ㅇ UCI Machine Learning Repository (U.S.), Skyscape dataset (German Aerospace Center), Semantic Change detection dataset (Wuhan University in China), etc. 3. AI-based mountain land change detection input data and model composition □ AI model input data ㅇ Aerial photographs of forest tree species from the AI Hub national land environment data are used ㅇ Each aerial image is subdivided into 128 x 128 tiles, giving 16 images per original image, and the RGB values of the aerial images are normalized ㅇ For labeling data, binary annotation is performed to distinguish only forest from non-forest, and aerial photographs that include labels marked as unreadable are excluded ㅇ A total of 16,000 training images and 1,600 validation images from the capital area are used as AI model input data ㅇ A multi-period test image dataset of the same area is assembled from Kakao Map to test the performance of mountain land change detection □ Structure of the AI model ㅇ The U-Net deep learning model structure, which is specialized for image segmentation, is applied ㅇ The layer composition and hyperparameters of a pretrained U-Net architecture are fine-tuned to train mountain land change detection 4. Results and application of AI model mountain land change detection □ In training and validation, the U-Net model distinguished forest from non-forest areas well and showed patterns similar to the actual labeled areas □ When multi-period Kakao Map images of the same area were applied to the trained U-Net model, mountain land changes were distinguished well, which confirmed the applicability of deep learning models to mountain land change detection Ⅳ. AI-based Correlation Analysis of Climate/Air Pollution and COVID-19 1. 
Overview of research on AI-based correlation analysis of climate/air pollution and COVID-19 □ There is no evidence that climate change has a direct impact on the spread of COVID-19, but the discussion is ongoing □ A correlation analysis of climate/air pollution and COVID-19 was conducted for Seoul in 2020, and the feasibility of building an AI model simulating the relationship between climate/air pollution factors and confirmed COVID-19 cases was reviewed 2. Literature review on correlation between climate/air pollution and COVID-19 □ An analysis of recent research in Korea and overseas shows that results vary among countries, and it is difficult to conclude that climate and air pollution variables have a direct impact on COVID-19 ㅇ Studies on the impact of climate and air pollution have been actively conducted since the COVID-19 outbreak - Infectious diseases such as MERS, SARS, and COVID-19 show a seasonal pattern, and their predictability has been examined using temperature and humidity data - In Europe, nitrogen dioxide (NO2) was estimated to be an important factor in deaths from COVID-19, and in India aerosol optical depth (AOD) fell to its lowest level in 20 years due to COVID-19 3. 
Correlation analysis of climate/air pollution and COVID-19 and results □ A pilot study was conducted on the correlation between climate/air pollution and COVID-19, focusing on Seoul in 2020 ㅇ Training datasets are built by collecting COVID-19 confirmed cases and deaths together with climate and air pollution data ㅇ Spearman and Kendall correlation analyses were conducted for each period (interval) to exclude seasonal factors - Over the whole period, temperature was highly correlated with the number of confirmed cases of COVID-19 - However, the sign and magnitude of the temperature correlation coefficient changed substantially from period to period, revealing a consistency problem in the results ㅇ The results showed the limitations of the analysis and the need to add policy and social activity variables in future analyses - Directly related input variables (policy, population mobility, etc.) that can help estimate the number of confirmed COVID-19 cases need to be added - The analysis covers only a single year, 2020, and the period needs to be extended as data accumulate Ⅴ. AI-based Inundation Trace Detection 1. Overview of research on AI-based inundation trace detection □ Research is conducted on building an AI-based urban inundation trace detection system using open data □ Preprocessing GIS-based spatial data, building AI model input data from the preprocessed data in Python, training inundation trace detection by building machine learning and deep learning models, and estimating which of the input data are key factors in inundation detection □ Developing a flood susceptibility map, identifying and analyzing key factors, and predicting and analyzing future flood-susceptible areas by applying climate change scenario data 2. 
AI-based inundation trace detection input data and model composition □ AI model input data ㅇ Hydrological analysis maps, topographic analysis maps, climate change scenario data, and GIS data obtained from the Environment Big Data Platform, Open MET Data Portal, and Environmental Space Information Service are used ㅇ The spatial data obtained are unified to a common spatial extent covering the capital area, rasterized, and stacked to form the input data ㅇ For random forest model training, 150 points within the 2010 inundation extent are used as training data, and 50 points as validation data □ Structure of the AI model ㅇ A random forest model, a typical machine learning model based on ensemble learning, is built and trained, and its inundation trace detection performance in the capital area is evaluated ㅇ The variable importance of the random forest model was estimated to analyze the sensitivity of the inundation trace detection results to the input data 3. AI model inundation trace detection performance and validation □ Performance evaluation of inundation trace detection using the random forest model ㅇ The inundation trace extent learned by the random forest was similar to the inundation trace extent measured in 2010 ㅇ A flood susceptibility map produced by applying the trained model to the entire capital area showed high flood susceptibility along the Han River waterfront 4. Inundation trace prediction through climate change scenario □ Predicting inundation trace change by applying the RCP 8.5 scenario ㅇ The future RCP scenario is applied to the trained random forest model to examine how the inundation trace extent in the capital area changes with precipitation ㅇ The approach is expected to be used for AI-based urban inundation damage prediction under climate change scenarios Ⅵ. AI-based Particulate Matter (PM) Occurrence Pattern Analysis: Focusing on High Concentration Cases 1. 
Overview of research on AI-based PM occurrence pattern analysis □ Need for research on AI-based analysis of high concentration PM occurrence patterns ㅇ PM concentrations in Korea are decreasing overall with the establishment and active implementation of related policies ㅇ However, high concentration PM episodes continue to occur and some still last a long time; public anxiety over PM has not yet been resolved, and related policies are increasing as environmental awareness and interest grow ㅇ An AI model for PM occurrence pattern analysis is built and ways to apply it are presented 2. AI-based PM occurrence pattern analysis input data and model composition □ AI model input data ㅇ Air quality and weather/climate data from Air Korea and the Open MET Data Portal are used, as well as data on external factors (air quality in China) ㅇ The study covers Chungnam for 2017-2019, with data restructured based on the air quality monitoring network □ Structure of the AI model ㅇ An XGBoost model, a typical machine learning model using the boosting technique, is built and trained to construct the PM estimation model 3. 
Review of performance and applicability of the AI-based high concentration PM occurrence pattern analysis model □ PM estimation performance test ㅇ Comparing the model's estimates with measured values on the test data showed that the model tracks the trend in most cases ㅇ However, some cases of high concentration PM were not estimated well, which can likely be improved by increasing the training data and selecting additional related variables □ PM occurrence pattern analysis results ㅇ Applying PDP and SHAP to the model confirmed that the grounds for the model's PM concentration estimates can be derived ㅇ Key factors of PM occurrence patterns are identified, and case-by-case analyses of the contribution of input variables to the model output are provided □ Review of the applicability of the AI-based high concentration PM occurrence pattern model ㅇ An AI model estimating PM2.5 can be built using air pollutants, weather/climate factors, and China’s air quality data ㅇ SHAP values are limited in that they depend on the output values of the AI model and are therefore tied to the characteristics of that model ㅇ The output is closer to a systematic description of correlation, obtained through pattern analysis of input and output variables, and does not guarantee causal relations ㅇ Nonetheless, the influence of each input variable on the PM2.5 estimate can be presented at the sample level ㅇ Through future discussion with experts, the consistency of the estimated contributions to PM concentrations should be reviewed so that the model can be improved into a highly reliable quantitative evaluation model Ⅶ. 
Conclusions and Policy Suggestions (Academic Outcomes) □ Case studies of AI-based environmental research for the environmental Digital New Deal ㅇ This study presented use cases in the environmental sector with a focus on AI technology: environmental change detection (mountain land change detection), natural disaster analysis (inundation detection and prediction), infectious disease analysis (correlation analysis of climate/air factors and COVID-19), and environmental pollution analysis by media type (PM occurrence pattern analysis) ㅇ Various kinds of data such as numbers, images, and geographical information can be used as input variables, and depending on the research purpose they can be applied to estimating and predicting variables of interest, analyzing (image) changes, and analyzing variable impact ㅇ The study presents a way to use XAI models to identify the factors with a large impact on the model output and to use them as quantitative data for decision making □ Essential elements and application plan to build an AI-based monitoring system ㅇ The essential elements and the basic model building and analysis process for an AI-based monitoring system are established through actual AI applications in several environmental fields ㅇ The essential elements of the AI-based monitoring system are building data (collecting or producing data) ⇒ building an AI model ⇒ analyzing and monitoring based on the AI model ⇒ deriving outcomes and securing policy grounds ㅇ Automatic real-time or regular data collection is essential for building an environmental monitoring system that can be used continuously ㅇ It is necessary to build a virtuous cycle in which, after an AI model is built, its outputs are put to use and the model is updated for aspects not previously considered ㅇ Once consistency with expert knowledge is secured in building the model and interpreting the results, the system is expected to fulfill its monitoring role by deriving results continuously (automatically) and providing scientific 
policy grounds when establishing measures to respond to environmental issues □ Suggestion of follow-up tasks ㅇ Precise and highly practical analysis requires high-resolution temporal and spatial data, and because results and their scope of application depend on data quality, this study suggests reviewing the fields in which data need to be built and conducting research on producing high-resolution data fit for purpose ㅇ Research is needed that builds AI and XAI models for pollution by media type, natural disaster analysis, and similar topics, reviews the derived results with experts for consistency, and compares them with physical modeling and simulation results so that they can be reflected reasonably
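The period-by-period Spearman analysis summarized in Section Ⅳ can be illustrated with a minimal standard-library sketch. It shows how a variable can correlate strongly with case counts over a whole period while the sign of the coefficient flips between sub-periods. The function names and the toy numbers are invented for illustration; they are not the report's data or code.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns NaN when either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def spearman_by_period(x, y, periods):
    """Compute Spearman's rho separately for each period label."""
    groups = {}
    for xi, yi, label in zip(x, y, periods):
        xs, ys = groups.setdefault(label, ([], []))
        xs.append(xi)
        ys.append(yi)
    return {label: spearman(xs, ys) for label, (xs, ys) in groups.items()}


# Toy data (invented): temperature, confirmed cases, and a period label.
temperature = [1, 2, 3, 7, 8, 9]
cases = [3, 2, 1, 7, 8, 9]
period = ["A", "A", "A", "B", "B", "B"]

print("whole period:", round(spearman(temperature, cases), 3))
print("by period:", spearman_by_period(temperature, cases, period))
```

With these toy values the whole-period coefficient is positive while period A is negative and period B is positive, which is the kind of inconsistency the abstract reports for the temperature variable.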
9
A Study on the Impact of the Sustainable Development Goals (SDGs) from an Information and Communication Technology (ICT) Perspective

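김영인 (YoungIn Kim), 한국진 (KukJin Han), 윤정호 (JeongHo Yoon), 전형진 (HyungJin Jeon) Korean Institute of Information Scientists and Engineers 2018 Proceedings of the Korean Institute of Information Scientists and Engineers Conference Vol.2018 No.6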
10
A Study on Strategies for Building a Data Science-Based Platform to Support Climate Change Response (Ⅱ)

진대용, 표종철, 조윤랑, 한국진, 김도연 Korea Environment Institute 2021 Climate and Environment Policy Research Vol.2021 No.-
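Ⅰ. Introduction □ Need for and purpose of the research ○ The intensification of climate change, including abnormal weather and natural disasters worldwide, affects not only the natural environment but also many areas of human activity ○ Korea has recently declared a 2050 carbon neutrality (net-zero) target in cooperation with the international community and is actively responding to climate change ○ Climate change research can be divided into mitigation research, which reduces greenhouse gases, and adaptation research, which minimizes damage and risk, but because the causes of climate change are complex, mutually complementary policies are needed ○ In 2018 the Ministry of Science and ICT established the 'Research Data Sharing and Utilization Strategy' to manage and share the research data accumulated through national R&D, and the era of data-intensive science has begun in earnest - Advances in equipment such as hardware and high-performance networks generate large volumes of research data, so managing diverse research data has become essential to effective research ○ This study responds to climate change by linking it around data, converting climate change research into 'Data Science' - Data science is the general term for the process of understanding real phenomena and deriving useful knowledge from data in various forms - As the volume of data generated from information and communication technology (ICT), satellites, meteorological reanalysis, and other sources grows, securing the relevant data so that it can be linked and used becomes important ○ Data for climate change response are scattered across many institutions and there is no standard classification system for environmental data, which constrains their use; ways to use them efficiently and conveniently are urgently needed □ Research scope ○ The study analyzes the current state of climate and environmental data, builds and operates an implementation framework for the data management plan (DMP), and prepares a strategy for building a climate and environmental data platform along with measures for a differentiated data provision service - A data inventory for the mitigation (emission reduction) side of climate change response is to be compiled by surveying satellite-centered applied climate and environmental data and data on climate change mitigation - To build a practical implementation framework for managing climate and environmental data, the scope of research data is to be defined, the DMP introduced, and a KEI-type data management framework centered on a research data repository (IDR) established - Based on the resulting inventory and management framework, ways of providing climate and environmental data services usable in environmental policy research are to be explored ㆍ In this project, climate and environmental data are limited to public data related to mitigation and adaptation for climate change response ㆍ The aim is to prevent accumulated data from being used in only a single research project and to maximize data sharing and usability - A survey for building the climate and environmental data platform is conducted, laws and institutions on data sharing and use are reviewed, and realistic measures for policy research centered on climate change data are presented □ Research content and methodology ○ The second-year study builds and supplements data, focusing on greenhouse gas reduction data from major environmental agencies, and upgrades the existing inventory - The possibility of expanding the scope of climate change response data is examined by surveying satellite data products and the weather and climate data of the Korea Meteorological Administration ○ The definition of research data and the essential elements of its management are derived in order to prepare and build a KEI research data management framework ○ A roadmap for building the KEI climate and environmental data platform is prepared on the basis of expert consultation and the research data management framework surveyed ○ To allow later expansion into a KEI-type data platform, the metadata of the collected climate change response data are uploaded on a pilot basis to the research data repository system, and a pilot data mind map service is used to improve policy usability Ⅱ. 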
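Upgrading the Climate and Environmental Data Inventory □ Status of applied climate and environmental data in Korea ○ Climate change strongly affects not only precipitation, cloud cover, and temperature but also vegetation and land distribution, and responding to it first requires that primary data be adequately secured ○ Overseas, led by the U.S. National Aeronautics and Space Administration (NASA) and the European Space Agency (ESA), satellite data are used to observe a wide range of phenomena, including air-polluting gases, climate-change-inducing gases, aerosols, and changes in vegetation indices ○ Korea has also launched geostationary multi-purpose satellites as successors to the Communication, Ocean and Meteorological Satellite (COMS) and uses data produced from satellite observation as baseline data for climate change response policy ○ Representative Korean satellites include the Cheollian ocean observation satellite, Cheollian 2A, and Cheollian 2B - The Cheollian ocean observation satellite is used for red tides, sea ice, sea fog, monitoring of ocean dumping, sea sand extraction activity, fine dust, and more ㆍ It produces 13 types of data in total, with main products including dissolved organic matter, chlorophyll, total suspended matter, red tide index, and land vegetation index - Cheollian 2A supports a wider range of observations than the ocean observation satellite and enables monitoring of and preparation for meteorological disasters ㆍ It produces 52 meteorological products in total: 23 basic products such as cloud detection, ozone amount, and rainfall intensity, and 29 additional products such as wildfire detection, vegetation index, vegetation fraction, and surface reflectance - Cheollian 2B observes the marine environment and ecosystems and monitors air pollutants from outside the Korean Peninsula, providing data for climate change response and fine dust monitoring ㆍ It produces 26 types of data in total, with main products including atmospheric correction, inherent optical properties, atmospheric products, ocean color products, ocean products, and land products □ Status of data related to climate change response ○ Climate change response must consider both mitigation policy, which reduces or absorbs greenhouse gases, and adaptation policy, which reduces climate change damage ○ The study aims to link mitigation and adaptation policy by surveying the status of mitigation data such as energy, power generation, and greenhouse gas emissions ○ Data on climate change mitigation (greenhouse gas reduction) fall broadly into energy statistics, the national greenhouse gas inventory, and other data that can be linked and used - The national energy statistics information system provides energy balance and national energy supply and demand statistics, along with linked and integrated statistics from related institutions under the energy statistics regulations - The national greenhouse gas inventory provides data for identifying domestic greenhouse gas sources and sinks and the amounts emitted and absorbed - Other data include transport and electric power data, provided to support emission estimation by the private sector, government, and academia and to link with the greenhouse gas inventory ○ For climate change adaptation, the climate and environmental data inventory is built on data held in systems operated by the Korea Adaptation Center for Climate Change (KACCC) - The integrated sectoral climate change impact and vulnerability assessment model (MOTIVE) and the climate change vulnerability assessment tool (VESTAP), both representative adaptation systems, provide vulnerability assessment data for adaptation ○ Weather observation data and disaster-prevention weather information observed and provided by the Korea Meteorological Administration serve as baseline data in fields such as projecting future climate change and formulating response policy - Climate change scenarios can be used as analysis data in research on assessing the impacts of future climate change and minimizing damage, and they are essential information for establishing and supporting climate change response and adaptation measures Ⅲ. 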
기후환경 데이터 관리 실행체계 구축 □ KEI 연구데이터 관리 개요 ○ 2019년 데이터관리계획(DMP: Data Management Plans) 규정이 시행되며 국내 연구 데이터를 공유하고 활용하기 위한 노력이 활발히 이루어지고 있음 - 주요 선진국을 중심으로 국가연구개발사업 과제의 연구데이터 보존 및 재사용의 성공적 사례가 나오고 있으며, 오픈 데이터 활동이 전 세계적으로 확산하고 있음 - 「국가연구개발사업의 관리 등에 관한 규정」에서 연구데이터와 데이터 관리계획을 정의하고, 국가연구개발 사업 수행 시 DMP 제출 요구를 규정하여 국가 차원의 연구데이터 관리 근거를 마련함 ○ 연구데이터를 관리하고 서비스하기 위한 핵심 요소로는 DMP 작성지원, 데이터 파일 정리, 데이터 저장, 데이터 공유 및 접근, 데이터 인용, 데이터 관리교육으로 구분할 수 있음 □ 연구데이터의 수집 및 관리 ○ 연구데이터는 연구개발과제 수행 과정에서 실시하는 각종 실험, 관찰, 조사 및 분석 등을 통하여 산출된 사실 자료로서 연구 결과의 검증에 필수적인 데이터임 - 연구데이터는 연구 과정에서 생성되는 모든 데이터를 지칭하기 때문에 메일이나 기술 보고서 등과 같은 연구 기록과 구별해야 함 - 지속적 연구 활동 지원 및 연구 결과물 보존·공유를 위해서는 연구자가 소속된 연구 기관과 연구자가 활동하는 커뮤니티에서 연구 수행 과정에서 산출되는 데이터 관리가 필요함 ○ DMP란 연구 프로젝트 도중이나 종료 후에 프로젝트를 통해 생산·수집된 연구데이터가 어떻게 관리·공유되는지 기술하는 공식 문서를 의미함 - 데이터 수집 전에 DMP를 통해 충실한 데이터 설명이 가능하고, 이는 데이터에 대한 상세 내용을 기억하기 위한 연구자의 노력이 불필요하게 하며 데이터 재사용을 가능케 함 ○ DMP는 연구 라이프 사이클에 맞추어 연구계획 단계부터 데이터 생산, 수집, 관리, 보존 및 폐기, 출판, 재사용 등의 모든 과정에서 발생하는 행위임 ○ KEI 기후환경 데이터 플랫폼을 개발하려면 연구데이터 라이프 사이클을 도출하고, 이에 관한 세부 내용을 확정하는 것이 중요함 □ 연구데이터 관리 시스템 구축 ○ 데이터 리포지터리는 오픈소스로 개발되어 공개된 소프트웨어를 활용할 수 있으며, 대표적으로 DSpace와 NaRDA가 있음 - DSpace는 웹기반 인터페이스 제공을 통해 파일 제출이 쉽고 다양한 파일 수용이 가능하며, 하나의 기관을 넘어 대규모, 다분야 리포지터리로 확장이 가능함 - NaRDA는 한국과학기술정보연구원(KISTI)에서 개발·보급하는 연구데이터 리포지터리이며, 연구자의 데이터 관리 활동 주기를 고려하여 설계 및 구현됨 ㆍ NaRDA는 DMP 제출양식을 작성하고, 이를 게시 및 공유할 수 있음 ㆍ 연구 수행 중의 관리 단계에서는 연구 수행을 위한 데이터를 자유롭게 업로드·다운로드 할 수 있으며, 데이터 설명을 기술할 수 있음 ㆍ 마지막 단계에서는 연구 결과물 공유를 위한 연구데이터 등록이 가능하며, 이를 위해 메타데이터 추출 및 DOI 부여 기능을 제공함 ○ 연구데이터는 메타데이터와 원천데이터로 구성되며, 메타데이터는 데이터를 설명하는 자료로 데이터 검색 시스템에서 활용되는 색인 요소임 ○ 메타데이터란 데이터에 대한 속성을 기술하고 컨텍스트(context) 및 데이터 품질 정보를 제공하며, 다른 객체나 데이터의 특징을 문서화한 것을 일컬음 □ 연구데이터의 보존 및 공유 ○ 디지털 연구데이터를 보존하는 경우 다양한 편익이 발생하며, 보존을 위해서는 인적·물적 자원이 필요함 - 데이터 보존을 위해서는 해당 정보를 수집할 방법을 시스템화하여 제공하고, 보존 및 출판을 위한 영구식별자(DOI, ARK, UUID 등)가 부여되어야 함 ㆍ 가장 많이 쓰이는 영구식별자는 DOI로, KISTI에서 발급하는 DOI prefix를 이용해 데이터를 출판하는 기관이 suffix를 추가하여 데이터를 출판할 수 있음 ○ 데이터 출판과 관련하여 연구자의 의지가 반영될 수 있도록 하고, 이때 내부 및 외부 공유 범위 설정과 연구자의 요구 수준을 표현할 수 있는 화면 및 기능 설계가 
필요함 ○ 연구자의 데이터 리터러시 능력 향상을 통해 효과적인 연구데이터 활용이나 공유, 재사용을 기대할 수 있으며, 연구데이터 공유와 재사용 활성화를 위해 데이터 공개에 대한 보상체계가 마련되어야 함 □ 연구데이터 구축 서비스 사례 및 시사점 ○ 한국지질자원연구원(KIGAM)은 연구데이터의 관리체계 부재로 인해 중복 연구가 이루어진다는 사실을 인지하고, 지질 자료 저장소 GDR을 개발하여 운영 중임 - GDR은 데이터 접근 제어 기능과 외부 연동 데이터에 DOI를 발급하고, 연구소 최초로 사업계획서에 DMP 양식을 포함하는 제도를 시행함 ○ 한국한의학연구원(KIOM)은 한의약 연구데이터 리포지터리(KMDR)를 구축하고, 이를 운영 중임 - KMDR은 한의약 분야 연구데이터의 체계적인 관리 및 공유를 위한 정보 시스템으로 데이터 관리 지원, 활용 제고를 통해 효율적인 연구수행 지원을 목적으로 구축됨 - 외부 위협으로부터 연구데이터 보호를 위한 암호화 모듈 적용과 DMP 작성 및 관리 기능을 연계하여 전 주기적인 연구데이터 관리가 가능함 ○ 국립산림과학원은 「국립산림과학원 연구사업 관리 규정」(예규 제307호)의 일부 개정 (2019.2.11)을 통하여 연구데이터 관리 의무화 조항을 신설함 - 데이터 기반 융·복합 산림과학연구 수행 지원을 위한 적극적인 연구데이터 관리 도모 및 참여 의식 고취를 목적으로 포상계획을 수립함 ○ 연구데이터 관리와 거버넌스 체계를 만들려면 연구자와 경영진 인식이 긍정적으로 변화 될 수 있도록 지속적인 교육이 필요하며, 선행 기관과 지속적 협력이 중요함 □ KEI 연구데이터 활용·관리 체계 정립 ○ 환경정책연구는 데이터 생산 사례가 적고, 사회·자연과학의 융·복합적인 연구 형태로 인해 과학기술계에서 운영 중인 DMP 및 연구데이터 리포지터리 시스템을 적용하는 데 한계가 있음 ○ 주요 기관의 데이터 분류 현황을 토대로 KEI 연구데이터는 데이터 종류 및 형식과 데이터 생산 방법에 따라 분류함 - 데이터 종류 및 형식(지표·지수, 정책 DB, 측정·관측, 시뮬레이션, 문헌, 전문가의견, 발표자료·정책문서, 기타 등) - 데이터 생산 방법(내부-생산, 내부-가공, 외부-생산, 외부-가공 등) ○ KEI는 연구데이터의 유실 방지 및 보존, 지속가능한 환경정책 수립, 데이터 연계를 통한 다학제 간 융·복합 연구체계 마련, 증거 기반의 정책 의사결정 지원 등을 위해 연구데이터의 체계적인 관리가 필요함 - 데이터 성과 관리를 통한 연구성과 관리 효율화와 연구성과 확산 제고, 데이터 기반의 연구 협력 생태계 조성을 위해 2022년 기본과제 제안 시(2021년 6월 시행) DMP를 도입함 - 원내 최초로 적용된 DMP를 효율적으로 운영하고자 연구데이터는 환경(정책) 연구 과정에서 활용된 자료 또는 결과로 나타난 주요 연구 산출물로 정의함 ○ 본 연구에서는 연구데이터 리포지터리를 구축하고 인트라넷 로그인 연동, 메타데이터 등록, DMP-IDR 연계 방안 마련 등을 통해 DMP 중심 데이터 관리체계를 마련함 Ⅳ. 
KEI 기후환경 데이터 플랫폼 구축전략 □ KEI 기후환경 데이터 플랫폼 구축 개요 ○ 데이터 플랫폼 구축을 통해 다양한 연구데이터를 공유하고 활용하고자 노력 중이며, 국가 차원의 연구데이터 활용 촉진과 융합연구 및 오픈 사이언스 등 선진 연구환경을 조성함 ○ 데이터는 다양한 분야에서 기하급수적으로 생산되고 있으나 이에 대한 소유권 문제, 정보공개 문제 등이 여전히 산재해 있음 - 데이터를 융·복합적으로 활용하고 각계각층에서 공동 활용하기 위한 법·제도는 미흡한 실정임 - ‘데이터 3법’과 「데이터기반행정법」 등의 개정 및 시행으로 데이터 산업 활성화 기반이 마련되고 있으나 중복 규제 등의 문제가 발생할 가능성이 큼 ○ 본 연구에서는 기후환경 연구데이터 관리와 플랫폼 구축전략 마련을 위해 정보 접근성 및 서비스 측면과 연구데이터 관리 측면에 관한 법·제도 현황을 정리함 - 데이터 이용 및 활용에 관한 주요 법·제도로는 「환경정책기본법」, 「전자정부법」, 「국가정보화기본법」, 「공공데이터의 제공 및 이용 활성화에 관한 법률」, 「지능정보화 기본법」, 「데이터기반 행정법」, 「정보통신융합법」, 「국가연구개발사업의 관리 등에 관한 규정」, 「정보통신망법」 등이 있음 ○ 먼저 검토가 필요한 사항인 연구데이터 관리 측면의 데이터 이용 및 활용에 관한 법·제도 개선 필요사항을 도출함 - 「국가연구개발사업의 관리 등에 관한 규정」에 (연구)데이터 관리 권고 조항 추가와 「데이터기반 행정법」에 기관메타시스템 및 IDR 시스템 구축 권고가 필요함 □ KEI 기후환경 데이터 플랫폼 구축전략 수립 ○ 현재 빅데이터 플랫폼 사업이 활발히 진행되고 있으나 다수의 플랫폼에서 기후환경정책 연구 수행에 활용 가능한 데이터 획득에는 여전히 어려움이 있음 ○ 유사 사업과 차별성을 두고 다양한 플랫폼과 연계 방안을 마련하고자 환경 분야 연구를 수행한 각 매체의 전문가들을 대상으로 설문조사를 실시함 - 기후환경 데이터 플랫폼 전략 수립과 데이터 기반의 환경정책연구를 발굴, 향후 KEI형 데이터 플랫폼으로 확장을 위한 거시적 관점의 전략 수립을 위한 기초 자료를 수집함 - 설문은 데이터 이용 및 활용, KEI 기후환경 데이터 플랫폼 구축, 데이터 기반 정책연구 수요 등 3가지 주제로 구분하여 진행함 ㆍ 기후환경 데이터 활용 목적 및 애로사항 유무, 데이터 품질 요소 및 특성에 관한 설문과 향후 플랫폼에서 제공해야 할 데이터 및 서비스와 기타 제안사항 등 의견수렴을 통해 향후 플랫폼 구축 방향성을 수립함 ○ 기후변화 대응을 위한 기후환경 데이터 현황조사, 연구데이터 관리체계 마련 등을 통해 환경 분야 정책연구의 데이터 활용·연계의 ‘통로’ 역할을 수행하기 위한 전략을 수립함 - 연구 간 융합연구 수행 및 시너지 효과 창출을 위한 전략과 지속가능한 정책연구 수행 등 핵심 가치 창출 요구에 대응하고자 로드맵을 마련함 - KEI 연구데이터 활용·관리 로드맵(안)은 데이터 관리 및 활용을 위한 ① 기후환경 데이터 허브 구축, ② 기후환경 데이터 활용체계 전환, ③ 데이터 활용제도 개선 등 목표를 크게 세 가지로 설정하고, 세부 추진 필수요소를 도출함 ㆍ 기후환경 데이터 허브 구축(인프라 구축, 주요 데이터 연계) ㆍ 기후환경 데이터 활용체계 전환(환경데이터 협업 네트워크 구축, 참여형 환경정책을 위한 데이터 체계 구축, 데이터 활용체계 구축) ㆍ 데이터 활용제도 개선(데이터 활용제도 개선, 데이터 관리체계 적용, 데이터 관리 고도화) ※ KEI 연구데이터 활용·관리 로드맵의 세부 내용은 <그림 4-14>~<그림 4-16> 참조. Ⅴ. 
KEI 기후환경 데이터 제공 서비스 구축 □ KEI 기후환경 데이터 제공 서비스 개요 ○ 기후환경 데이터 인벤토리를 기반으로 데이터 제공 서비스 방안을 마련하고, 기후환경 정책 이슈에 대한 의사결정 지원을 위한 서비스를 제공하고자 함 - KEI에서 기존에 구축한 데이터와 타 기관에서 제공하는 기후환경 관련 플랫폼 데이터로 범위를 설정하고, 이를 토대로 기후환경 데이터 제공 서비스 방안을 마련함 ○ 본 연구는 키워드 중심으로 정책과 데이터를 연계하여 정책연구 시 데이터 활용과 접근성이 개선되도록 하는 방안을 제시함 - 분야별 키워드 선정의 다양화를 통해 사용자에 대한 맞춤형 데이터를 제공하는 방안을 제시함 □ KEI 기후환경 데이터 제공 서비스 방안 ○ 연구데이터를 연구자들이 효율적으로 활용하기 위한 실질적인 방안이 필요하며, 주요 데이터의 메타정보 정리를 통해 서비스를 제공하는 방안을 마련함 ○ 본 연구는 원내외 기후환경 데이터의 현황분석을 통해 DMP를 작성하고, 이를 연계하여 메타데이터 작성 및 DB화하여 연구데이터 리포지터리 시스템에 시범적으로 제공함 - 다양한 기후환경 데이터의 정보 제공을 통해 정책문제 이해 및 의사결정의 근거로 활용하도록 함 - 데이터의 정책 활용성을 높이려면 메타데이터에 데이터의 종류, 매체 정보, 연관 키워드를 포함하여 제공하도록 함 ○ 기후환경 데이터의 정책 활용을 높이기 위한 검색 서비스 마련을 위해 관련 키워드를 저장한 키워드 사전 및 관련 알고리즘을 구축함 ○ 본 연구는 기후변화 데이터에 대한 접근성을 높이기 위해 카테고리 및 키워드 빈도수를 중심으로 데이터를 분류하여 제공하는 마인드맵 서비스를 제안함 - 마인드맵 형태로 데이터를 제공할 때는 ‘검색어’를 중심으로 연관된 데이터를 추출하고, 이를 카테고리별로 분류하여 제공함 ㆍ 데이터명, 데이터 키워드, 데이터 설명, 데이터 원자료명 등 메타데이터를 검색키워드가 연결하여 마인드맵을 구성할 데이터 범위를 우선적으로 선별함 ㆍ 1차 분류 기준은 기후변화 적응 부문, 2차 분류는 부문별 세부주제로 설정하고, 3차 분류는 데이터에 포함된 키워드 빈도수를 중심으로 묶어 제공하는 방식의 마인드맵을 구성함 □ 기후환경 정책-데이터 연계 서비스 방안 ○ 다양한 경로로 데이터 연관 키워드를 충분히 부여하여 연결고리를 만드는 방안을 제시함 - KEI 원내 보고서 수집을 통한 연관 키워드 부여와 주요 환경 이슈별로 활용되는 데이터에 키워드를 부여함 - 텍스트 데이터의 내용 및 성격 등에 따라 키워드 관리 범위 설정이 필요하며, 정책공급자 또는 수요자 입장의 텍스트로 범위를 설정하고 관련 키워드를 부여하는 방안을 고려함 Ⅵ. 
결론 및 정책 제언 □ 결론 ○ 본 연구는 기후환경 분야의 데이터 사이언스(Data Science) 대응 플랫폼 전략 구축을 통해 데이터에 기반하여 기후변화 대응을 강화하고, 디지털 전환의 기틀을 마련하기 위한 시범 연구임 ○ 1차 연도 연구에서는 분야별 기후변화 적응에 활용이 가능한 KEI 및 주요 외부기관의 데이터 현황을 조사하고, 기후변화 취약성 평가에 활용하기 위한 추가적인 데이터를 제안함 ○ 2차 연도 연구에서는 기존 적응데이터와 함께 최근 기후변화 연구에 활용도가 높은 응용데이터인 위성 데이터 내용을 포함함 ○ 또한 온실가스 감축 및 기후변화 완화에 활용할 수 있는 산업·수송·가정 등의 분야와 관련된 데이터를 추가로 조사하고, 이를 통합하여 인벤토리를 구축함 ○ 기후환경 데이터 인벤토리 구축을 통해 기후변화 대응을 위한 연구수행 시 관련 데이터를 효율적으로 제공하여 데이터 활용성을 높일 수 있을 것으로 기대됨 ○ 기후환경 데이터를 중심으로 구축한 연구데이터 관리체계를 보완하고, 데이터 관리 및 수집을 위해 타 기관 사례를 조사하여 기본적인 요소들로 연구데이터 관리체계 초안을 작성함 ○ 원내 연구데이터 활용 사례, 데이터 범위 및 DMP 양식 구축 사례 검토, 연구 수행 프로세스 등을 고려하여 DMP 중심의 연구관리체계를 마련함 ○ 본 연구에서는 기후환경 데이터 관리·활용을 위한 DMP 마련과 메타데이터 템플릿 구축 및 보완, DMP 및 연구데이터 제출 프로세스, 데이터 형태 등을 고려하여 KEI에서 실질적으로 활용 가능한 형태의 데이터 관리 실행체계를 구축함 ○ 특히 기후변화 대응 정책 중 하나로 적응 분야 연구 지원을 위해 각 데이터에 대한 부문별 세부 주제를 설정하고, 관련 키워드, 데이터 설명, 데이터 출처 등 해당 데이터의 정보를 제공하기 위한 메타데이터를 구축함 ○ 데이터 기반의 정책 지원을 위해서는 어떤 문서를 기반으로 키워드를 설정할 것인가에 대한 고민이 필요하며, 언론, 정책 관련 문서 등 관련 이슈 및 중요사항을 파악할 수 있는 텍스트를 설정하는 것이 핵심이라 할 수 있음 ○ 기후변화에 대응하고자 기후환경 데이터의 현황을 분석하고 연구데이터 관리 및 실행체계를 마련하였으며, 실제 데이터를 어떻게 제공할 수 있는지를 현실적인 접근 전략으로 제시함 □ 연구의 한계점 및 보완사항 ○ 장기적으로 환경 분야 전체를 포괄하는 뛰어난 플랫폼 구축과 함께 다양한 사용자의 요구를 수용할 수 있는 데이터세트 구축이 필수적임 ○ 데이터 기반 정책연구를 실현하려면 정책연구에 실질적으로 활용 가능한 데이터가 무엇이고, 이를 어떻게 구축할 것인지를 깊이 있게 고민하고 연구수행 결과를 데이터화하여 의미 있는 성과물로 관리하는 노력이 지속적으로 필요함 ○ KEI 기후환경 데이터 플랫폼을 구축하는 로드맵을 마련했으나 이 로드맵을 이행하는 데는 많은 예산과 인력 등의 자원이 필수적이며, 데이터의 공유문화와 플랫폼이 필요하다는 공감대가 형성되어야 함 ○ 전반적인 환경정책연구에서 정책 수립 및 이행에 필요한 데이터세트 구축 사업을 활성화하는 것과 데이터 성과물 영역의 확대 및 구축된 데이터의 활용도를 높이기 위한 실질적인 데이터 협업체계를 마련하는 것이 필요함 Ⅰ. Introduction □ Necessity and purpose of the study ○ The intensification of climate change phenomena such as abnormal weather conditions and natural disasters affects not only the natural environment but also human activities in various ways. ○ Recently, Korea has pledged to reach net-zero emissions by 2050 in cooperation with the international community and has been actively responding to climate change. 
○ Responses to climate change can be divided into mitigation efforts to reduce greenhouse gases and adaptation efforts to minimize damage and risk. However, since climate change has multiple causes, complementary policies covering both are needed. ○ In 2018, the Ministry of Science and ICT established the “Strategy for Sharing and Utilization of Research Data” to manage and share research data accumulated in national R&D projects, and the era of data-intensive science has begun in earnest. - Advances in equipment such as hardware and high-performance networks generate a great deal of research data, so managing diverse research data has become essential to effective research. ○ This study links climate change response to data, shifting climate change research toward data science. - “Data science” is a generic term for the process of understanding actual phenomena and deriving useful knowledge from various types of data. - As more data are generated from information and communication technology (ICT), satellites, and meteorological reanalysis, securing relevant data to link and utilize them becomes increasingly important. ○ Data for climate change response are scattered across various organizations, and the absence of a standard classification system for environmental data imposes many constraints on data utilization. Measures to use these data efficiently and conveniently are therefore urgently needed. 
□ Scope of the study ○ Analyze the current status of climate environment data, establish and operate an implementation system of the data management plan (DMP), and prepare a strategy for establishing a climate environment data platform as well as a plan for providing differentiated data services - Prepare a data inventory in the mitigation & reduction sector to respond to climate change based on the status of satellite-centered climate environment application data and data survey in the climate change mitigation sector - Define the scope of research data and introduce a data management plan (DMP) for establishing a practical implementation system for climate environment data management, and establish the KEI-type data management promotion system centered on research data repository (IDR) - Seek ways to provide climate environment data services that can be used for environmental policy research based on the established climate environment data inventory and management system ㆍ Climate environment data is limited to public data related to mitigation and adaptation to respond to climate change in this project. ㆍ This study aims to prevent the use of various accumulated data in a single research project only and maximize the sharing and utilization of data. - Review the laws and systems related to data sharing and utilization, as well as surveys for establishing a climate environment data platform, and suggest realistic plans for policy research centered on climate change data □ Content and methodology ○ In the second year of the study, building and supplementing the greenhouse gas reduction data in major environmental organizations and promoting the advancement of the existing inventory are planned. - This study examines the possibility of expanding the scope of climate change response data based on the satellite data outputs and the status of meteorological and climate data collected by the Korea Meteorological Administration among climate environment data. 
○ This study seeks to derive the essential elements for defining and managing research data in order to prepare and establish the KEI research data management system. ○ This study aims to develop a roadmap for constructing the KEI climate environment data platform based on the research data management systems investigated and on expert opinions. ○ To expand the platform into a KEI-type data platform in the future, we plan to upload the collected meta-information of climate change response data to the research data repository system on a trial basis, and to improve policy utilization through a trial data mind map service. Ⅱ. Advancement of the Climate Environment Data Inventory □ Current status of domestic climate environment application data ○ Climate change greatly affects not only precipitation, cloud amount, and temperature, but also vegetation distribution and land distribution, and responding to it first requires that primary data be properly secured. ○ Overseas, led by the National Aeronautics and Space Administration (NASA) and the European Space Agency (ESA), satellite data are used to observe various areas, such as air-polluting gases, climate change-causing gases, aerosols, and vegetation index changes. ○ In Korea, a geostationary multi-purpose satellite was launched as the successor to the Communications Oceanic and Meteorological Satellite (COMS), and the data produced through satellite observation are used as basic data for climate change response policies. ○ Korea’s representative satellites include the Chollian satellite carrying the Geostationary Ocean Color Imager (GOCI), GEO-KOMPSAT-2A, and GEO-KOMPSAT-2B (Geostationary Korea Multi-Purpose Satellite-2B). - The GOCI is used to monitor red tide, sea ice, sea fog, marine dumping, marine sand mining activities, fine dust, and so on. ㆍ Its major outputs comprise 13 types of data, including dissolved organic matter, chlorophyll, total suspended matter, red tide index, and terrestrial vegetation index. 
- GEO-KOMPSAT-2B observes the marine environment and ecosystem, monitors air pollutants outside the Korean Peninsula, and provides data for responding to climate change and monitoring fine dust. ㆍ Its major outputs comprise 26 types of data, including atmospheric correction, inherent optical properties, atmospheric products, ocean color products, ocean products, and land products. - Compared to the GOCI, GEO-KOMPSAT-2A supports a wider range of observations and makes it possible to monitor and prepare for meteorological disasters. ㆍ It produces a total of 52 types of meteorological products, of which 23 are basic products, including cloud detection, ozone amount, and rainfall intensity, and 29 are additional products, including forest fire detection, vegetation index, vegetation rate, and surface reflectance. □ Current status of climate change response data ○ Climate change response must consider both mitigation policies (reducing or absorbing greenhouse gases) and adaptation policies (reducing damage from climate change). ○ This study aims to link climate change mitigation policies and adaptation policies by examining the current status of climate change mitigation data such as energy, power generation, and greenhouse gas emissions. ○ Data in the climate change mitigation (greenhouse gas reduction) sector can be largely divided into energy statistics, the national greenhouse gas inventory, and other data that can be linked and used. - The Korea Energy Statistical Information System links, integrates, and provides statistics on the energy balance and national energy supply and demand, as well as statistical data from related organizations in accordance with the regulations on preparing energy statistics. - The national greenhouse gas inventory provides data for identifying domestic greenhouse gas emission sources and sinks and the amounts emitted and absorbed, in order to respond to climate change. 
- Other data include traffic/transport and electricity data to support the emission calculation and analysis in the public and private sectors as well as academic world and to link greenhouse gas inventories. ○ The climate change adaptation sector builds the climate environment data inventory based on the data established in the system operated by the National Climate Change Adaptation Center (KACCC). - Vulnerability assessment data for adaptation to climate change are provided by the integrated assessment model for climate change impact and vulnerability by sector (MOTIVE) and climate change vulnerability assessment tool (VESTAP), which are representative climate change adaptation systems. ○ Various weather observation data and disaster prevention meteorological information observed and provided by the Korea Meteorological Administration are used as basic data in various fields, such as in predicting the future of climate change and establishing response policies. - Climate change scenarios can be used for analysis in impact assessment due to future climate change and research on minimizing the damage, and it is used as essential information for establishing and supporting climate change response and adaptation measures. Ⅲ. Establishment of a Climate Environment Data Management System □ Overview of KEI research data management ○ Efforts to share and utilize domestic research data are actively pursued following the implementation of the regulations on Data Management Plan (DMP) in 2019. - There are successful cases of data preservation and reuse from national R&D projects in major advanced countries, and open data activities are spreading around the world. - Research data and data management plans are defined in the Regulations on the Management of National Research and Development Projects. The basis for managing research data at the national level is established by stipulating the requirement to submit DMP when conducting national R&D projects. 
○ The core elements of managing and providing research data can be divided into DMP preparation support, data file organization, data storage, data sharing and access, data citation, and data management education. □ Research data collection and management ○ Research data is factual data produced through the various experiments, observations, investigations, and analyses conducted in the course of R&D tasks, and is essential for verifying research results. - Research data refers to all data generated in the research process, so it must be distinguished from research records such as e-mails or technical reports. - To support continuous research activities and to preserve and share research results, the data generated during research needs to be managed by the research institute to which the researcher belongs and by the community in which the researcher is active. ○ A DMP is an official document describing how research data produced and collected through a research project is managed and shared during and after the project. - A DMP makes it possible to describe data thoroughly before it is collected, which spares researchers the effort of remembering details about the data and enables data reuse. ○ Research data management takes place across all processes of the research life cycle, from the research planning stage to data production, collection, management, preservation and disposal, publication, and reuse. ○ For the development of the KEI climate environment data platform, it is important to identify the research data life cycle and confirm its details. □ Establishment of a research data management system ○ A data repository can be built on openly released open-source software, and DSpace and NaRDA are representative examples. 
- DSpace provides a web-based interface that makes it easy to submit files and accommodates a variety of file types, and it can be expanded beyond one institution into a large-scale, multi-disciplinary repository. - NaRDA is a research data repository developed and disseminated by the Korea Institute of Science and Technology Information (KISTI), designed and implemented in consideration of the cycle of researchers’ data management activities. ㆍ On NaRDA, users can fill out DMP submission forms and post and share them. ㆍ In the management stage during research, data for the research can be freely uploaded and downloaded, and data descriptions can be provided. ㆍ In the last stage, research data can be registered in order to share research results, and metadata extraction and DOI assignment functions are provided for this purpose. ○ Research data consists of metadata and source data; metadata describes the data and serves as the index element used in data retrieval systems. ○ Metadata describes the properties of data, provides context and data quality information, and documents the characteristics of other objects or data. □ Preservation and sharing of research data ○ Preserving digital research data generates various benefits, but preservation requires human and material resources. - For data preservation, a systematic method of collecting the relevant information should be provided, and a permanent identifier (DOI, ARK, UUID, etc.) should be assigned for preservation and publication. ㆍ The most commonly used permanent identifier is the DOI; an organization that publishes data can do so by adding a suffix to the DOI prefix issued by KISTI. ○ Data publication should reflect the will of the researcher, which requires setting the scope of internal and external sharing and designing screens and functions through which researchers can express the level of sharing they require. 
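The prefix-plus-suffix structure of a DOI described above can be illustrated with a minimal Python sketch. A DOI name consists of a prefix beginning with "10." and a suffix chosen by the publishing organization, separated by a slash. The prefix value, the suffix naming pattern, and the `make_doi` helper below are hypothetical and are not taken from KISTI or KEI practice; rejecting whitespace in the suffix is a conservative choice of this sketch rather than a rule stated in the source.

```python
import re

# A DOI prefix starts with "10." followed by a registrant code.
_PREFIX_RE = re.compile(r"^10\.\d{4,9}(\.\d+)*$")


def make_doi(prefix: str, suffix: str) -> str:
    """Join an issued prefix and a publisher-chosen suffix into a DOI name."""
    if not _PREFIX_RE.match(prefix):
        raise ValueError(f"not a DOI prefix: {prefix!r}")
    # Conservative assumption of this sketch: no empty or whitespace suffixes.
    if not suffix or any(ch.isspace() for ch in suffix):
        raise ValueError("suffix must be non-empty and contain no whitespace")
    return f"{prefix}/{suffix}"


# Hypothetical values for illustration only.
EXAMPLE_PREFIX = "10.12345"
doi = make_doi(EXAMPLE_PREFIX, "kei.climate.2021.0001")
print(doi)
```

Keeping the suffix scheme inside the publishing organization, as in this sketch, is what lets the organization mint identifiers for its own datasets without changing the issued prefix.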
○ Improving researchers’ data literacy can be expected to lead to more effective use, sharing, and reuse of research data, and a reward system for data disclosure should be prepared to promote research data sharing and reuse. □ Research data construction service cases and implications ○ The Korea Institute of Geoscience and Mineral Resources (KIGAM) recognized that duplicate research was being conducted due to the absence of a management system for research data, and developed and operates GDR, a geological data repository. - GDR provides data access control and issues DOIs for externally linked data, and KIGAM was the first research institute to implement a system that includes the DMP form in project plans. ○ The Korea Institute of Oriental Medicine (KIOM) has built and operates the oriental medicine research data repository (KMDR). - KMDR is an information system for the systematic management and sharing of research data in the field of oriental medicine, established to support efficient research through data management support and enhanced utilization. - By applying an encryption module that protects research data from external threats and linking it with DMP creation and management functions, KMDR makes it possible to manage research data across its entire life cycle. ○ The National Institute of Forest Science newly established a provision making research data management mandatory through a partial revision (February 11, 2019) of the “Regulations on Research Project Management of the National Institute of Forest Science (Regulation No. 307).” - It established a reward plan to promote active research data management and raise awareness of participation, in support of data-based convergence research in forest science. 
○ To create a research data management and governance system, continuous education is required so that the perception of researchers and management changes in a positive way, and continuous cooperation with leading institutions is important. □ Establishment of the KEI research data utilization and management system ○ Environmental policy research produces little data of its own and combines social and natural science, so there are limits to applying the DMP and research data repository systems operated in the science and technology field. ○ Based on the data classification status of major institutions, KEI research data is classified according to data type and format and according to data production method. - Types and formats of data (indicator/index, policy database, measurement/observation, simulation, literature, expert opinion, presentation materials/policy document, etc.) - Data production methods (internal-produced, internal-processed, external-produced, external-processed, etc.) ○ KEI needs to systematically manage research data in order to prevent its loss, establish sustainable environmental policies, prepare a multidisciplinary convergence research system through data linkage, and support evidence-based policy decision-making. - DMP was introduced (implemented in June 2021) when research projects for 2022 were proposed, to improve the efficiency of research performance management through data performance management, facilitate the dissemination of research results, and create a data-based research cooperation ecosystem. - For the efficient operation of the first DMP applied in the institute, research data is defined as data used during the environmental (policy) research process or major research outputs resulting from it. 
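The two classification axes listed above (data type and format, and data production method) can be sketched as a small metadata record in Python. This is only an illustrative sketch: the class names, field names, and the sample record are hypothetical and do not reproduce KEI's actual DMP form or repository schema. The enum values follow the categories named in the text, with "etc." mapped to an explicit "other" value.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class DataType(Enum):
    INDICATOR_INDEX = "indicator/index"
    POLICY_DB = "policy database"
    MEASUREMENT_OBSERVATION = "measurement/observation"
    SIMULATION = "simulation"
    LITERATURE = "literature"
    EXPERT_OPINION = "expert opinion"
    PRESENTATION_POLICY_DOC = "presentation materials/policy document"
    OTHER = "other"


class ProductionMethod(Enum):
    INTERNAL_PRODUCED = "internal-produced"
    INTERNAL_PROCESSED = "internal-processed"
    EXTERNAL_PRODUCED = "external-produced"
    EXTERNAL_PROCESSED = "external-processed"


@dataclass
class ResearchDataRecord:
    """Hypothetical metadata record carrying both classification axes."""
    name: str
    data_type: DataType
    production: ProductionMethod
    keywords: List[str] = field(default_factory=list)

    def is_external(self) -> bool:
        # True when the data originated outside the institute.
        return self.production in (
            ProductionMethod.EXTERNAL_PRODUCED,
            ProductionMethod.EXTERNAL_PROCESSED,
        )


# Sample record with made-up values, for illustration only.
record = ResearchDataRecord(
    name="Example vulnerability index",
    data_type=DataType.INDICATOR_INDEX,
    production=ProductionMethod.EXTERNAL_PROCESSED,
    keywords=["adaptation", "vulnerability"],
)
print(record.data_type.value, record.is_external())
```

Using closed enumerations rather than free text is one way to keep the two axes consistent across records, at the cost of needing a schema change when a new category is introduced.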
○ This study aims to prepare a DMP-centered data management system by establishing a research repository, linking intranet accounts to the repository, registering metadata, and preparing plans to connect DMP and IDR. Ⅳ. KEI Climate Environment Data Platform Construction Strategy □ Overview of KEI climate environment data platform construction ○ Efforts are underway to share and utilize various research data through the establishment of a data platform, promoting the use of research data at the national level, and creating an advanced research environment for convergence research, open science, and so on. ○ Data is being produced exponentially in various fields, but there are still issues related to the ownership and disclosure of information. - There is a lack of appropriate legislation in terms of integrating and using data and using them jointly across sectors. - The foundation is being laid for data industry revitalization through the revision and enforcement of the “Three Data Privacy Acts” and the “Act on the Promotion of Data-based Administration” but there is a high possibility of problems such as overlapping regulations occurring. ○ This study summarizes the current status of laws and systems related to information access and services and research data management to manage climate environment research data and prepare a platform construction strategy. 
- The main laws and systems related to data use and utilization include the Framework Act on Environmental Policy, Electronic Government Act, Framework Act on National Informatization, Act on Promotion of the Provision and Use of Public Data, Framework Act on Intelligent Informatization, Act on the Promotion of Data-based Administration, Special Act on Promotion of Information and Communications Technology and Vitalization of Convergence Thereof, Regulations on the Management of National Research and Development Projects, and the Act on Promotion of Information and Communications Network Utilization and Information Protection. ○ As the matter requiring review first, we identified the improvements needed in the laws and systems on data use and utilization from the perspective of research data management. - A clause recommending (research) data management needs to be added to the Regulations on the Management of National Research and Development Projects, and a recommendation that institutions build institutional metadata systems and IDR systems needs to be added to the Act on the Promotion of Data-based Administration. □ Establishment of the KEI Climate Environment Data Platform Construction Strategy ○ Big data platform projects are currently being actively carried out, but on many platforms it is still difficult to obtain data that can be used for climate environment policy research. ○ To differentiate the platform from similar projects and prepare ways to link it with various platforms, a survey was conducted among experts in each environmental medium who have carried out environmental research. - We collected basic data for establishing the climate environment data platform strategy, identifying data-based environmental policy research, and developing a macroscopic strategy for expanding the platform into a KEI-type data platform in the future. 
- The questionnaire was divided into three main themes: data use and utilization, KEI climate environment data platform establishment, and demand for data-based policy research. ㆍ The direction for building the platform was set based on responses about the purpose of using climate environment data, whether respondents experienced difficulties, and data quality factors and characteristics, together with opinions on the data and services the platform should provide and other suggestions. ○ By surveying the current status of climate environment data for climate change response and preparing a research data management system, we established strategies for the platform to serve as a “channel” for data utilization and linkage in environmental policy research. - A roadmap was prepared to respond to demands for creating core values, such as strategies for conducting convergence research and creating synergy across studies and for conducting sustainable policy research. - The KEI research data utilization and management roadmap (draft) sets three main goals for data management and utilization and derives the detailed elements essential to pursuing them: ① establishment of a climate environment data hub, ② conversion of the climate environment data utilization system, and ③ improvement of the data utilization system. ㆍ Establishment of a climate environment data hub (building an infrastructure, linking major data) ㆍ Conversion of the climate environment data utilization system (building an environmental data collaboration network, a data system for participatory environmental policy, and a data utilization system) ㆍ Improvement of the data utilization system (data utilization system improvement, data management system application, data management advancement) ※ For details of the KEI research data utilization and management roadmap, refer to < Figure 4-14 >~< Figure 4-16 >. Ⅴ. 
Establishment of the KEI Climate Environment Data Provision Services □ Overview of the KEI Climate Environment Data Provision Services ○ Based on the climate environment data inventory, we intend to prepare a data provision service plan and provide services to support decision-making on climate environment policy issues. - We set the scope with the data established by KEI and the platform data on the climate environment provided by other organizations, through which we prepared a plan for providing climate environment data. ○ In this study, we propose a method to improve data utilization and accessibility in policy research by preparing keyword-oriented policies and data linkage plans. - Providing customized data to users by diversifying keywords by field is proposed. □ Plan for providing KEI climate environment data ○ It is necessary to come up with a practical plan for researchers to use research data efficiently, and to provide a service by organizing the meta-information of major data. ○ In this study, the DMP is prepared based on the analysis of the current status of the climate environment data inside and outside the institute, which is connected to create metadata and provided as the pilot data on the research data repository system. - We provide various climate environment data that can be used as a basis for understanding policy issues and making decisions. - In order to improve the utilization of data in policy making, types of data, media information, and related keywords should be included in the metadata. ○ In order to develop a search engine to enhance the policy utilization of climate environment data, a keyword dictionary and related algorithms were built with related keywords stored. ○ To enhance access to climate change data, we propose a service in the form of a mind map that classifies and provides data by category and keyword frequency. - When providing a mind map, related data is extracted centered on ‘search words’ and classified by category. 
ㆍ Metadata such as the name, keywords, and description of the data, as well as the name of the data source, are matched against the search keywords to first select the range of data to be used in the mind map.
ㆍ The primary and secondary classification criteria are the climate change adaptation sector and the sub-categories within each sector, respectively. The tertiary classification consists of a mind map that provides keywords in bundles based on how frequently they appear in the data.
□ Plan for providing a service linking climate environment policy with data
○ We suggest establishing links by assigning a sufficient number of relevant keywords to data through various routes.
- Relevant keywords are extracted from KEI research reports and assigned to each database of major environmental issues.
- The scope of keyword management needs to be set according to the content and nature of the text data; one option to consider is limiting the source texts to those that reflect the positions of policy providers or demanders and assigning related keywords from them.
Ⅵ. Conclusion and Policy Recommendations
□ Conclusion
○ This study is a pilot study to strengthen data-centered responses to climate change and lay the foundation for digital transformation by establishing a data-science response platform strategy in the field of climate environment.
○ In the first-year study, the current status of data from KEI and other major organizations that can be used for climate change adaptation by sector was investigated, and additional data that can be used in climate change vulnerability assessment were proposed.
○ The second-year study adds satellite data, which are widely used in recent climate change research, to the existing adaptation data.
○ In addition, data from sectors such as industry, transportation, and households that can be used for greenhouse gas reduction and climate change mitigation are investigated, and an inventory is built by integrating them.
○ Data utilization is expected to improve with the establishment of a climate environment data inventory, which makes it possible to provide relevant data efficiently when conducting research on climate change response.
○ A draft of the research data management system, with its basic elements, was prepared by supplementing the research data management system that focuses on climate environment data and by investigating case studies of data management and collection at other institutions.
○ A DMP-centered research management system was prepared in consideration of in-house research data utilization cases, the data scope, a review of DMP format construction cases, and the research promotion process.
○ This study establishes a data management system that can be used in practice at KEI, taking into account DMP preparation for climate environment data management and utilization, the construction and supplementation of a metadata template, the DMP and research data submission process, data formats, and so on.
○ In particular, taking climate change response policy as an example, detailed topics are set for each data sector to support research in the field of adaptation, and metadata are established to provide information on the data such as related keywords, data descriptions, and data sources.
○ To support data-based policy, it is necessary to consider which documents keywords should be extracted from; the key is to select texts, such as media reports and policy-related documents, from which related issues and important matters can be identified.
○ This study analyzes the current status of climate environment data for responding to climate change, prepares a research data management and execution system, and presents a realistic strategy for how to provide actual data.
□ Limitations and points for improvement
○ In the long term, efforts are needed to build data sets that can accommodate the needs of various users, along with an excellent platform that covers the entire environmental field.
○ To realize data-based policy research, in-depth consideration is required of what data can be practically used for policy research and how to build them, together with continuous efforts to manage research results in databases and preserve them as meaningful outcomes.
○ Implementing the roadmap for developing the KEI climate environment data platform requires substantial resources, including budget and personnel; implementation should also rest on a consensus on the need for data sharing and for a platform to support it.
○ Across environmental policy research as a whole, a practical data cooperation system needs to be prepared in order to expand the range of data outcomes and increase the utilization of established data, and to promote the data set establishment projects needed for policy making and implementation.
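
The mind-map service described in the data provision plan (metadata matched against a search word, then grouped by adaptation sector, by sub-category, and by keyword frequency) can be illustrated with a minimal Python sketch. This is not the KEI implementation, which the abstract does not describe in detail; the record fields, sector names, sample values, and the function names `matches` and `build_mind_map` are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical metadata records; every field name and value is illustrative.
RECORDS = [
    {"name": "Heavy rain damage statistics", "source": "Agency A",
     "description": "Annual flood damage by region",
     "keywords": ["flood", "heavy rain", "damage"],
     "sector": "Water management", "subcategory": "Flood"},
    {"name": "River level observations", "source": "Agency B",
     "description": "Hourly river levels used in flood forecasting",
     "keywords": ["flood", "river level"],
     "sector": "Water management", "subcategory": "Flood"},
    {"name": "Drought index", "source": "Agency A",
     "description": "Monthly drought index",
     "keywords": ["drought", "precipitation"],
     "sector": "Water management", "subcategory": "Drought"},
    {"name": "Crop flood loss survey", "source": "Agency C",
     "description": "Crop losses after inundation",
     "keywords": ["flood", "crop loss"],
     "sector": "Agriculture", "subcategory": "Crop production"},
]


def matches(record, query):
    """Return True if the search word occurs in the name, description,
    source, or any keyword of the record (case-insensitive substring)."""
    q = query.lower()
    fields = [record["name"], record["description"], record["source"],
              *record["keywords"]]
    return any(q in field.lower() for field in fields)


def build_mind_map(records, query, top_n=3):
    """Group matching records by sector (primary) and sub-category
    (secondary), then list the most frequent keywords (tertiary)."""
    counts = defaultdict(lambda: defaultdict(Counter))
    for rec in records:
        if matches(rec, query):
            counts[rec["sector"]][rec["subcategory"]].update(rec["keywords"])
    return {
        sector: {sub: [kw for kw, _ in c.most_common(top_n)]
                 for sub, c in subs.items()}
        for sector, subs in counts.items()
    }


mind_map = build_mind_map(RECORDS, "flood")
for sector, subs in mind_map.items():
    for sub, keywords in subs.items():
        print(sector, ">", sub, ">", ", ".join(keywords))
```

Plain substring matching stands in here for the keyword dictionary and related algorithms mentioned in the abstract; a real service would presumably use that dictionary to expand the search word before matching.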