A Study on Disseminating Research Outcomes in the Criminal Justice and Legal Field Using Generative Language Models

Korean Abstract

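This study was undertaken to disseminate research outcomes in the criminal justice and legal field and to improve the usability of public data by using generative language models. Rapid recent advances in artificial intelligence are driving innovation across a wide range of areas, including the economy, society, and law, and the importance of data-driven administration and research is growing accordingly. The criminal justice and legal field in particular must analyze vast amounts of data quickly and accurately and reflect the results in policy, so it demands both expertise and accountability; research that introduces generative language models into this field is therefore expected to bring meaningful progress at the national level.
The study reviews recent AI trends relevant to the research field through cases from leading AI countries such as the United States and China, and derives several implications applicable to domestic research institutes. To actually build open data that allows research reports to be used for AI training, it then augmented and evaluated data based on the research reports KICJ has published to date. Through this process, it presents a methodology for converting research institute reports into open data that can be used for language model training.
In the data augmentation process, chunking and data augmentation techniques were used to preserve the sentence structure of the research reports while maximizing the efficiency of analysis. The quality of the data produced by the new methodology was verified through quantitative and qualitative evaluation. The resulting Q/A data offer a new approach to disseminating outcomes through research reports and can contribute both to greater use of policy research and to more efficient administration.
The study is significant in that it goes beyond simply organizing research outcomes: using a language model for training, it presents a model methodology that can be applied not only to the criminal justice and legal domain but also to other public and private sectors. This methodology is expected to help lay the groundwork for strengthening the competitiveness of the domestic AI industry and enriching the Korean-language AI ecosystem. We hope that, by improving data quality through research reports, this study contributes to an environment for data-driven research in the criminal justice and legal field.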

Multilingual Abstract

Leveraging generative language models to disseminate achievements of the Korean Institute of Criminal Justice

Park Seong-Hoon, Lim Jung Ho, Hong Wonshin, Kim Jinhong

This research aimed to disseminate achievements in the criminal and legal fields and enhance the usability of public data by leveraging generative language models. Recent advancements in artificial intelligence technology are driving innovation across various domains, including the economy, society, and law, while highlighting the growing importance of data-driven administration and research. In particular, the criminal and legal fields demand both expertise and public accountability due to the need for swift and accurate analysis of large data sets to inform policy decisions. Consequently, research integrating generative language models is expected to yield meaningful progress at the national level.
This study analyzed recent trends in artificial intelligence within research domains, using case studies from leading AI nations such as the United States and China. It also identified implications applicable to Korean research institutes. To facilitate the development of open data suitable for AI training, we augmented and evaluated data sets based on previously published KICJ research reports. Through this process, we proposed a methodology for converting institutional reports into open data usable for language model training.
Chunking and data augmentation techniques were employed to preserve the sentence structure of the research reports while maximizing analytical efficiency. The resulting data were subjected to both quantitative and qualitative evaluation to verify their quality. The Q&A data sets generated through this methodology offer a novel approach to disseminating research findings and can enhance both the usability of policy research and administrative efficiency.
This research is noteworthy for not only summarizing research findings but also presenting a methodology applicable to other public and private sectors beyond the criminal and legal fields. By utilizing language models for learning, this approach is expected to bolster the competitiveness of Korea's AI industry and contribute to the development of a vibrant Korean language-based AI ecosystem. We hope this study will foster an environment for data-driven research in the criminal and legal fields by improving the quality of research data.

Table of Contents

Korean Summary
Chapter 1. Introduction (Park Seong-Hoon)
Section 1. Research Background and Necessity
Section 2. Research Purpose and Scope
Section 3. Structure of the Report
Chapter 2. Trends in AI Use and Technical Background (Lim Jung Ho, Kim Jinhong)
Section 1. AI Development and Data Training in Major Countries
Section 2. Technical Components of the Study and Definitions of Terms
Section 3. Research Design for Data Construction
Chapter 3. Data Collection (Hong Wonshin, Park Seong-Hoon, Lim Jung Ho)
Section 1. Selection of Research Data
Section 2. RAG-Based Q/A Data Collection
Section 3. ICL-Based Data Augmentation
Chapter 4. Data Quality Evaluation (Kim Jinhong, Park Seong-Hoon, Hong Wonshin, Lim Jung Ho)
Section 1. Overview
Section 2. Quantitative Evaluation of the Data Set
Section 3. Qualitative Evaluation of the Data Set
Chapter 5. Conclusion (Park Seong-Hoon, Hong Wonshin)
Section 1. Summary of Results
Section 2. Policy Applications
References
Abstract
