Comparative Evaluation of Regression Models for Medical Insurance Cost Prediction

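Korean Abstract

Regression models play an important role in predictive analytics and are especially valuable in areas such as medical cost estimation. Selecting an appropriate regression technique, however, requires balancing predictive accuracy, model complexity, and generalization performance. This study uses a medical insurance dataset to compare a range of regression approaches: linear regression, ridge regression, decision tree regression, random forest, support vector regression, and gradient boosting.
The proposed methodology comprises data preprocessing, feature encoding, hyperparameter tuning, and model evaluation using k-fold cross-validation. Model performance was assessed using mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), and the coefficient of determination (R²), with particular emphasis on analyzing overfitting and underfitting.
The experiments show that traditional regression models offer high interpretability, whereas ensemble-based machine learning models deliver better predictive performance and generalization. These results underscore the importance of sound model selection that accounts for data characteristics and the practical deployment setting. Future work is expected to extend this study by applying deep learning techniques or hybrid modeling approaches.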
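For reference, the four evaluation metrics named in the abstract have the standard definitions below. The notation is an assumption made here for illustration and is not taken from the thesis: y_i is the observed cost, \hat{y}_i the predicted cost, \bar{y} the mean observed cost, and n the number of samples.

```latex
\begin{align*}
\mathrm{MSE}  &= \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2, &
\mathrm{RMSE} &= \sqrt{\mathrm{MSE}}, \\
\mathrm{MAE}  &= \frac{1}{n}\sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert, &
R^2 &= 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}.
\end{align*}
```

MSE and RMSE penalize large errors more heavily than MAE does, and RMSE and MAE are expressed in the same units as the predicted cost.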

Multilingual Abstract

Regression models play an essential role in predictive analytics, particularly in healthcare-related cost estimation. However, selecting an appropriate regression technique involves balancing predictive accuracy, model complexity, and generalization ability. This study compares multiple regression approaches, including Linear Regression, Ridge Regression, Decision Tree Regression, Random Forest, Support Vector Regression, and Gradient Boosting, using a medical insurance dataset.
The proposed methodology includes data preprocessing, feature encoding, hyperparameter tuning, and model evaluation using k-fold cross-validation. Model performance is assessed using Mean Squared Error, Root Mean Squared Error, Mean Absolute Error, and R² score, with particular attention paid to overfitting and underfitting behavior.
Experimental results indicate that while traditional regression models offer strong interpretability, ensemble-based machine learning models achieve superior predictive performance and generalization. These findings support informed model selection based on dataset characteristics and practical constraints. Future work may extend this study by incorporating deep learning or hybrid modeling approaches.
Keywords: Regression Models, Medical Insurance Cost Prediction, Machine Learning, Overfitting, Model Evaluation, Cross-Validation
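The evaluation procedure described in the abstract (k-fold cross-validation scored with MSE, RMSE, MAE, and R², and a comparison of training and validation error to spot overfitting or underfitting) can be sketched roughly as follows. This is a minimal illustration and not the thesis's implementation: it uses only the Python standard library, a single-feature least-squares line in place of the six models compared in the study, and synthetic data rather than the medical insurance dataset. All function names (regression_metrics, fit_line, k_fold_indices, cross_validate) are hypothetical.

```python
import math
import random


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for paired targets and predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R^2 is undefined when the targets have zero variance.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae, "r2": r2}


def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k nearly equal folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(xs, ys, k=5, seed=0):
    """Average training and validation metrics over k folds."""
    train_scores, val_scores = [], []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])

        def score(rows):
            preds = [slope * xs[i] + intercept for i in rows]
            return regression_metrics([ys[i] for i in rows], preds)

        train_scores.append(score(train))
        val_scores.append(score(held_out))

    def average(scores):
        return {m: sum(s[m] for s in scores) / len(scores) for m in scores[0]}

    return average(train_scores), average(val_scores)


# Synthetic data (NOT the thesis dataset): cost grows linearly with age plus noise.
rng = random.Random(42)
ages = [rng.uniform(18, 64) for _ in range(200)]
charges = [250.0 * a + 2000.0 + rng.gauss(0, 1500.0) for a in ages]

train_avg, val_avg = cross_validate(ages, charges, k=5)
print("train:     ", train_avg)
print("validation:", val_avg)
```

A validation error far above the training error would suggest overfitting, while high error on both would suggest underfitting. In practice each candidate model would be run through the same folds so that the averaged scores are directly comparable.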

Table of Contents

List of figures
List of tables
List of Abbreviations
Acknowledgments
1. Introduction
1.1 Background and Motivation
1.2 Problem statement
1.3 Research Objectives and Questions
1.3.1 Research Objectives
1.3.2 Research Questions
1.4 Scope and Limitations of the Study
2. Literature Review
2.1 Overview of Regression Models
2.2 Traditional regression techniques
2.3 Advanced machine learning regression models
2.4 Comparison of regression models
2.5 Challenges in regression model optimization
3. Methodology
3.1 Data collection and Preprocessing
3.1.1 Data Collection
3.1.2 Data Preprocessing
3.2 Model training and optimization techniques
3.3 Evaluation metrics for regression models
3.4 Cross-validation and error analysis
3.5 Prediction and Analysis
4. Implementation and experimental setup
4.1 Description of dataset
4.2 Experimental setup and tools
4.3 Model training and Testing
4.4 Performance analysis of different regression models
4.5 Streamlit Application for Insurance Premium Prediction
4.5.1 Purpose and Features
4.5.2 Workflow and Implementation
4.5.3 User Interface
5. Justification of Models
6. Conclusion
Summary of Findings
Contributions of the Research
Practical Implications
Limitations of the Study
Future Research Directions
References
Abstract
Korean Abstract
