Exploring Win/Loss Prediction Models for the English Premier League Using Classification Machine Learning

Abstract (Korean)

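The purpose of this study is to explore and compare soccer win/loss prediction models based on classification machine learning algorithms, using English Premier League match records and betting odds data. To achieve this purpose, the study is divided into three parts. The first part is 'exploration and comparison of classification machine learning-based win/loss prediction models using match records.' For this part, match records for 3,800 matches from the 2009-10 through 2018-19 English Premier League seasons were collected from the official English Premier League site (www.premierleague.com) and WhoScored.com (whoscored.com), and 44 variables were selected for analysis. The input variables were averages over the matches preceding the match to be predicted, and, to identify an effective combination, windows of the previous 1 to 5 matches were all tested and compared. The second part is 'exploration and comparison of classification machine learning algorithm-based win/loss prediction models using odds.' For this part, match odds were collected from football-data, and 36 variables were selected for analysis. The third part is 'exploration and comparison of classification machine learning algorithm-based win/loss prediction models using mixed data.' For this part, the previously collected match records and odds data were merged, and win/loss prediction models were explored using 83 variables, including the common variables. The conclusions of this study are as follows.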
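
The lagged-average inputs described above can be sketched in a few lines. This is a minimal illustration, not the study's code: the function name `lagged_means`, the `(team, stat)` row layout, the sample values, and the choice to emit `None` until a team has `k` earlier matches are all assumptions.

```python
from collections import defaultdict, deque


def lagged_means(matches, k):
    """Mean of each team's stat over its previous k matches.

    matches: list of (team, stat) tuples in chronological order (hypothetical layout).
    Returns a list aligned with matches; None where the team has fewer
    than k earlier matches (an assumption, the source does not say how
    short histories were handled).
    """
    history = defaultdict(lambda: deque(maxlen=k))
    features = []
    for team, stat in matches:
        past = history[team]
        features.append(sum(past) / k if len(past) == k else None)
        # Append only after computing the feature so the current match
        # never leaks into its own input.
        past.append(stat)
    return features


# Hypothetical shot counts for two teams, in match order.
shots = [("A", 10), ("B", 4), ("A", 14), ("B", 6), ("A", 12), ("B", 8)]
for k in (1, 2):
    print(k, lagged_means(shots, k))
```

Running the same function for `k` from 1 to 5 gives one candidate feature set per window, which is the kind of comparison the abstract describes.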

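First, exploring win/loss prediction models using raw match records and dimension-reduced match records showed that the model combining dimension-reduced match records with the random forest algorithm (RM2) ranked highest, with a classification accuracy of 54.8%.

Second, exploring win/loss prediction models using raw odds and dimension-reduced odds showed that the model combining dimension-reduced odds with the random forest algorithm (RM2) ranked highest, with a classification accuracy of 56.6%.

Third, exploring win/loss prediction models using raw mixed data and dimension-reduced mixed data showed that the model combining the mixed data with the random forest algorithm (M2) ranked highest, with a classification accuracy of 57.8%.

In conclusion, this study explored and compared soccer win/loss prediction models based on classification machine learning algorithms, using English Premier League match records and odds data. The results are expected to serve as basic information for selecting an appropriate classification machine learning algorithm according to the type of data when building a soccer win/loss prediction model.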
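
Classification accuracy, the criterion used to rank the models above, is the share of matches whose outcome is predicted correctly. A minimal sketch of ranking candidate models by accuracy follows; the outcome labels, model names, and predictions are hypothetical and are not the study's data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical outcome labels: H = home win, D = draw, A = away win.
y_true = ["H", "D", "A", "H", "H", "A", "D", "H"]

# Hypothetical predictions from two candidate models.
predictions = {
    "model_1": ["H", "H", "A", "H", "D", "A", "H", "H"],
    "model_2": ["H", "D", "A", "H", "H", "H", "H", "H"],
}

scores = {name: accuracy(y_true, y_pred) for name, y_pred in predictions.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores)
print(ranking)
```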

Multilingual Abstract

The purpose of this study is to explore and compare soccer win/loss prediction models based on classification machine learning algorithms, using English Premier League match records and odds. To achieve this purpose, the research is divided into three parts. The first part is 'Exploration and Comparison of Classification Machine Learning-Based Win/Loss Prediction Models Using Match Records.' For this part, 44 variables were selected for analysis from the records of 3,800 matches from the 2009-10 to 2018-19 seasons of the English Premier League, collected from the official English Premier League website (www.premierleague.com) and WhoScored.com (whoscored.com). The input variables were averages over the matches preceding the match to be predicted; to identify an effective combination, windows of the previous 1 to 5 matches were all tested and compared. The second part is 'Exploration and Comparison of Win/Loss Prediction Models Based on Classification Machine Learning Algorithms Using Odds.' For this part, 36 variables were selected from the match odds collected from football-data. The third part is 'Exploration and Comparison of Win/Loss Prediction Models Based on Classification Machine Learning Algorithms Using Mixed Data.' For this part, the previously collected match records and odds data were merged, and 83 variables, including common variables that give the match result and the team names, were analyzed to explore the optimal win/loss prediction model. The conclusions of this study are as follows.

First, exploring and comparing win/loss prediction models using raw match records and dimension-reduced match records showed that the model (RM2) combining dimension-reduced match records with the random forest algorithm ranked highest, with an accuracy of 54.8%.

Second, exploring and comparing win/loss prediction models using raw odds and dimension-reduced odds showed that the model (RM2) combining dimension-reduced odds with the random forest algorithm ranked highest, with an accuracy of 56.6%.

Third, exploring and comparing win/loss prediction models using raw mixed data and dimension-reduced mixed data showed that the model (M2) combining the mixed data with the random forest algorithm ranked highest, with an accuracy of 57.8%.

In conclusion, this study used English Premier League match records and odds data to explore and compare soccer win/loss prediction models based on classification machine learning algorithms. The results are expected to serve as basic information for choosing a classification machine learning algorithm suited to the data type when building a soccer win/loss prediction model.
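
The mixed-data step, in which match records and odds are combined on their shared variables, can be sketched as a simple key-based join. This is only an illustration under stated assumptions: the helper `merge_on_keys`, the column names, the join keys (`date`, `home`, `away`), and all values are hypothetical and do not come from the study.

```python
def merge_on_keys(records, odds, keys):
    """Inner-join two lists of dicts on the given key columns."""
    index = {tuple(row[k] for k in keys): row for row in odds}
    merged = []
    for row in records:
        match = index.get(tuple(row[k] for k in keys))
        if match is not None:
            # Shared key columns appear once in the merged row.
            merged.append({**row, **match})
    return merged


# Hypothetical match-record features (already averaged over earlier matches).
records = [
    {"date": "2018-08-10", "home": "Team A", "away": "Team B", "home_shots_avg": 12.5},
    {"date": "2018-08-11", "home": "Team C", "away": "Team D", "home_shots_avg": 9.0},
]

# Hypothetical odds rows; the second match has no odds and is dropped.
odds = [
    {"date": "2018-08-10", "home": "Team A", "away": "Team B", "home_win_odds": 1.8},
]

merged = merge_on_keys(records, odds, keys=("date", "home", "away"))
print(merged)
```

An inner join is one possible design choice here; it keeps only matches present in both sources, so no missing odds values are carried into the merged table.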

Table of Contents

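Contents
I. Introduction
1. Need for the Study
2. Research Content
II. Theoretical Background
1. The Concept of Classification
2. Prediction Model Exploration
3. Classification Machine Learning Algorithms
III. Research Methods
1. Research Procedure
2. Research Data and Collection Methods
3. Data Processing Methods
1) Preprocessing
(1) Prior Variables Based on Past Matches
(2) Missing Value Handling
(3) Standardization
(4) Categorical Variable Conversion
2) Cross-Validation
3) Model Evaluation
4) Dimensionality Reduction
IV. Results
1. Exploration and Comparison of Classification Machine Learning Algorithm-Based Win/Loss Prediction Models Using Match Records
1) Exploration of Win/Loss Prediction Models Using Match Records
2) Exploration of Win/Loss Prediction Models Using Dimension-Reduced Match Records
3) Comparison of Match-Record and Dimension-Reduced Match-Record Win/Loss Prediction Models
2. Exploration and Comparison of Classification Machine Learning Algorithm-Based Win/Loss Prediction Models Using Odds
1) Exploration of Win/Loss Prediction Models Using Odds
2) Exploration of Win/Loss Prediction Models Using Dimension-Reduced Odds
3) Comparison of Odds and Dimension-Reduced Odds Win/Loss Prediction Models
3. Exploration and Comparison of Classification Machine Learning Algorithm-Based Win/Loss Prediction Models Using Mixed Data
1) Exploration of Win/Loss Prediction Models Using Mixed Data
2) Exploration of Win/Loss Prediction Models Using Dimension-Reduced Mixed Data
3) Comparison of Mixed-Data and Dimension-Reduced Mixed-Data Win/Loss Prediction Models
V. Discussion
VI. Conclusion
References
ABSTRACT
Appendix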
