RISS Search - Domestic Journal Article Details

Multilingual Abstract

Automated essay scoring (AES) is the scoring of written prose using computer technology. This meta-analysis examines the claim that machine scoring of writing test responses agrees with human raters as closely as human raters agree with one another. The effect size is the agreement rate between AES and human scoring, estimated using a random-effects model. The exact agreement rate between AES and human scoring is 52%, compared with 54% between human raters. The adjacent agreement rate between AES and human scoring is 93%, compared with 94% between human raters. These results show that agreement between AES and human raters is closely comparable to agreement among human raters. The study also compares agreement rates across subgroups defined by study characteristics such as publication status, AES type, essay type, exam type, human expertise, country, and school level. Implications of the results and directions for future research are discussed in the conclusion.
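The two agreement measures named in the abstract can be illustrated with a minimal Python sketch. The abstract does not define them, so this assumes the common convention: exact agreement is the share of essays given identical scores, and adjacent agreement is the share whose scores differ by at most one point (exact matches included). The function name `agreement_rates` and the sample scores are hypothetical and not taken from the study.

```python
from typing import Sequence, Tuple


def agreement_rates(scores_a: Sequence[int], scores_b: Sequence[int]) -> Tuple[float, float]:
    """Return (exact, adjacent) agreement rates between two sets of scores.

    Assumption: "adjacent" counts pairs that differ by at most one point,
    including exact matches. The source abstract does not state its definition.
    """
    if len(scores_a) != len(scores_b) or len(scores_a) == 0:
        raise ValueError("score lists must be non-empty and of equal length")
    n = len(scores_a)
    pairs = list(zip(scores_a, scores_b))
    exact = sum(a == b for a, b in pairs) / n
    adjacent = sum(abs(a - b) <= 1 for a, b in pairs) / n
    return exact, adjacent


# Hypothetical scores for five essays (illustration only, not study data).
human_scores = [4, 3, 5, 2, 4]
machine_scores = [4, 4, 5, 4, 3]
exact_rate, adjacent_rate = agreement_rates(human_scores, machine_scores)
print(f"exact={exact_rate:.2f}, adjacent={adjacent_rate:.2f}")
```

Under this convention the adjacent rate can never fall below the exact rate, which is consistent with the figures reported in the abstract.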

Korean Abstract

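Automated essay scoring is the scoring of writing using computer technology. Many validation studies on the agreement between human scoring and automated scoring are under way. Some critics argue that this agreement still falls short of the agreement among human raters, but many studies claim validity on par with human-to-human scoring. The goal of this meta-analysis is to compare, for the scoring of essay responses, the agreement among human raters with the agreement between machine scoring and human scoring.
In this meta-analysis, the effect size is the agreement rate between automated scoring programs and human raters, estimated using a random-effects model. The exact agreement between automated scoring programs and human raters was 52%, compared with 54% among human raters. The adjacent agreement between automated scoring programs and human raters was 93%, compared with 94% among human raters. The meta-analysis shows that agreement between automated scoring programs and human raters is high enough to be comparable to agreement among human raters. The study also compares agreement across categories of study characteristic variables such as publication status, type of automated scoring program, essay type, exam type, rater expertise, country, and school level. Finally, it discusses the implications for developing and applying automated essay scoring programs, directions for future research, and the limitations of the study.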
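As a rough illustration of how agreement rates from several studies might be combined under a random-effects model, the sketch below pools raw proportions with the DerSimonian-Laird estimator. This is an assumption made for illustration: the abstract does not say which estimator or which transformation of the proportions the authors used. The function name `pool_random_effects` and the input values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def pool_random_effects(
    proportions: Sequence[float], sample_sizes: Sequence[int]
) -> Tuple[float, float, float]:
    """Pool proportions with a DerSimonian-Laird random-effects model.

    Returns (pooled proportion, standard error, between-study variance tau^2).
    Assumption: raw proportions with binomial variance p(1-p)/n; the source
    does not state the authors' actual procedure.
    """
    if len(proportions) != len(sample_sizes) or len(proportions) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(not 0.0 < p < 1.0 for p in proportions):
        raise ValueError("proportions must lie strictly between 0 and 1")
    if any(n <= 0 for n in sample_sizes):
        raise ValueError("sample sizes must be positive")

    k = len(proportions)
    variances = [p * (1.0 - p) / n for p, n in zip(proportions, sample_sizes)]
    weights = [1.0 / v for v in variances]  # fixed-effect (inverse-variance) weights
    sum_w = sum(weights)
    fixed = sum(w * p for w, p in zip(weights, proportions)) / sum_w

    # Cochran's Q and the DerSimonian-Laird estimate of tau^2.
    q = sum(w * (p - fixed) ** 2 for w, p in zip(weights, proportions))
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    pooled = sum(w * p for w, p in zip(re_weights, proportions)) / sum_re
    return pooled, math.sqrt(1.0 / sum_re), tau2


# Hypothetical exact-agreement rates from three studies (illustration only).
pooled, se, tau2 = pool_random_effects([0.48, 0.55, 0.60], [200, 350, 120])
print(f"pooled={pooled:.3f}, se={se:.3f}, tau2={tau2:.4f}")
```

When the studies agree closely, the between-study variance estimate is truncated at zero and the result reduces to the fixed-effect (inverse-variance) average.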

Table of Contents

Ⅰ. Introduction
Ⅱ. Materials and methods
1. Literature search and inclusion criteria
2. Coding of studies
3. Computation of effect sizes
4. Combining effect sizes across studies
Ⅲ. Results
1. Description of effects
2. Overall analysis
3. Subgroup analysis
4. Meta-regression by publication year
Ⅳ. Discussion and Conclusion
