In order to study trends in educational progress and to measure changes in performance, existing assessment instruments, administration procedures, and analysis techniques must be retained over time. Instruments such as the NAEP in the US and the NAEA in Korea use item response theory (IRT) to analyze data and to calibrate items by subscale. During the process of estimating item parameters, checks are performed to ensure that the IRT models fit the data. However, there is no clear criterion to guide psychometric practitioners in treating items during scaling for national assessments of student achievement such as the NAEP and the NAEA. The objective of the current study is to investigate the impact of item treatments in scaling, specifically splitting trend items by assessment year, on NAEP trend reporting scores. In particular, this study investigates how such item treatments in scaling change the estimates of other items in the US history tests. Splitting items had little impact on the proficiency levels of some students on the US history subscales. In addition, this study examines how splitting items by assessment year affects reported trend scores.
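As an illustrative sketch of the item treatment in question (the specific model and notation below are assumptions for exposition, not taken from this study's results), NAEP-style scaling commonly fits a three-parameter logistic (3PL) IRT model to multiple-choice items, and splitting a trend item by assessment year amounts to estimating a separate set of item parameters for each year instead of a single common set. Under a 3PL model, the probability that examinee $i$ with proficiency $\theta_i$ answers item $j$ correctly is
\[
P\!\left(X_{ij}=1 \mid \theta_i\right)
  = c_j + \frac{1 - c_j}{1 + \exp\!\left[-1.7\, a_j\left(\theta_i - b_j\right)\right]},
\]
and splitting item $j$ by assessment year $t$ replaces the common parameters $(a_j, b_j, c_j)$ with year-specific parameters $(a_j^{(t)}, b_j^{(t)}, c_j^{(t)})$, so that the item no longer ties the two assessment years to a common calibration:
\[
P\!\left(X_{ij}^{(t)}=1 \mid \theta_i\right)
  = c_j^{(t)} + \frac{1 - c_j^{(t)}}{1 + \exp\!\left[-1.7\, a_j^{(t)}\left(\theta_i - b_j^{(t)}\right)\right]}.
\]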