Item parameter changes and equating: an examination of the effects of lack of item parameter invariance on equating and score accuracy for different proficiency levels

UNCG Author/Contributor (non-UNCG co-authors, if there are any, appear on document)
Davie Store (Creator)
The University of North Carolina at Greensboro (UNCG)
Richard Luecht

Abstract: Although some research has examined certain types of context effects under the nonequivalent anchor test (NEAT) design, the impact of particular types of context effects on actual scores is less well understood. In addition, the impact of item context effects on scores has not been investigated extensively when item response theory (IRT) is used to calibrate the items and maintain the score scale. The current study examines the impact of item parameter changes for anchor test items in a particular IRT equating context. Specifically, it examines the impact of different types and magnitudes of item serial position changes, treated as "context effects," on score accuracy and performance-related decisions (e.g., classifying examinees on pass/fail mastery tests or into three or more achievement levels). The study uses real data from a large-scale testing program to determine plausible levels of item difficulty changes as well as the magnitude of association between serial position changes and item difficulty changes. Those real-data results are then used to specify reasonable conditions of item difficulty changes in a large-scale, IRT-based computer simulation that compares study conditions and Rasch equating methods in terms of how adequately they attain successful equating within and across test designs. Results of the study indicate that when items change positions, they become either more difficult or easier depending on the direction and magnitude of the change. These changes in difficulty are considerably more notable for low ability examinees than for high ability examinees. Because high ability examinees are already likely to answer most items correctly, changes due to difficulty shifts and/or context effects are less likely to be noticed for them.
In contrast, with low ability examinees there is considerable room to observe the impact of an item's difficulty: many low ability examinees are already missing many items, so decreasing or increasing the difficulty of an item strongly affects the probability that these examinees respond to the item correctly. Further, examination of bias and root mean squared error statistics showed no differences among Rasch equating methods within testing conditions. However, for similar conditions that differed only in difficulty, results were different.
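
The abstract names the Rasch model and the bias and root mean squared error (RMSE) statistics without giving formulas. As a minimal sketch, assuming the standard Rasch response function and the usual definitions of bias and RMSE, these quantities might be computed as follows. The function names and every numeric value are hypothetical illustrations, not taken from the study.

```python
import math


def rasch_probability(theta, b):
    """Probability of a correct response under the standard Rasch model,
    for an examinee of ability theta on an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def bias(estimates, truths):
    """Mean signed error of the estimates relative to the true values."""
    diffs = [e - t for e, t in zip(estimates, truths)]
    return sum(diffs) / len(diffs)


def rmse(estimates, truths):
    """Root mean squared error of the estimates relative to the true values."""
    squared = [(e - t) ** 2 for e, t in zip(estimates, truths)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical values: one item whose difficulty drifts upward by 0.5 logits.
    original_b, shifted_b = 0.0, 0.5
    for theta in (-1.0, 0.0, 1.0):
        before = rasch_probability(theta, original_b)
        after = rasch_probability(theta, shifted_b)
        print(f"theta={theta:+.1f}: P before={before:.3f}, after={after:.3f}")

    # Hypothetical true and estimated abilities for three simulated examinees.
    true_theta = [-1.0, 0.0, 1.0]
    estimated_theta = [-0.8, 0.1, 0.9]
    print("bias:", bias(estimated_theta, true_theta))
    print("RMSE:", rmse(estimated_theta, true_theta))
```

Under this model, how much a given difficulty shift changes the probability of a correct response depends on how far the examinee's ability lies from the item's difficulty, so the size of the effect at each ability level varies with the difficulty of the items involved.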

Additional Information

Language: English
Date: 2013
Context effects, Difficulty change, Invariance principle, Rasch Equating, Serial item position change, Testing
Education -- Research
Educational tests and measurements -- Evaluation
