Impact of item parameter drift on IRT linking methods

UNCG Author/Contributor (non-UNCG co-authors, if there are any, appear on document)
David Frederick Chen (Creator)
The University of North Carolina at Greensboro (UNCG)
Kyung Kim

Abstract: Item parameter drift is a severe threat to testing programs that must ensure fair, comparable scores across different forms of the same test. This study examines the effect of drift on simulated and empirical data sets using five IRT linking methods: Stocking-Lord, Haebara, least absolute values, concurrent calibration, and fixed parameter calibration. Four factors were varied: the proportion of drifted items, the magnitude of drift, the examinee ability distribution, and sample size. The least absolute values method best recovered linking constant B, difficulty estimates, and equated true and observed scores, while concurrent calibration and fixed parameter calibration most accurately recovered linking constant A and discrimination estimates. All linking methods yielded similar classification accuracy and consistency rates. Even at lower magnitudes, however, drift can affect equated scores through its impact on the linking constants and item parameter estimates that precede equating. Practitioners should remove drifted items when possible and investigate the cause of drift to prevent recurrence. Recommendations for identifying causes of drift and accumulating validation evidence when confronted with drift are discussed.
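The linking constants A and B named in the abstract define the standard linear transformation that places one form's IRT scale onto another's. A minimal sketch, assuming a conventional 2PL parameterization (the function name and the numeric values here are illustrative, not taken from the study):

```python
def transform_params(a, b, A, B):
    """Place new-form item parameters onto the base-form scale.

    Under the linear IRT scale transformation theta* = A*theta + B,
    discrimination rescales as a* = a / A and difficulty as b* = A*b + B.
    """
    return a / A, A * b + B

# Illustrative linking constants (not from the study): A = 1.1, B = -0.2
a_star, b_star = transform_params(1.5, 0.4, 1.1, -0.2)
```

Because every item's difficulty passes through A and B, bias in the estimated constants from even a few drifted items propagates to all transformed parameters, which is consistent with the abstract's point that drift affects the estimates that precede equating.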

Additional Information

Language: English
Date: 2021
Keywords: Item Parameter Drift, Linking
Subjects: Educational tests and measurements; Item response theory; Parameter estimation
