Enhancing diagnostic feedback in a K-12 language assessment: an exploration of diagnostic classification models for the reading domain

UNCG Author/Contributor (non-UNCG co-authors, if there are any, appear on document)
Meltem Yumsek (Creator)
The University of North Carolina at Greensboro (UNCG)
Web Site: http://library.uncg.edu/
Micheline Chalhoub-Deville

Abstract: Various stakeholders, such as educators and policy makers, call for diagnostic feedback and actionable test results. The lack of diagnostic tests, or of sufficient diagnosticity in score reporting, has encouraged the application of diagnostic classification models (DCMs) to nondiagnostic tests. One context in which fine-grained test results are of utmost importance is English language proficiency (ELP) testing. ELP test results are used for critical decisions about English learners (ELs), such as classification and placement in instructional programs. This study applied the DCM methodology to the reading domain of a K-12 ELP test taken by 23,942 ELs in grades 6-8 and examined the viability of DCMs for low-stakes diagnostic feedback. The study adopted a comprehensive methodology and an elaborate research design incorporating alternative Q-matrices, various diagnostic models, Q-matrix validation strategies, and model selection. The results revealed that a Q-matrix created by experts was theoretically sound and more appropriate for diagnostic reporting in several respects. Likewise, a saturated model, the log-linear cognitive diagnostic model (LCDM), yielded better fit at the test level. It was also deemed more suitable because individual test items were consistent with either a compensatory or a conjunctive model. The LCDM proved useful for extracting limited diagnostic information. Specifically, the mastery probabilities of individual attributes could be estimated accurately and consistently. Attributes could be separated to some extent, which supports multidimensionality and makes the second language (L2) reading construct appropriate for DCM analysis. Most items presented some diagnostic capacity, yet some items were more useful for differentiating masters from nonmasters. Ability estimation was generally consistent across the LCDM and IRT models.
However, some results, such as the variability of attribute classes, reflected the unidimensional structure of the test. Overall, this study contributes to the representation of the L2 reading construct and has implications for teaching ELs and for test development.
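For readers unfamiliar with the saturated model mentioned in the abstract, the LCDM item response function for an item measuring two attributes can be sketched as follows. This is the standard formulation from the DCM literature, not an equation reproduced from the study itself:

```latex
% LCDM item response function for item j measuring attributes alpha_1, alpha_2
% lambda_{j,0}: intercept; lambda_{j,1,(k)}: main effects; lambda_{j,2,(1,2)}: interaction
P(X_{ij} = 1 \mid \boldsymbol{\alpha}_i) =
  \frac{\exp\!\left(\lambda_{j,0}
      + \lambda_{j,1,(1)}\,\alpha_{i1}
      + \lambda_{j,1,(2)}\,\alpha_{i2}
      + \lambda_{j,2,(1,2)}\,\alpha_{i1}\alpha_{i2}\right)}
    {1 + \exp\!\left(\lambda_{j,0}
      + \lambda_{j,1,(1)}\,\alpha_{i1}
      + \lambda_{j,1,(2)}\,\alpha_{i2}
      + \lambda_{j,2,(1,2)}\,\alpha_{i1}\alpha_{i2}\right)}
```

Because constraining these parameters yields conjunctive (DINA-like) or compensatory (DINO-like) special cases, the saturated LCDM can accommodate items of both types, which is consistent with the model-selection rationale described above.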

Additional Information

Language: English
Date: 2020
Keywords: Diagnostic classification models, English Language Proficiency Tests, K-12 English Learners, Language testing
English language -- Study and teaching (Primary) -- Examinations
English language -- Study and teaching (Secondary) -- Examinations
English language -- Study and teaching (Primary) -- Ability testing
English language -- Study and teaching (Secondary) -- Ability testing
