An Examination of the Residual Covariance Structures of Complex Performance Exercises Under Various Scaling and Scoring Methods

UNCG Author/Contributor (non-UNCG co-authors, if there are any, appear on document)
Joshua Goodman (Creator)
Institution
The University of North Carolina at Greensboro (UNCG)
Web Site: http://library.uncg.edu/
Advisor
Richard Luecht

Abstract: Large-scale assessment programs are increasingly including complex performance exercises (CPEs) along with traditional multiple-choice items in a given test. These performance assessments are developed in part to measure sets of skills that are part of the trait to be measured but are not easily assessed with multiple-choice items. One approach to creating and scoring these items is to create a set of tasks within a scenario that can be objectively scored using a set of scoring rules to yield dichotomous responses. Including complex performance items introduces two potential challenges: first, the performance items are developed to measure something distinctly different and may introduce some degree of multidimensionality into the test; second, because the set of measurement opportunities stems from a common stimulus and is scored with a set of elaborate rules, contextual and scoring dependencies are likely to arise. Both multidimensionality and statistical dependencies may create a situation where non-zero residual covariances are present. This study uses a computer simulation to create different amounts of association among the CPE items due to the three sources mentioned above (multidimensionality, contextual dependencies, and scoring dependencies). The magnitude and distribution of the residual covariances are assessed under two different methods for scoring the simulations (dichotomous or polytomous scoring) and under different Item Response Theory-based scaling methods (creating separate scales for the two item types or joint calibrations of all items). The results indicate the following: If only contextual/scoring dependencies are present in the data, polytomous scoring is effective in eliminating some of the extreme dependencies due to scoring factors, but does not decrease the average amount of residual covariance among the measurement opportunities of the performance items.
Treating performance exercises and selected-response items as two separate and distinct scales was effective in controlling the amount of residual covariance regardless of the underlying dimensional structure. However, when the correlation between traits was moderate to high, the joint calibration approaches showed amounts of residual covariance among performance exercises similar to those of the separate-scales approach, and produced more precise score estimates. Last, when dependencies are the result of all the sources mentioned above, only the separate-scales approach coupled with the polytomous scoring approach is successful in reducing the residual covariance to zero levels. Choosing a joint scaling approach and polytomously scored items when the data are two-dimensional, even when context or scoring dependencies are present, leads to large amounts of residual covariance.
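The idea of residual covariance described in the abstract can be illustrated with a small simulation. The sketch below is not the dissertation's design; it is a minimal example under stated assumptions: a Rasch model with fixed item difficulties, a hypothetical person-specific "testlet" effect shared by the CPE measurement opportunities to stand in for contextual/scoring dependency, and residuals computed from the generating ability and difficulty values rather than from estimated ones. All function names, sample sizes, and parameter values are illustrative.

```python
import numpy as np


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def simulate(n_persons=5000, n_mc=10, n_cpe=6, testlet_sd=1.0, seed=0):
    """Simulate dichotomous responses to multiple-choice and CPE items.

    Assumption: a shared per-person effect (gamma) is added to the logits
    of the CPE items only, inducing local dependence among them.
    """
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=n_persons)                 # person ability
    b = np.concatenate([np.linspace(-1.5, 1.5, n_mc),  # MC difficulties
                        np.linspace(-1.0, 1.0, n_cpe)])  # CPE difficulties
    gamma = rng.normal(scale=testlet_sd, size=n_persons)

    logit = theta[:, None] - b[None, :]
    p_model = logistic(logit)          # probabilities that ignore gamma
    logit_true = logit.copy()
    logit_true[:, n_mc:] += gamma[:, None]
    x = (rng.random(logit.shape) < logistic(logit_true)).astype(float)
    return x, p_model


def mean_residual_cov(x, p_model, cols):
    """Average off-diagonal covariance of residuals (observed - expected)."""
    resid = x[:, cols] - p_model[:, cols]
    cov = np.cov(resid, rowvar=False)
    off_diag = ~np.eye(cov.shape[0], dtype=bool)
    return float(cov[off_diag].mean())


x, p_model = simulate()
print("MC items :", mean_residual_cov(x, p_model, slice(0, 10)))
print("CPE items:", mean_residual_cov(x, p_model, slice(10, 16)))
```

Under these assumptions, the average residual covariance among the multiple-choice items should stay close to zero, while the CPE items should show a clearly positive value because the unidimensional model does not account for their shared effect. Setting `testlet_sd=0.0` removes the dependency and should bring the CPE value back toward zero.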

Additional Information

Publication
Dissertation
Language: English
Date: 2008
Keywords
Residual Covariance Structures, Performance Exercises, Scaling, Scoring, Methods, assessment programs, computer simulation, residual covariances, dichotomous, polytomous scoring
Subjects
Statistics and Probability.
Education -- Research -- Methodology.
Educational tests and measurements -- Computer simulation.
Academic achievement -- Testing.
Education -- Evaluation.
Statistics and Probability -- Education.
