|An Examination of the Residual Covariance Structures of Complex Performance Exercises Under Various Scaling and Scoring Methods
||Large-scale assessment programs are increasingly including complex performance exercises along with traditional multiple-choice items in a given test. These performance assessments are developed in part to measure sets of skills that are part of the ...
|Modeling differential pacing trajectories in high stakes computer adaptive testing using hierarchical linear modeling and structural equation modeling
||"This study compares two statistical methods for modeling changes in response latency (timing) patterns on a high-stakes adaptive test: (1) hierarchical linear modeling (HLM2) and (2) growth modeling using structural equation modeling (SEM). The test...
|A comparison of traditional test blueprinting and item development to assessment engineering in a licensure context
||With the need for larger and larger banks of items to support adaptive testing and to meet security concerns, large-scale item generation is a requirement for many certification and licensure programs. As part of the mass production of items, it i...
|Conditions affecting the accuracy of classical equating methods for small samples under the NEAT design: a simulation study
||Small sample equating remains a largely unexplored area of research. This study attempts to fill in some of the research gaps via a large-scale, IRT-based simulation study that evaluates the performance of seven small-sample equating methods under va...
|Data collection design for equivalent groups equating: using a matrix stratification framework for mixed-format assessment
||Mixed-format assessments are increasingly being used in large-scale standardized testing to measure a continuum of skills ranging from basic recall to higher-order thinking skills. These assessments are usually composed of a combination of (a) m...
|Item parameter changes and equating: an examination of the effects of lack of item parameter invariance on equating and score accuracy for different proficiency levels
||The impact of particular types of context effects on actual scores is less understood, although some research has been carried out on certain types of context effects under the nonequivalent anchor test (NEAT) design. In addition, the iss...
|The optimal design of the dual-purpose test
||Traditional test development focused on one purpose of the test, either ranking test-takers or providing diagnostic profiles for test-takers. Embedding both the ranking and diagnostic purposes in one assessment instrument would be a great advancement...
|An investigation on computer-adaptive multistage testing panels for multidimensional assessment
||Computer-adaptive multistage testing (ca-MST) has been developed as an alternative to computerized adaptive testing (CAT) and has been increasingly adopted in large-scale assessments. Current research and practice only focus on ca-MST panels for cre...
|The effects of routing and scoring within a computer adaptive multi-stage framework
||This dissertation examined the overall effects of routing and scoring within a computer adaptive multi-stage framework (ca-MST). Testing in a ca-MST environment has become extremely popular in the testing industry. Testing companies enjoy its efficie...
|Principled assessment as a foundation for standard setting
||This study investigated the impact of using Assessment Engineering (AE) task models as the unit of judgment in a standard setting workshop. The proposed method, or Task Model-based Standard Setting (TMSS), used a procedure similar to that of the Book...