Luecht, Richard

uncg

There are 20 items.

Title	Date	Views	Brief Description
Modeling differential pacing trajectories in high stakes computer adaptive testing using hierarchical linear modeling and structural equation modeling	2006	2178	This study compares two statistical methods for modeling changes in response latency (timing) patterns on a high-stakes adaptive test: (1) hierarchical linear modeling (HLM2) and (2) growth modeling using structural equation modeling (SEM). The test...
A comparison of traditional test blueprinting and item development to assessment engineering in a licensure context	2010	3774	With the need for larger and larger banks of items to support adaptive testing and to meet security concerns, large-scale item generation is a requirement for many certification and licensure programs. As part of the mass production of items, it i...
Detecting test cheating using a Deterministic, gated item response theory model	2010	7511	High-stakes tests are widely used as measurement tools to make inferences about test takers' proficiency, achievement, competence or knowledge. The stakes may be directly related to test performance, such as obtaining a high-school diploma, being gra...
Conditions affecting the accuracy of classical equating methods for small samples under the NEAT design: a simulation study	2011	5284	Small sample equating remains a largely unexplored area of research. This study attempts to fill in some of the research gaps via a large-scale, IRT-based simulation study that evaluates the performance of seven small-sample equating methods under va...
Item parameter changes and equating: an examination of the effects of lack of item parameter invariance on equating and score accuracy for different proficiency levels	2013	6330	The impact of particular types of context effects on actual scores is less understood although there has been some research carried out regarding certain types of context effects under the nonequivalent anchor test (NEAT) design. In addition, the iss...
The optimal design of the dual-purpose test	2013	2136	Traditional test development focused on one purpose of the test, either ranking test-takers or providing diagnostic profiles for test-takers. Embedding both the ranking and diagnostic purposes in one assessment instrument would be a great advancement...
An investigation on computer-adaptive multistage testing panels for multidimensional assessment	2013	3573	Computer-adaptive multistage testing (ca-MST) has been developed as an alternative to computerized adaptive testing (CAT) and has been increasingly adopted in large-scale assessments. Current research and practice only focus on ca-MST panels for cre...
An Examination of the Residual Covariance Structures of Complex Performance Exercises Under Various Scaling and Scoring Methods	2008	4870	Large-scale assessment programs are increasingly including complex performance exercises along with traditional multiple-choice items in a given test. These performance assessments are developed in part to measure sets of skills that are part of the ...
Relationships between examinee pacing and observed item responses: results from a multi-factor simulation study and an operational high stakes assessment	2009	2233	The use of response time in testing has a relatively long history, ranging from concerns over test speededness to using response times as performance indicators (e.g., speed and accuracy). This model-based investigation examined the relationship betw...
Principled assessment as a foundation for standard setting	2015	1926	This study investigated the impact of using Assessment Engineering (AE) task models as the unit of judgment in a standard setting workshop. The proposed method, or Task Model-based Standard Setting (TMSS), used a procedure similar to that of the Book...
A simulation study to investigate optimal equating anchor set construction practices under the NEAT design	2018	997	This study examines anchor set construction techniques in observed score test equating under the non-equivalent with anchor-test design. It differs from other studies in that it seeks to understand the interaction between the examinee abilities, test...
Data collection design for equivalent groups equating: using a matrix stratification framework for mixed-format assessment	2012	3718	Mixed-format assessments are increasingly being used in large scale standardized assessments to measure a continuum of skills ranging from basic recall to higher order thinking skills. These assessments are usually comprised of a combination of (a) m...
Enemy item detection using data mining methods	2019	1961	Enemy items are any two items that should not appear on the same test form. These items may address the same material, or one may provide clues about the answer to another. Most enemy item pairs are identified before forms are published; subject matt...
Quality control and the impact of variation and prediction errors on item family design	2024	97	This two-part study examined the impact of variation within item families and errors associated with predicted item difficulty parameters on examinee test scores. Part A served as an extension of Shu et al.’s (2010) study to address how much variatio...
The effects of routing and scoring within a computer adaptive multi-stage framework	2014	1925	This dissertation examined the overall effects of routing and scoring within a computer adaptive multi-stage framework (ca-MST). Testing in a ca-MST environment has become extremely popular in the testing industry. Testing companies enjoy its efficie...
A comparison of observed score approaches to detecting differential item functioning among multiple groups	2018	1006	The overall purpose of this dissertation was to compare various observed score approaches in detecting differential item functioning among multiple examinee groups simultaneously. Specifically, this study contributes to the literature base by investi...
Optimal characteristics of anchor tests in vertical scaling: a special case of non equivalent groups with anchor test (NEAT) design in vertical scaling	2019	713	There are multiple empirical issues and complications associated with vertical scaling methods that have not been sufficiently explicated even though there has been scanty research conducted within the general framework of the nonequivalent group wit...
A reconceptualization of IRT calibration with DIF items in a PROMIS Fatigue measure	2022	54	Differential item functioning (DIF) is a statistical procedure intended for examining and evaluating test fairness. After DIF items are detected, there are three methods to deal with DIF items, which are to ignore DIF items, remove DIF items, and cre...
Operationalizing item difficulty modeling in a medical certification context	2020	688	This research study modeled item difficulty in general pediatric test items using content, cognitive complexity, linguistic, and text-based variables. The research first presents an introduction which addresses the current shortcomings found in item ...
Scoring methods of innovative items	2021	210	Advancements in technology and computer-based testing have allowed for greater flexibility in assessing examinee knowledge on large-scale, high-stakes assessments. Through computer-based delivery, cognitive ability and skills can be effectively assess...
