Journal of Educational Measurement

Psychometric and cognitive functioning of an under-determined computer-based response type for quantitative reasoning

View publication


We evaluated a computer-delivered response type for measuring quantitative skill. "Generating Examples" (GE) presents under-determined problems that can have many right answers. We administered two GE tests that differed in the manipulation of specific item features hypothesized to affect difficulty. Analyses related to internal consistency reliability, external relations, and features contributing to item difficulty, adverse impact, and examinee perceptions. Results showed that GE scores were reasonably reliable but only moderately related to the GRE quantitative section, suggesting the two tests might be tapping somewhat different skills. Item features that increased difficulty included asking examinees to supply more than one correct answer and to identify whether an item was solvable. Gender differences were similar to those found on the GRE quantitative and analytical test sections. Finally, examinees were divided on whether GE items were a fairer indicator of ability than multiple-choice items, but still overwhelmingly preferred to take the more conventional questions.