Cristin result ID: 1505138
Last modified: 17 October 2017, 12:02
Academic lecture

Explanatory Item Response Modelling Of An Abstract Reasoning Assessment: A Case For Modern Test Design.

  • Fredrik Helland-Riise and
  • Johan Braeken


Event name: European Conference on Educational Research (ECER) 2016
Place: Dublin
Date from: 22 August 2016
Date to: 26 August 2016


Organiser: European Educational Research Association

About the result

Academic lecture
Publication year: 2016

Description

General description. Reasoning tests are popular components of the assessment toolbox for selection and admission into higher education and employment (Leighton, 2004; Stanovich, Sá, & West, 2004). Abstract reasoning tests tap into a core reasoning ability (Carpenter, Just, & Shell, 1990) with tasks that require the examinee to generate and apply rules (Wüstenberg, Greiff, & Funke, 2012), but require neither explicit prior content-specific knowledge nor specific language skills (Raven, 2000). Traditionally, test construction and assembly have been the product of creative item-writing processes and post-hoc psychometric evaluations, without explicit consideration of cognitive theory (Hunt, Frost, & Lunneborg, 1973). Yet abstract reasoning is in principle ideally suited for modern test design (e.g., Embretson, 1998; Mislevy, Almond, & Lukas, 2003), which combines cognitive theory with a more systematic approach to the construction and assembly of test items.

Objective and research questions. This study is part of a larger project aimed at reverse engineering an existing abstract reasoning test from a modern test design perspective, to set up a virtual item bank that does not store individual items but instead uses automatic item-generation rules based on cognitive complexity (see, e.g., Gierl & Haladyna, 2013). The current study represents one step towards such a virtual item bank, with research questions focusing on (i) identifying the cognitively relevant item features (i.e., "radicals") that affect the behaviour of the test and of the participants, and (ii) identifying the merely cosmetic, irrelevant item features (i.e., "incidentals").

The test. The abstract reasoning test is composed of testlets, each consisting of items related to the same problem situation, from which a set of rules must be derived to solve the individual items. Each testlet is structured around a problem set consisting of a varying number of rows, each comprising a specified input stimulus configuration, an activated set of action buttons, and a resulting output stimulus configuration. From this problem set the examinee can derive the transformations that happen to the input when a specific action button is activated; this rule knowledge is necessary to solve the connected items. An item consists of a single row with a specified input stimulus configuration, the activated set of action buttons for that item, and four alternative output stimulus configurations, from which the examinee has to choose the correct one.

Theoretical framework. A rational task analysis of the abstract reasoning test proposes an artificial-intelligence algorithm (see Newell & Simon, 1972) consisting of four core steps: (1) Inventorisation: all the characteristics of the input and output stimulus configurations of the problem set are registered. (2) Matching: an input/output dissimilarity matrix is computed. (3) Rule finding: computationally, this is similar to solving a system of equations, or a greedier version using elimination. (4) Rule application. The test has characteristics built in by design that can be directly connected to this algorithm and to the related (i) cognitive load of the stimulus material and (ii) cognitive complexity of the rules to be derived. Examples of the former can be as simple as the number of symbols in the input stimulus configuration; examples of the latter include whether or not the transformation caused by a specific action button can be derived on its own (i.e., independently of the other action buttons in the problem set). Some theoretically irrelevant item features can also be defined, such as the type of symbols used in a stimulus configuration (e.g., triangle or circle).
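The four-step algorithm above can be sketched in code. The following is a minimal, hypothetical illustration, not the project's actual implementation: it assumes a stimulus configuration is a set of symbols and that each action button toggles a fixed subset of symbols (set symmetric difference), so rule finding reduces to a greedy elimination over the rows of a problem set, and rule application is a straightforward prediction of the output.

```python
# Toy model (an illustrative assumption, not the actual test's rules):
# a stimulus configuration is a set of symbols, and each action button
# toggles a fixed subset of symbols (set symmetric difference, "^").

def derive_rules(problem_set):
    """Greedy elimination (step 3): learn each button's toggle set from
    (input, active_buttons, output) rows of a testlet's problem set."""
    rules = {}
    rows = [(set(i), set(b), set(o)) for i, b, o in problem_set]
    progress = True
    while progress:
        progress = False
        for inp, buttons, out in rows:
            unknown = buttons - rules.keys()
            if len(unknown) != 1:
                continue  # row is fully solved or still underdetermined
            # The row's net effect is input XOR output; eliminate the
            # contributions of buttons whose rules are already known.
            effect = inp ^ out
            for b in buttons & rules.keys():
                effect ^= rules[b]
            rules[unknown.pop()] = effect
            progress = True
    return rules

def solve_item(rules, item_input, active_buttons, alternatives):
    """Rule application (step 4): predict the output configuration and
    return the index of the matching alternative."""
    predicted = set(item_input)
    for b in active_buttons:
        predicted ^= rules[b]
    for idx, alt in enumerate(alternatives):
        if set(alt) == predicted:
            return idx
    return None

# Hypothetical example testlet: button "B" appears only jointly with "A",
# so its rule must be found by elimination, as in the abstract's step 3.
problem_set = [
    ({"circle"}, {"A"}, {"circle", "square"}),   # "A" toggles a square
    ({"circle"}, {"A", "B"}, {"square"}),        # hence "B" toggles a circle
]
rules = derive_rules(problem_set)
# rules == {"A": {"square"}, "B": {"circle"}}
```

In this toy model, a button whose effect can be read off from a single-button row corresponds to a rule that "can be derived on its own", while a button that only ever co-occurs with others forces the elimination step, which is one way the design can manipulate cognitive complexity.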


Fredrik Helland-Riise

  • Affiliation:
    Educational Measurement, Universitetet i Oslo

Johan Braeken

  • Affiliation:
    Educational Measurement, Universitetet i Oslo