References of "Educational & Psychological Measurement"
Peer Reviewed
Type I error inflation in DIF identification with Mantel-Haenszel: an explanation and a solution
Magis, David (ULg); De Boeck, Paul

in Educational & Psychological Measurement (2014), 74

It is known that sum score-based methods for the identification of differential item functioning (DIF), such as the Mantel-Haenszel (MH) approach, can be affected by Type I error inflation in the absence of any DIF effect. This may happen when the items differ in discrimination and when there is item impact. On the other hand, outlier DIF methods have been developed that are robust against this Type I error inflation, although they are still based on the MH DIF statistic. The present paper explains why the common MH method is vulnerable to the inflation effect while the outlier DIF versions are not. In a simulation study, we were able to produce the Type I error inflation by inducing item impact and item differences in discrimination. In parallel with the Type I error inflation, the dispersion of the DIF statistic across items increased. As expected, the outlier DIF methods did not seem sensitive to impact and differences in item discrimination.
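As an illustration of the sum score-based statistic discussed in this abstract, the following is a minimal sketch of the Mantel-Haenszel common log-odds ratio for a single item, with examinees matched on their sum score. The function name, argument layout, and group coding (0 = reference, 1 = focal) are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import defaultdict


def mh_log_odds_ratio(responses, groups, scores):
    """Mantel-Haenszel common log-odds ratio for one studied item.

    responses: 0/1 answers to the studied item, one per examinee
    groups:    0 = reference group, 1 = focal group (assumed coding)
    scores:    sum scores used as the matching variable
    """
    # One 2x2 table per score level, indexed as table[group][response].
    tables = defaultdict(lambda: [[0, 0], [0, 0]])
    for y, g, s in zip(responses, groups, scores):
        tables[s][g][y] += 1

    num = 0.0
    den = 0.0
    for t in tables.values():
        a, b = t[0][1], t[0][0]  # reference group: correct, incorrect
        c, d = t[1][1], t[1][0]  # focal group: correct, incorrect
        n = a + b + c + d
        num += a * d / n
        den += b * c / n

    # Positive values mean the item favours the reference group.
    # No guard is included here: the call fails if num or den is zero.
    return math.log(num / den)
```

The sketch matches on the raw sum score only, which is the feature the abstract identifies as a source of Type I error inflation when items differ in discrimination and impact is present.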

Peer Reviewed
Item purification does not always improve DIF detection: a counter-example with Angoff’s Delta plot
Magis, David (ULg); Facon, Bruno

in Educational & Psychological Measurement (2013), 73

Item purification is an iterative process that is often advocated as improving the identification of items affected by differential item functioning (DIF). With test-score based DIF detection methods, item purification iteratively removes the items currently flagged as DIF from the test scores in order to get purified sets of items, unaffected by DIF. The purpose of this paper is to highlight that item purification is not always useful and that a single run of the DIF method may return equally suitable results. Angoff’s Delta plot is considered as a counter-example DIF method, with a recent improvement to the derivation of the classification threshold. Several possible item purification processes may be defined with this method, and all of them are compared through a simulation study and a real data set analysis. It appears that none of these purification processes clearly improves the Delta plot performance. A tentative explanation is drawn from the conceptual difference between the modified Delta plot and the other traditional DIF methods.
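The generic purification loop described for test-score based methods can be sketched as follows. This is a simplified illustration, not the procedure studied in the paper: `purify`, `flag_dif`, and `max_iter` are hypothetical names, and `flag_dif` stands for any user-supplied DIF method that returns the indices of flagged items given the matching scores.

```python
def purify(data, groups, flag_dif, max_iter=10):
    """Iteratively recompute test scores without the currently flagged items.

    data:     list of per-examinee lists of 0/1 item responses
    groups:   group membership, one entry per examinee
    flag_dif: hypothetical callable (data, groups, scores) -> iterable of
              flagged item indices
    """
    n_items = len(data[0])
    flagged = set()
    for _ in range(max_iter):
        # Test scores are built only from items not currently flagged.
        kept = [j for j in range(n_items) if j not in flagged]
        scores = [sum(row[j] for j in kept) for row in data]
        new_flagged = set(flag_dif(data, groups, scores))
        if new_flagged == flagged:
            break  # the flagged set is stable: stop iterating
        flagged = new_flagged
    return flagged
```

The loop stops when two successive runs flag the same items or when `max_iter` is reached. Passing `max_iter=1` corresponds to the single run of the DIF method that the abstract compares against.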

Peer Reviewed
A robust outlier approach to prevent Type I error inflation in DIF
Magis, David (ULg); De Boeck, Paul

in Educational & Psychological Measurement (2012), 72

The identification of differential item functioning (DIF) is often performed by means of statistical approaches that consider the raw scores as proxies for the ability trait level. One of the most popular approaches, the Mantel-Haenszel (MH) method, belongs to this category. However, replacing the ability level by the simple raw score is a source of potential Type I error inflation, especially in the presence of DIF but also when DIF is absent and in the presence of impact. The purpose of this paper is to present an alternative statistical inference approach based on the same measure of DIF but such that the Type I error inflation is prevented. The key notion is that for DIF items, the measure has an outlying value which can be identified as such with inference tools from robust statistics. Although we use the MH log-odds ratio as a statistic, the inference is different. A simulation study is performed to compare the robust statistical inference with the classical inference method, both based on the MH statistic. As expected, the Type I error rate inflation is avoided with the robust approach, while the power of the two methods is similar.
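The idea of treating DIF items as outliers among the per-item statistics can be illustrated with a small sketch. The abstract does not name the robust estimators or the decision threshold, so the choices below (median as location, scaled median absolute deviation as spread, a cutoff of 2.5) are assumptions for illustration only, as is the function name.

```python
from statistics import median


def flag_outlying_items(stats, cutoff=2.5):
    """Flag items whose DIF statistic is outlying relative to the others.

    stats:  one DIF statistic per item (for example MH log-odds ratios)
    cutoff: assumed threshold on the robust standardized distance
    """
    center = median(stats)
    # 1.4826 makes the MAD consistent with the standard deviation
    # under normality.
    spread = 1.4826 * median([abs(x - center) for x in stats])
    if spread == 0:
        return set()  # no dispersion to standardize against
    return {j for j, x in enumerate(stats) if abs(x - center) / spread > cutoff}
```

Because the location and spread are estimated from the items themselves, a shift or inflation in dispersion shared by all items does not by itself flag any of them, which is the property the abstract attributes to the outlier approach.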
