[en] This paper focuses on two likelihood-based indices of person fit: the index lz (Drasgow, Levine & Williams, 1985) and Snijders' modified index lz* (Snijders, 2001). The former is commonly used in the practical assessment of person fit, although its asymptotic standard normal distribution does not hold when true abilities are replaced by sample ability estimates. The lz* index is a generalization of lz that corrects for this sampling variability. Surprisingly, it is not yet popular in the psychometric and educational assessment community. Moreover, there is some ambiguity about which types of item response model and which ability estimation methods can be used to compute the lz* index. The purpose of this paper is to present the lz* index in a simple, didactic way. Starting from the relationship between lz and lz*, we develop the framework according to the type of logistic IRT model and the likelihood-based estimator of ability. The practical calculation of lz* is illustrated by analyzing a real data set from a language skills assessment.
Disciplines :
Mathematics
Author, co-author :
Magis, David ; Université de Liège - ULiège > Département de mathématique > Statistique mathématique
Raîche, Gilles
Béland, Sébastien
Language :
English
Title :
A didactic presentation of Snijders’ lz* index of person fit with emphasis on response model selection and ability estimation
Alternative titles :
[fr] Une présentation didactique de l'indice de misfit lz* de Snijders avec emphase sur la sélection d'un modèle de réponse et l'estimation de l'habileté
Armstrong R. D., Stoumbos Z. G., Kung M. T., Shi M. On the performance of the Lz person-fit statistics. Practical Assessment, Research and Evaluation. 2007;12:16.
Birnbaum A. Some latent trait models and their use in inferring an examinee's ability. In: Statistical theories of mental test scores. Lord F. M., Novick M. R., eds. Reading, MA: Addison-Wesley; 1968:395-479.
Bock R. D., Mislevy R. J. Adaptive EAP estimation of ability in a microcomputer environment. Applied Psychological Measurement. 1982;6:431-444.
Cronbach L. J. Response set and test validity. Educational and Psychological Measurement. 1946;6:475-494.
de la Torre J., Deng W. Improving person fit assessment by correcting the ability estimate and its reference distribution. Journal of Educational Measurement. 2008;45:159-177.
Dodeen H. The use of person-fit statistics to analyse placement tests. Paper presented at the Annual Meeting of the American Educational Research Association; Chicago, IL; 2003.
Drasgow F., Levine M. V., McLaughlin M. E. Detecting inappropriate test scores with optimal and practical appropriateness indices. Applied Psychological Measurement. 1987;11:59-79.
Drasgow F., Levine M. V., Williams E. A. Appropriateness measurement with polychotomous item response models and standardized indices. British Journal of Mathematical and Statistical Psychology. 1985;38:67-86.
Emons W. Detection and diagnosis of person misfit from patterns of summed polytomous item scores. Applied Psychological Measurement. 2009;33:599-619.
Ferrando P. J. Person reliability in personality measurement: An item response theory analysis. Applied Psychological Measurement. 2004;28:126-140.
Glas C. A. W., Dagohoy A. V. T. Person fit test for IRT models for polytomous items. Psychometrika. 2007;72:159-180.
Hoijtink H., Boomsma A. On person parameter estimation in the dichotomous Rasch model. In: Rasch models: Foundations, recent developments, and applications. Fischer G. H., Molenaar I. W., eds. New York, NY: Springer Verlag; 1995:53-68.
Jeffreys H. Theory of probability. Oxford, UK: Oxford University Press; 1939.
Jeffreys H. An invariant form for the prior probability in estimation problems. Proceedings of the Royal Society of London, Series A, Mathematical and Physical Sciences. 1946;186:453-461.
Karabatsos G. Comparing the aberrant response detection performance of thirty-six person-fit statistics. Applied Measurement in Education. 2003;16:277-298.
Klauer K. C. The assessment of person fit. In: Rasch models: Foundations, recent developments, and applications. Fischer G. H., Molenaar I. W., eds. New York, NY: Springer; 1995:97-110.
Laurier M. D., Froio L., Paero C., Fournier M. L'élaboration d'un test provincial pour le classement des étudiants en anglais langue seconde au collégial [The elaboration of a provincial English, as a second language, test to classify college students]. Québec, Canada: Direction générale de l'enseignement collégial, ministère de l'Éducation du Québec; 1998.
Levine M. V., Drasgow F. Appropriateness measurement: Validating studies and variable ability models. In: New horizons in testing. Weiss D. J., ed. New York, NY: Academic Press; 1980:110-131.
Levine M. V., Drasgow F. Appropriateness measurement: Review, critique, and validating studies. British Journal of Mathematical and Statistical Psychology. 1982;35:42-56.
Levine M. V., Drasgow F. Optimal appropriateness measurement. Psychometrika. 1988;53:161-176.
Levine M. V., Rubin D. B. Measuring the appropriateness of multiple-choice test scores. Journal of Educational Statistics. 1979;4:269-290.
Li M. F., Olejnik S. The power of Rasch person-fit statistics in detecting unusual response patterns. Applied Psychological Measurement. 1997;21:215-231.
Masters G. N. A Rasch model for partial credit scoring. Psychometrika. 1982;47:149-174.
Meijer R. Consistency of test behaviour and individual difference in precision of prediction. Journal of Occupational and Organizational Psychology. 1998;71:147-160.
Meijer R., Nering M. L. Trait level estimation for nonfitting response vectors. Applied Psychological Measurement. 1997;21:321-336.
Meijer R., Sijtsma K. Methodology review: Evaluating person fit. Applied Psychological Measurement. 2001;25:107-135.
Molenaar I. W., Hoijtink H. The many null distributions of person fit indices. Psychometrika. 1990;55:75-106.
Molenaar I. W., Hoijtink H. Person-fit and the Rasch model, with an application to knowledge of logical quantors. Applied Measurement in Education. 1996;9:27-45.
Nering M. L. The distribution of person fit using true and estimated person parameters. Applied Psychological Measurement. 1995;19:121-129.
Nering M. L. The distribution of indexes of person-fit within the computerized adaptive testing environment. Applied Psychological Measurement. 1997;21:115-127.
Nering M. L., Meijer R. R. A comparison of the person response function and the lz person-fit statistic. Applied Psychological Measurement. 1998;22:53-69.
Raîche G. Le dépistage de sous-classement aux tests de classement en anglais, langue seconde, au collégial [The detection of under-performance to college aptitude tests of English, as a second language]. Gatineau, Quebec, Canada: Collège de l'Outaouais; 2002.
Raîche G., Blais J.-G. Regards sur la modélisation de la mesure en éducation et en sciences sociales [Horizons in measurement modelling in education and social sciences]. Blais J.-G., Raîche G., eds. Ste-Foy, Québec, Canada: Presses de l'université Laval; 2003a.
Raîche G., Blais J.-G. The distribution of person-fit indices conditional on the estimated proficiency level and the detection of underachievement at a placement test. Paper presented at the International Meeting of the Psychometric Society; Cagliari, Italy; 2003b.
Raîche G., Blais J.-G. Characterization of the distribution of the Lz index of person fit according to the estimated proficiency level. Paper presented at the International Meeting of the Psychometric Society; Tilburg, the Netherlands; 2005.
R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2009.
Reise S. R. Scoring method and the detection of person misfit in a personality assessment context. Applied Psychological Measurement. 1995;19:213-229.
Rizopoulos D. ltm: An R package for latent variable modeling and item response theory analyses. Journal of Statistical Software. 2006;17:1-25.
Ro S. Characteristics of a likelihood-based person-fit index under the graded response model. Minneapolis: University of Minnesota; 2001.
Samejima F. Estimation of latent ability using a response pattern of graded scores. Psychometrika Monograph Supplement 17. 1969.
Sheather S. J., Jones M. C. A reliable data-based bandwidth selection method for kernel density estimation. Journal of the Royal Statistical Society Series B. 1991;53:683-690.
Silverman B. W. Density estimation. London, England: Chapman and Hall; 1986.
Smith R. M. Detecting measurement disturbances with the Rasch model. Chicago, IL: University of Chicago; 1982.
Snijders T. A. B. Asymptotic null distribution of person fit statistics with estimated person parameter. Psychometrika. 2001;66:331-342.
Tatsuoka K. K. Caution indices based on item response theory. Psychometrika. 1984;49:95-110.
Tatsuoka K. K. Use of generalized person-fit indexes, zetas for statistical pattern classification. Applied Measurement in Education. 1996;9:65-76.
Tatsuoka K. K., Linn R. L. Indices for detecting unusual patterns: Links between two general approaches and potential applications. Applied Psychological Measurement. 1983;7:81-96.
van Krimpen-Stoop E., Meijer R. The null distribution of person-fit statistics for conventional and adaptive tests. Applied Psychological Measurement. 1999;23:327-344.
Warm T. A. Weighted likelihood estimation of ability in item response models. Psychometrika. 1989;54:427-450.
Wright B. D., Masters G. N. Rating scale analysis. Chicago, IL: MESA Press; 1982.
Wright B. D., Stone M. H. Best test design. Rasch measurement. Chicago, IL: MESA Press; 1979.
Zickar M. J., Drasgow F. Detecting faking on a personality instrument using appropriateness measurement. Applied Psychological Measurement. 1996;20:71-87.