References of "Magis, David"
Full Text
Peer Reviewed
The evaluation of vocal accuracy: The case of operatic singing voices
Larrouy, Pauline ULg; Magis, David ULg; Morsomme, Dominique ULg

in Music Perception (in press)

The objective analysis of Western operatic singing voices indicates that professional singers can be particularly “out of tune”. This study aims to better understand the evaluation of operatic voices, which have particularly complex acoustical signals. Twenty-two music experts were asked to evaluate the vocal pitch accuracy of 14 sung performances with a pairwise comparison paradigm, in a test and a retest. In addition to the objective measurement of pitch accuracy (pitch interval deviation), several performance parameters (average tempo, fundamental frequency of the starting note) and quality parameters (energy distribution, vibrato rate and extent) were observed and compared to the judges’ perceptual ratings. The results show high intra- and inter-judge reliability when rating the pitch accuracy of operatic singing voices. Surprisingly, all the parameters were significantly related to the ratings and explained 78.8% of the variability of the judges’ ratings. The pitch accuracy evaluation of operatic voices is thus based not exclusively on the precision of the performed music intervals but on a complex combination of performance and quality parameters.

Full Text
Peer Reviewed
Type I error inflation in DIF identification with Mantel-Haenszel: an explanation and a solution
Magis, David ULg; De Boeck, Paul

in Educational & Psychological Measurement (in press)

It is known that sum score-based methods for the identification of differential item functioning (DIF), such as the Mantel-Haenszel (MH) approach, can be affected by Type I error inflation in the absence of any DIF effect. This may happen when the items differ in discrimination and when there is item impact. On the other hand, outlier DIF methods have been developed that are robust against this Type I error inflation while still being based on the MH DIF statistic. The present paper explains why the common MH method is vulnerable to the inflation effect while the outlier DIF versions are not. In a simulation study we were able to produce the Type I error inflation by inducing item impact and item differences in discrimination. At the same time, and in parallel with the Type I error inflation, the dispersion of the DIF statistic across items increased. As expected, the outlier DIF methods were not sensitive to impact or to differences in item discrimination.

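Since the entry centers on the sum score-based MH statistic, a minimal numpy sketch may help fix ideas: it computes the MH chi-square (with continuity correction) and the common odds-ratio estimate for one studied item, stratifying on the total test score. The data, difficulties, and sample sizes are invented for illustration and are not those of the paper.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_rasch(theta, b):
    """Simulate 0/1 responses under the Rasch model."""
    p = 1 / (1 + np.exp(-(theta[:, None] - b[None, :])))
    return (rng.random(p.shape) < p).astype(int)

def mantel_haenszel(resp, group, item):
    """MH chi-square (continuity-corrected) and common odds ratio
    for one studied item, stratifying on the total test score."""
    score = resp.sum(axis=1)
    A = E = V = num = den = 0.0
    for s in np.unique(score):
        k = score == s
        ref, foc = k & (group == 0), k & (group == 1)
        a = resp[ref, item].sum()          # reference group, correct
        b_ = ref.sum() - a                 # reference group, incorrect
        c = resp[foc, item].sum()          # focal group, correct
        d = foc.sum() - c                  # focal group, incorrect
        T = a + b_ + c + d
        if T < 2 or (a + c) == 0 or (b_ + d) == 0:
            continue                       # stratum carries no information
        A += a
        E += (a + b_) * (a + c) / T
        V += (a + b_) * (c + d) * (a + c) * (b_ + d) / (T**2 * (T - 1))
        num += a * d / T
        den += b_ * c / T
    chi2 = (abs(A - E) - 0.5) ** 2 / V
    return chi2, num / den

b = np.linspace(-1.5, 1.5, 10)             # Rasch difficulties (illustrative)
theta = rng.normal(size=2000)              # equal ability distributions: no impact
group = (np.arange(2000) >= 1000).astype(int)
resp = simulate_rasch(theta, b)
chi2, alpha = mantel_haenszel(resp, group, 0)   # no DIF was simulated here
```

Stratifying on the observed sum score is exactly what makes the method vulnerable in the scenario the paper studies: when items differ in discrimination and impact is present, the sum score is no longer an adequate proxy for ability, so group differences leak into the 2x2 tables.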
Full Text
Peer Reviewed
On the asymptotic standard error of a class of robust estimators of ability in dichotomous item response models
Magis, David ULg

in British Journal of Mathematical & Statistical Psychology (in press)

In item response theory, the classical estimators of ability are highly sensitive to response disturbances and can return strongly biased estimates of the true underlying ability level. Robust methods were introduced to lessen the impact of such aberrant responses on the estimation process. The computation of asymptotic (i.e., large-sample) standard errors (ASE) for these robust estimators, however, has not been fully considered yet. This paper focuses on a broad class of robust ability estimators, defined by an appropriate selection of the weight function and the residual measure, for which the ASE is derived from the theory of estimating equations. The maximum likelihood (ML) and the robust estimators, together with their estimated ASEs, are then compared through a simulation study. It is concluded that both estimators and their ASEs perform similarly in the absence of response disturbances, while the robust estimator and its estimated ASE are less biased and outperform their ML counterparts in the presence of response disturbances with a large impact on the item response process.

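The estimators in this class solve a weighted score equation. The following is a minimal sketch under the Rasch model, using the Tukey biweight weight of Mislevy and Bock (1982) with residual r_i = theta - b_i; the difficulties, the tuning constant B = 4, the disturbed response pattern, and the simple fixed-point iteration are all illustrative assumptions, not the paper's design.

```python
import numpy as np

b = np.linspace(-2.0, 2.0, 20)           # Rasch item difficulties (illustrative)

def P(theta):
    return 1 / (1 + np.exp(-(theta - b)))

def ml_estimate(x, iters=50):
    """Maximum likelihood ability estimate (Newton-Raphson)."""
    th = 0.0
    for _ in range(iters):
        p = P(th)
        th += np.sum(x - p) / np.sum(p * (1 - p))
    return th

def biweight(r, B=4.0):
    """Tukey biweight weight function (Mislevy & Bock, 1982)."""
    w = (1 - (r / B) ** 2) ** 2
    return np.where(np.abs(r) < B, w, 0.0)

def robust_estimate(x, iters=50):
    """Root of the weighted score equation sum w(r_i)(x_i - P_i) = 0,
    with residual r_i = theta - b_i (Rasch case)."""
    th = ml_estimate(x)                  # start from the ML solution
    for _ in range(iters):
        p = P(th)
        w = biweight(th - b)
        th += np.sum(w * (x - p)) / np.sum(w * p * (1 - p))
    return th

clean = (b < 1.0).astype(float)          # pattern consistent with theta near 1
disturbed = clean.copy()
disturbed[:3] = 0.0                      # careless errors on the 3 easiest items

ml_c, rob_c = ml_estimate(clean), robust_estimate(clean)
ml_d, rob_d = ml_estimate(disturbed), robust_estimate(disturbed)
```

Because the biweight downweights items far from the current ability value, the careless errors on the easiest items should pull the robust estimate down less than the ML estimate, which is the behavior the paper's simulation study quantifies.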
Full Text
Peer Reviewed
A note on weighted likelihood and Jeffreys modal estimation of proficiency levels in polytomous item response models
Magis, David ULg

in Psychometrika (in press)

Warm (1989) established the equivalence between the so-called Jeffreys modal and the weighted likelihood estimators of proficiency level for some dichotomous item response models. The purpose of this note is to extend this result to polytomous item response models. First, a general condition is derived to ensure the perfect equivalence between the two estimators. Second, it is shown that this condition is fulfilled by two broad classes of polytomous models, including, among others, the partial credit, rating scale, graded response and nominal response models.

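The dichotomous case proved by Warm (1989) can be checked numerically: under the Rasch model the WLE solves the weighted likelihood equation S(theta) + I'(theta)/(2 I(theta)) = 0, while the Jeffreys modal estimator maximizes log L(theta) + 0.5 log I(theta), and the two coincide. A small Python check on an invented item set follows; the note's extension replaces these Rasch quantities with their polytomous counterparts.

```python
import numpy as np

b = np.linspace(-2.0, 2.0, 12)           # Rasch difficulties (illustrative)
x = (b < 0.5).astype(float)              # a fixed response pattern

def P(theta):
    return 1 / (1 + np.exp(-(theta - b)))

def wl_equation(theta):
    """Warm's weighted likelihood equation for the Rasch model:
    score function plus the correction I'(theta) / (2 I(theta))."""
    p = P(theta)
    info = np.sum(p * (1 - p))
    d_info = np.sum(p * (1 - p) * (1 - 2 * p))   # derivative of the information
    return np.sum(x - p) + d_info / (2 * info)

# WLE: root of the weighted likelihood equation, by bisection
lo, hi = -5.0, 5.0
for _ in range(100):
    mid = (lo + hi) / 2
    if wl_equation(lo) * wl_equation(mid) <= 0:
        hi = mid
    else:
        lo = mid
wle = (lo + hi) / 2

# Jeffreys modal estimate: maximize log L(theta) + 0.5 * log I(theta)
grid = np.linspace(-4, 4, 8001)
obj = []
for t in grid:
    p = P(t)
    loglik = np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))
    obj.append(loglik + 0.5 * np.log(np.sum(p * (1 - p))))
jeffreys = grid[int(np.argmax(obj))]
```

The two estimates agree up to the grid resolution, which is exactly the equivalence the note generalizes.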
Full Text
Peer Reviewed
The effect of melody and technique on the singing voice accuracy of trained singers
Larrouy, Pauline ULg; Magis, David ULg; Morsomme, Dominique ULg

in Logopedics, Phoniatrics, Vocology (in press)

A previous study highlighted the effect of vocal technique on the singing voice accuracy of trained singers (1). The precision of the intervals between the notes of the tune was altered when the singers used the Western operatic singing technique. In order to better understand these results, we recorded two different melodies sung with two different vocal techniques. A large panel of trained singers (N = 50) participated in the study. The analytical method described in the reference paper (1) was applied. The results confirm the effect of vocal technique on the vocal accuracy of trained singers. In addition, they clarify the effect of melody and guide future work on the perception of operatic voices.

Full Text
Peer Reviewed
Passage de l’administration fixe d’un test à une administration adaptative : application au TCALS-II
Magis, David ULg; Raîche, Gilles

in Raîche, Gilles; Ndinga, Pascal; Meunier, Hélène (Eds.) L'interdisciplinarité de la mesure et de l’évaluation (in press)

We study the problem of moving from a fixed (paper-and-pencil) test administration to an adaptive one. A two-step method is presented. First, response patterns are generated under a fixed administration in order to determine admissible values of the standard error of the ability estimate. These values are then used as stopping criteria during an adaptive administration of the same test. Test length is then considered to evaluate the quality of the adaptive test relative to its fixed version. The college-level English-as-a-second-language placement test (TCALS-II) is used as an illustration. It is shown that an adaptive administration of the TCALS-II would substantially reduce test length without any loss of quality in the estimation of ability levels. This improvement, however, is limited to subjects whose ability level is neither too low nor too high.

Peer Reviewed
Detection of differential item functioning using the lasso approach
Magis, David ULg; Tuerlinckx, Francis; De Boeck, Paul

Conference (2014, July 22)

The purpose of this talk is to present a novel approach to detecting differential item functioning (DIF) among dichotomously scored items. Unlike standard DIF methods that perform an item-by-item analysis, we consider a logistic regression model that includes the item-group interaction (i.e., DIF) effects of all items simultaneously. The method is based on penalized maximum likelihood estimation of a model with a lasso penalty on all possible DIF parameters. Optimal penalty parameter selection is investigated through several known information criteria (such as AIC and BIC) as well as a newly developed weighted alternative. A simulation study was conducted to compare the global performance of the suggested “lasso DIF” method to the logistic regression and Mantel-Haenszel methods, and to evaluate the different penalty parameter selection methods. It is concluded that for small samples the lasso DIF approach globally outperforms the logistic regression method, and also the Mantel-Haenszel method, especially in the presence of item impact, while it yields similar results with larger samples.

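A numpy sketch of the penalized-likelihood idea, with several simplifications relative to the talk's model: the ability term is a fixed standardized sum-score offset (so the items decouple), only the group-by-item interactions carry the lasso penalty, and the penalty level is hand-picked rather than selected by an information criterion. All data and constants are invented.

```python
import numpy as np

rng = np.random.default_rng(3)
N, J, lam = 2000, 10, 40.0
theta = rng.normal(size=N)
group = (np.arange(N) >= N // 2).astype(float)   # 0 = reference, 1 = focal
b = np.linspace(-1.5, 1.5, J)

# Simulate Rasch responses; item 0 is 1 logit harder for the focal group (DIF)
diff = b[None, :] + np.outer(group, np.r_[1.0, np.zeros(J - 1)])
y = (rng.random((N, J)) < 1 / (1 + np.exp(-(theta[:, None] - diff)))).astype(float)

s = y.sum(1)
z = (s - s.mean()) / s.std()      # crude ability proxy, used as a fixed offset

a = np.zeros(J)                   # item intercepts (unpenalized)
gamma = np.zeros(J)               # group-by-item DIF effects (lasso-penalized)
step = 2.0 / N
for _ in range(3000):             # proximal gradient descent (ISTA)
    logit = z[:, None] + a[None, :] + group[:, None] * gamma[None, :]
    r = 1 / (1 + np.exp(-logit)) - y          # gradient residuals
    a -= step * r.sum(0)
    g = gamma - step * (r * group[:, None]).sum(0)
    gamma = np.sign(g) * np.maximum(np.abs(g) - step * lam, 0.0)  # soft threshold
```

The soft-thresholding step sets the interaction of most DIF-free items exactly to zero while leaving a clearly nonzero (negative) effect on the item simulated with DIF, which is the selection behavior the lasso penalty is meant to provide.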
What’s beyond Concerto: an introduction to the R package catR
Magis, David ULg

Scientific conference (2014, June 10)

Full Text
Open-source CAT software: R packages and Concerto
Magis, David ULg

Conference (2014, March 12)

Together with the investigation of new or updated CAT procedures, it is of primary importance to ensure the development of appropriate, flexible and useful CAT software. Open-source CAT algorithms have recently been proposed and, though still under development, offer very promising tools for future practical CAT implementations. After a brief overview of available (commercial) software, I will present and compare the characteristics of some open-source R packages as CAT solutions: catR (Magis & Raîche, 2012), catIrt (Nydick, 2013) and MAT (Choi, 2011), as well as the R-based software Firestar (Choi, 2009). A more complete description of catR will be given and (depending on time and computer constraints) a short illustrative session will be proposed. Finally, the web platform Concerto (Kosinski & Rust, 2011) will be briefly introduced. References: Choi, S. W. (2009). Firestar: Computerized adaptive testing simulation program for polytomous item response theory models. Applied Psychological Measurement, 33, 644-645. Choi, S. W. (2011). MAT: Multidimensional Adaptive Testing (MAT). R package version 0.1-3. Kosinski, M., & Rust, J. (2011). The development of Concerto: An open source online adaptive testing platform. Paper presented at the International Association for Computerized Adaptive Testing, Pacific Grove, CA. Magis, D., & Raîche, G. (2012). Random generation of response patterns under computerized adaptive testing with the R package catR. Journal of Statistical Software, 48(8), 1-31. Nydick, S. W. (2013). catIrt: An R package for simulating IRT-based computerized adaptive tests. R package version 0.4-1.

Full Text
An efficient standard error formula for the weighted likelihood estimator of ability
Magis, David ULg

Scientific conference (2014, March 04)

The weighted likelihood estimator (WLE; Warm, 1989) has become a very popular ability estimator in item response theory. It was developed essentially to cancel the estimation bias of the maximum likelihood estimator (MLE) with short tests. However, some uncertainty remains about its standard error formula. Warm (1989) established the asymptotic equivalence of the standard errors of the MLE and the WLE, but various practical formulas actually exist for the latter (Magis & Raîche, 2012; Nydick, 2013; Partchev, 2012; Warm, 2007), which obviously leads to confusion. The purpose of this talk is to briefly sketch a general approach to deriving a first-order approximation of the standard error of the WLE. Based on asymptotic adjustments of the law of large numbers and the central limit theorem, a simple formula is derived. Its efficiency is compared to the aforementioned formulas by means of a simulation study under Rasch modeling. Preliminary results indicate that the derived formula has lower bias and RMSE than any competing formula, especially with short tests. All formulas behave similarly with longer tests, as expected.

Full Text
Peer Reviewed
Recent advances and improvements in computerized adaptive testing with the R package catR
Magis, David ULg; Barrada, Juan

Conference (2014, February 14)

The purpose of this talk is to present recent advances and developments of the R package catR. This package allows for the random generation of response patterns under computerized adaptive testing (CAT) and holds various options to select the first items, estimate ability, select the next item, stop the test and return final results. Two main improvements were realized. First, several rules for next item selection were added, among others the Kullback-Leibler, progressive and proportional methods. Second, catR was until now limited to dichotomous IRT models; the most recent update allows for several polytomous IRT models, such as the partial and generalized partial credit models, the graded and modified graded response models, the rating scale model, and the nominal response model. All improvements will be briefly presented, both from a theoretical point of view and in terms of practical implementation in catR. If possible, several illustrative examples will be displayed in a short live demonstration of catR.

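The options listed (first item, ability estimation, next item selection, stopping rule) make up the generic CAT loop. The sketch below is not catR code; it is a toy Python rendering of that loop under the Rasch model, with maximum-information next item selection, EAP ability estimation, and a combined fixed-length/precision stopping rule. Bank size, seed and thresholds are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
bank_b = rng.uniform(-3, 3, 100)         # Rasch difficulties of the item bank
true_theta = 0.8
quad = np.linspace(-4, 4, 161)           # quadrature grid for EAP estimation
prior = np.exp(-quad**2 / 2)             # standard normal prior (unnormalized)

def prob(theta, b):
    return 1 / (1 + np.exp(-(theta - b)))

def eap(items, responses):
    """Expected a posteriori ability estimate and posterior SD."""
    post = prior.copy()
    for i, u in zip(items, responses):
        p = prob(quad, bank_b[i])
        post *= p if u else (1 - p)
    post /= post.sum()
    est = np.sum(quad * post)
    se = np.sqrt(np.sum((quad - est) ** 2 * post))
    return est, se

administered, responses = [], []
est, se = 0.0, 1.0
while len(administered) < 20 and se > 0.35:   # fixed-length / precision stop
    p = prob(est, bank_b)
    info = p * (1 - p)                        # Rasch item information at est
    info[administered] = -np.inf              # never reuse an item
    nxt = int(np.argmax(info))                # maximum-information rule
    administered.append(nxt)
    responses.append(rng.random() < prob(true_theta, bank_b[nxt]))
    est, se = eap(administered, responses)
```

Each pass selects the most informative remaining item at the current estimate, simulates a response from the true ability, and rescores; the test ends as soon as the posterior SD drops below the precision threshold or the length cap is reached.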
Full Text
Peer Reviewed
Effects of melody and technique on acoustical and musical features of Western operatic singing voices
Larrouy, Pauline ULg; Magis, David ULg; Morsomme, Dominique ULg

in Journal of Voice (2014)

Objective: The operatic singing technique is frequently employed in classical music. Several acoustical parameters of this specific technique have been studied, but how these parameters combine remains unclear. This study aims to further characterize the Western operatic singing technique by observing the effects of melody and technique on acoustical and musical parameters of the singing voice. Methods: Fifty professional singers performed two contrasting melodies (a popular song and a romantic melody) with two vocal techniques (with and without operatic singing technique). The common quality parameters (energy distribution, vibrato rate and extent), perturbation parameters (standard deviation of the fundamental frequency, signal-to-noise ratio, jitter and shimmer) and musical features (fundamental frequency of the starting note, average tempo, and sound pressure level) of the 200 sung performances were analyzed. Results: The choice of melody had a limited impact on the parameters observed, whereas a particular vocal profile appeared depending on the vocal technique employed. Conclusions: This study confirms that vocal technique affects most of the parameters examined. In addition, the observation of quality, perturbation and musical parameters contributes to a better understanding of the Western operatic singing technique.

Full Text
Peer Reviewed
Snijders’ correction of Infit and Outfit indexes with estimated ability level: an analysis with the Rasch model
Magis, David ULg; Béland, Sébastien; Raîche, Gilles

in Journal of Applied Measurement (2014), 15

The Infit mean square W and the Outfit mean square U are commonly used person-fit indexes under Rasch measurement. However, they suffer from two major weaknesses. First, their asymptotic distribution is usually derived by assuming that the true ability levels are known. Second, such distributions are not even clearly stated for the indexes U and W. Both issues can seriously affect the selection of an appropriate cut-score for person-fit identification. Snijders (2001) proposed a general approach to correct some person-fit indexes when specific ability estimators are used. The purpose of this paper is to adapt this approach to the U and W indexes. First, a brief sketch of the methodology and its application to U and W is proposed. Then, the corrected indexes are compared to their classical versions through a simulation study. The suggested correction yields controlled Type I errors against both conservatism and inflation, while the power to detect specific misfitting response patterns is significantly increased.

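For reference, the two indexes have simple closed forms under the Rasch model. The sketch below computes both at the estimated ability, which is precisely the situation the Snijders-type correction addresses; the item difficulties and response patterns are invented.

```python
import numpy as np

b = np.linspace(-2.0, 2.0, 15)           # Rasch difficulties (illustrative)

def P(theta):
    return 1 / (1 + np.exp(-(theta - b)))

def ml_theta(x, iters=50):
    """Maximum likelihood ability estimate (Newton-Raphson)."""
    th = 0.0
    for _ in range(iters):
        p = P(th)
        th += np.sum(x - p) / np.sum(p * (1 - p))
    return th

def infit_outfit(x):
    """Infit (W) and outfit (U) mean squares, evaluated -- as in practice --
    at the estimated rather than the true ability."""
    p = P(ml_theta(x))
    z2 = (x - p) ** 2 / (p * (1 - p))    # squared standardized residuals
    U = np.mean(z2)                      # outfit: unweighted mean square
    W = np.sum((x - p) ** 2) / np.sum(p * (1 - p))  # infit: information-weighted
    return W, U

regular = (b < 0.5).astype(float)        # Guttman-like, model-consistent pattern
odd = regular.copy()
odd[0], odd[-1] = 0.0, 1.0               # miss the easiest, pass the hardest item

W1, U1 = infit_outfit(regular)
W2, U2 = infit_outfit(odd)
```

Both mean squares rise once the surprising responses are introduced, while the ability estimate itself is unchanged because the raw score is; the outfit reacts more strongly since the surprises sit on extreme items.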
Full Text
Peer Reviewed
Accuracy of asymptotic standard errors of the maximum and weighted likelihood estimators of proficiency levels with short tests
Magis, David ULg

in Applied Psychological Measurement (2014), 38

The maximum likelihood (ML) and the weighted likelihood (WL) estimators are commonly used to obtain proficiency level estimates with pre-calibrated item parameters. Both estimators have the same asymptotic standard error (ASE), which can easily be derived from the expected information function of the test. However, the accuracy of this asymptotic formula is uncertain with short tests, when only a few items are administered. The purpose of this paper is to compare the ASE of these estimators to their exact values, evaluated at the proficiency level estimates. The exact SE is computed by generating the full exact sample distribution of the estimators, so its practical feasibility is limited to small tests (except under the Rasch model). A simulation study was conducted to compare the ASE and the exact SE of the ML and WL estimators to the “true” SE (i.e., the exact SE computed with the true proficiency levels). It is concluded that with small tests the exact SEs are less biased and return smaller root mean squared error values than the asymptotic SEs, while, as expected, the two estimators return similar results with longer tests.

Full Text
Using R for conducting psychometric research
Magis, David ULg

Scientific conference (2013, October 31)

R is a statistical environment that makes it possible to perform a wide variety of analyses, from basic descriptive statistics to advanced and complex modeling, together with high-level graphical features. In addition, it is open-source software with thousands of users worldwide and (almost) daily improvements through the development of additional packages. The purpose of this talk is twofold: (a) to provide a broad overview of R (in its basic form), RStudio (software for an optimized display of R components) and R Commander (an add-on package with menus and toolboxes for R); and (b) to describe and illustrate the functioning of several R packages related to psychometrics in general and item response theory in particular. The talk will be kept at a low theoretical level, focusing instead on applications and illustrations with examples. Topics to be covered may include (depending on time constraints): calibration of item response models, ability estimation, model checking, differential item functioning, and computerized adaptive testing. Some information about the creation and compilation of R packages will also be given. References: R Core Team (2013). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.

Full Text
Asymptotic distribution of robust estimators of ability and applications
Magis, David ULg

Scientific conference (2013, October 31)

In item response theory (IRT), ability estimation can be seriously affected by abnormal responses arising from, e.g., cheating, inattention, lack of time, guessing, tiredness, or stress. Those phenomena may influence the ability estimation process tremendously. On the one hand, person-fit indices were developed as post-hoc approaches to identify abnormal response patterns as a whole (e.g., Meijer & Sijtsma, 2001). On the other hand, getting uncontaminated ability estimates is also a challenging issue. Robust estimators were proposed in the IRT framework to lessen the impact of abnormal responses on the estimation process (Mislevy & Bock, 1982; Schuster & Yuan, 2011; Wainer & Wright, 1980). Yet, these estimators are still rarely used in practice, mostly because very little is known about their statistical properties. The purpose of this talk is to briefly present these robust ability estimators and to derive their asymptotic distribution under mild regularity conditions. In particular, a simple formula for the asymptotic standard error (ASE) of these estimators is obtained (Magis, in press). Results of a simulation study that involves both presence and absence of cheating in the data generation process will be outlined. References: Magis, D. (in press). On the asymptotic standard error of a class of robust estimators of ability in dichotomous item response models. British Journal of Mathematical and Statistical Psychology. Meijer, R., & Sijtsma, K. (2001). Methodology review: Evaluating person fit. Applied Psychological Measurement, 25, 107-135. doi: 10.1177/01466210122031957 Mislevy, R. J., & Bock, R. D. (1982). Biweight estimates of latent ability. Educational and Psychological Measurement, 42, 725-737. doi: 10.1177/001316448204200302 Schuster, C., & Yuan, K.-H. (2011). Robust estimation of latent ability in item response models. Journal of Educational and Behavioral Statistics, 36, 720-735. doi: 10.3102/1076998610396890 Wainer, H., & Wright, B. D. (1980). Robust estimation of ability in the Rasch model. Psychometrika, 45, 373-391. doi: 10.1007/BF02293910

Full Text
Peer Reviewed
Asymptotic standard errors of robust estimators of ability in dichotomous item response models
Magis, David ULg

Conference (2013, October 10)

In item response theory, the classical estimators of ability are highly sensitive to response disturbances and can return strongly biased estimates of the true underlying ability level. Robust methods were introduced to lessen the impact of such aberrant responses on the estimation process. The computation of asymptotic (i.e., large-sample) standard errors (ASE) for these robust estimators, however, has not been fully considered yet. This paper focuses on a broad class of robust ability estimators, defined by an appropriate selection of the weight function and the residual measure, for which the ASE is derived from the theory of estimating equations. The maximum likelihood (ML) and the robust estimators, together with their estimated ASEs, are then compared through a simulation study. It is concluded that both estimators and their ASEs perform similarly in the absence of response disturbances, while the robust estimator and its estimated ASE are less biased and outperform their ML counterparts in the presence of response disturbances with a large impact on the item response process.

Full Text
Peer Reviewed
A small overview of available computer software to support computerized adaptive testing
Magis, David ULg

Conference (2013, August 27)

Computerized adaptive testing (CAT) is becoming a central tool for testing and assessment. It offers many advantages over fixed (“paper-and-pencil”) methods, such as individualized assessment, reduction of fraud, and straightforward estimation of proficiency levels. CAT has been studied for decades and remains an up-to-date research field in psychometrics and educational science. Practical CAT administration, however, is less frequently considered in such studies. Assigning CAT to respondents requires both the sufficient availability of computer machines and the use of powerful and easy-to-use CAT software. With the fast increase of computer resources at moderate cost, the availability of computer machines is becoming a less central, yet still important, issue in the practical assessment of CAT tests. The choice of an accurate CAT software package, on the other hand, should be guided by its flexibility, its underlying statistical modeling, and its user-friendliness. Depending on the type of research or data analysis, one CAT software package might be preferred over another. It is therefore important for researchers and clinicians to know about the current availability of such software, in line with current research and practice in the CAT framework. Moreover, this software should allow enough flexibility to incorporate updates and new theoretical developments, such as new rules for next item selection. This talk proposes a simple and user-oriented presentation of several CAT software packages that are currently available: the Firestar software (Choi, 2009), the R package catR (Magis & Raîche, 2012), the R package catIrt (Nydick, 2012) and the CAT web platform Concerto (Kosinski & Rust, 2011). The first three are non-commercial, while Concerto is a web interface between end users (willing to develop computerized assessment tests) and catR (as the underlying routine software). Both R packages are written to be most useful for researchers, without an end-user interface, and are therefore less appealing for applied researchers who are not familiar with R. Yet, they offer flexible solutions by means of many options to optimize the design of the test and to generate many response patterns for further analyses. Also, they can easily be integrated as sub-routines of more sophisticated CAT software. Firestar provides a user interface and makes all necessary computations with underlying R code. This talk focuses on freely available CAT software; for this reason, only the four aforementioned programs will be presented, although other commercial CAT software exists, such as the CATSim software (Assessment Systems Corporation, 2012). The different CAT programs are briefly presented, and their advantages and drawbacks, flexibility and usefulness are compared, mostly from the point of view of the applied researcher and clinician. The following criteria were retained for objective comparison: (a) their main goal of application; (b) the type of data and IRT modeling they can deal with; (c) the type of users they are focusing on; (d) their operating options; and (e) their availability and flexibility for further improvements. A small demonstration of the R package catR will optionally be proposed, depending on time limitations. References: Assessment Systems Corporation (2012). CATSim: Comprehensive simulation of computerized adaptive testing. St. Paul, MN. URL: http://www.assess.com/. Choi, S. W. (2009). Firestar: Computerized adaptive testing simulation program for polytomous item response theory models. Applied Psychological Measurement, 33, 644-645. Kosinski, M., & Rust, J. (2011). The development of Concerto: An open source online adaptive testing platform. Paper presented at the International Association for Computerized Adaptive Testing (IACAT), Pacific Grove, CA. Magis, D., & Raîche, G. (2012). Random generation of response patterns under computerized adaptive testing with the R package catR. Journal of Statistical Software, 48, 1-31. Nydick, S. W. (2012). catIrt: An R package for simulating IRT-based computerized adaptive tests. R package version 0.3-0.

Full Text
Peer Reviewed
Equivalence of weighted likelihood and Jeffreys modal estimation of proficiency under polytomous item response models
Magis, David ULg

Conference (2013, July 23)

This talk focuses on two proficiency level estimators in the item response theory (IRT) framework: the weighted likelihood estimator (WLE) and the Jeffreys modal estimator (JME), that is, the usual Bayes modal estimator with Jeffreys’ non-informative prior. With dichotomously scored items, the WLE and the JME are completely equivalent under the two-parameter logistic model, while remarkable relationships were established under the three-parameter logistic model. The purpose of this talk is to extend this comparison to polytomously scored items. It is shown that the WLE and the JME are also equivalent for two broad classes of polytomous IRT models, including, among others, the (modified) graded response model, the (generalized) partial credit model, the rating scale model and the nominal response model. Parallels with dichotomously scored items are drawn. An example based on a real data set is used to illustrate this finding.

Peer Reviewed
Proposition de nouveaux indices de détection de patrons de réponses inappropriés dans le contexte des enquêtes et des épreuves d’évaluation des apprentissages
Béland, Sébastien; Raîche, Gilles; Magis, David ULg

Conference (2013, June 07)

It is not uncommon to see students respond inappropriately to an assessment composed of selected-response items. For instance, some individuals may cheat, while others may intentionally try to underperform on an exam. Several approaches have been developed to detect such students. To date, the most promising approach is the use of person-fit detection indices (Meijer & Sijtsma, 2001). The lz detection index (Drasgow, Levine & Williams, 1985) is most probably the best known and most widely used of them all. Unfortunately, this index is strongly affected by the fact that student ability is estimated rather than known, which can bias its computation (Molenaar & Hoijtink, 1990). To overcome this problem, Snijders (2001) proposed a correction that considerably reduces the bias in the mean and variance of the lz index. As part of our doctoral project, we draw on Snijders’ (2001) suggestion to correct two other indices for detecting inappropriate response patterns: the infit mean square (u) and the outfit mean square (w). To this end, we use a Monte Carlo approach to investigate in more detail the Type I error and the power of these indices. Our preliminary results, which will be presented in this talk, show that these other corrected indices also appear more effective than their traditional, uncorrected versions.

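The classical lz index mentioned above has a simple closed form. The Python sketch below evaluates it at a fixed, known ability value; the bias discussed in the abstract arises precisely when this value is replaced by an estimate. Item difficulties and response patterns are invented, with a reversed Guttman pattern playing the role of a maximally inappropriate respondent.

```python
import numpy as np

b = np.linspace(-2.0, 2.0, 30)           # Rasch difficulties (illustrative)

def P(theta):
    return 1 / (1 + np.exp(-(theta - b)))

def lz(x, theta):
    """Classical lz person-fit index (Drasgow, Levine & Williams, 1985),
    evaluated at a given ability value."""
    p = P(theta)
    q = 1 - p
    l0 = np.sum(x * np.log(p) + (1 - x) * np.log(q))       # log-likelihood
    mean = np.sum(p * np.log(p) + q * np.log(q))           # its expectation
    var = np.sum(p * q * np.log(p / q) ** 2)               # its variance
    return (l0 - mean) / np.sqrt(var)

consistent = (b < 0).astype(float)       # most likely pattern at theta = 0
reversed_g = 1 - consistent              # maximally aberrant pattern

lz_ok = lz(consistent, 0.0)
lz_bad = lz(reversed_g, 0.0)
```

Large negative values flag misfit: the aberrant pattern (failing every easy item, passing every hard one) lands far below zero, while the model-consistent pattern does not.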