References of "Magis, David"
Full Text
Peer Reviewed
Rectal cancer treatment in a teaching hospital
Verdin, Vanessa; Weerts, Joseph; Francart, David et al

in Acta Chirurgica Belgica (in press)

Background: Morbidity and mortality after surgery for rectal adenocarcinoma may be worsened by neoadjuvant therapy. We performed this retrospective study for comparison with the PROCARE study that ran afterwards. Methods: We performed a retrospective study of 95 patients operated on for rectal adenocarcinoma in a single institution during the period 2007-2009. We used logistic regression to estimate the relationship between possible predictive parameters and anastomotic leakage (AL). Results: The laparoscopic approach was favored in 63.1% of the cases, with a conversion rate of 11.6%, mainly in men (6 out of 7). For low rectal cancer, however, laparotomy was the first choice (92.3%). From an oncological point of view, laparoscopy allowed complete tumor resection according to the PME (n=27) and TME (n=26) standards. Multivariate analysis revealed that female sex, lower BMI, lower rectal tumor, laparoscopic surgery, neoadjuvant treatment and anal suture were associated with a higher risk of AL. The mean hospital stay was 15.4 days (range 3 to 46 days). In-hospital mortality was 3.1%. Adjuvant chemotherapy was completed in 42.1% of the patients. Despite these treatments, we registered a recurrence rate of 26.6%. Of these recurrences, 72% were distally localized and 12% were exclusively local. Among the patients operated on by laparoscopy, there was one local recurrence and one local recurrence with distant metastases (3.7%). The one- and three-year survival rates were 91.5% and 80.4%, respectively. Conclusions: Our study showed a higher rate of AL than expected (18%). In our series recorded in PROCARE-Home, our leak rate has dropped to 10%. This may indicate a positive effect of PROCARE.

Full Text
Peer Reviewed
An item analysis of the French version of the Test for Reception of Grammar among children and adolescents with Down syndrome or intellectual disability of undifferentiated etiology
Facon, Bruno; Magis, David ULg

in Journal of Speech, Language, and Hearing Research (in press)

Purpose: An item analysis of Bishop’s (1983) Test for Reception of Grammar (TROG) in its French version (F-TROG, Lecocq, 1996) was conducted to determine whether the difficulty of items is similar for participants with or without intellectual disability (ID). Method: In Study 1, responses to the 92 F-TROG items by 55 participants with Down syndrome (DS), 55 with ID of undifferentiated etiology (UND) and 55 typical children (TYP) matched on their F-TROG total score were compared using the transformed item difficulties method, a statistical approach designed to detect differential item functioning (DIF) between groups. In Study 2, an additional comparison involving 526 TYP participants and 526 participants with UND was conducted to increase the statistical power of the analysis. Results: The difficulty of items was highly similar whatever the sample size or clinical status of participants. Fewer than 3.5% of the items were flagged as showing DIF. Conclusions: Tests such as the TROG can be used with confidence in clinical practice as well as in research studies comparing participants with or without ID. Methods designed for investigating potential internal test bias, as done here, should be employed more regularly in the developmental disability field to affirm the absence of DIF.
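
The transformed item difficulties idea mentioned above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the delta transformation `4 * z(1 - p) + 13`, the major-axis fit and the 1.5 flagging threshold follow the classical Delta plot description and are assumptions here, and the proportions `p_typ` and `p_id` are made-up illustrative values, not data from the study.

```python
import math
from statistics import NormalDist

def delta_scores(proportions_correct):
    """Map proportions correct to the delta scale: 4 * z(1 - p) + 13."""
    normal = NormalDist()
    return [4.0 * normal.inv_cdf(1.0 - p) + 13.0 for p in proportions_correct]

def flag_dif_items(p_reference, p_focal, threshold=1.5):
    """Return perpendicular distances to the major axis and flagged item indices."""
    x = delta_scores(p_reference)
    y = delta_scores(p_focal)
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_yy = sum((yi - mean_y) ** 2 for yi in y)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Major (principal) axis of the scatter of delta pairs.
    slope = (s_yy - s_xx + math.sqrt((s_yy - s_xx) ** 2 + 4.0 * s_xy ** 2)) / (2.0 * s_xy)
    intercept = mean_y - slope * mean_x
    norm = math.sqrt(slope ** 2 + 1.0)
    distances = [(slope * xi + intercept - yi) / norm for xi, yi in zip(x, y)]
    # Assumed rule: flag items lying more than `threshold` delta units from the axis.
    flagged = [i for i, d in enumerate(distances) if abs(d) > threshold]
    return distances, flagged

# Hypothetical proportions correct for eight items in two groups.
p_typ = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
p_id = [0.9, 0.8, 0.7, 0.2, 0.5, 0.4, 0.3, 0.2]
distances, flagged = flag_dif_items(p_typ, p_id)
print(flagged)
```

In this toy example only the fourth item is much harder in one group than in the other, so it is the only one that departs strongly from the major axis.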

Full Text
Peer Reviewed
On the finiteness of the weighted likelihood estimator of ability
Magis, David ULg; Verhelst, Norman

in Psychometrika (in press)

The purpose of this note is to focus on the finiteness of the weighted likelihood estimator (WLE) of ability in the context of dichotomous and polytomous item response theory (IRT) models. It is established that the WLE always returns finite ability estimates. This general result is valid for dichotomous (one-, two-, three- and four-parameter logistic) IRT models, the class of polytomous difference models and divide-by-total models, independently of the number of items, the item parameters and the response patterns. Further implications of this result are outlined.
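
As a numerical illustration only (not the proof given in the note), the following Python sketch computes the WLE under the Rasch model by solving Warm's weighted score equation, score + J / (2 I) = 0, with bisection. The item difficulties, the search interval [-15, 15] and the function names are assumptions chosen for the example. For an all-correct pattern the ordinary likelihood score stays positive everywhere, so the maximum likelihood estimate diverges, whereas the weighted score changes sign and yields a finite root.

```python
import math

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def ml_score(theta, responses, difficulties):
    """First derivative of the Rasch log-likelihood."""
    return sum(x - rasch_prob(theta, b) for x, b in zip(responses, difficulties))

def wle_score(theta, responses, difficulties):
    """Weighted score: likelihood score plus the correction J / (2 * I)."""
    info = j_term = 0.0
    for b in difficulties:
        p = rasch_prob(theta, b)
        q = 1.0 - p
        info += p * q
        j_term += p * q * (1.0 - 2.0 * p)
    return ml_score(theta, responses, difficulties) + j_term / (2.0 * info)

def wle_estimate(responses, difficulties, lo=-15.0, hi=15.0, tol=1e-8):
    """Find a root of the weighted score by bisection on [lo, hi]."""
    f_lo = wle_score(lo, responses, difficulties)
    f_hi = wle_score(hi, responses, difficulties)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change on the search interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = wle_score(mid, responses, difficulties)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

difficulties = [-1.0, -0.5, 0.0, 0.5, 1.0]   # hypothetical item bank
theta_all_correct = wle_estimate([1, 1, 1, 1, 1], difficulties)
theta_all_wrong = wle_estimate([0, 0, 0, 0, 0], difficulties)
print(theta_all_correct, theta_all_wrong)
```

Because the difficulties in this example are symmetric around zero, the two extreme patterns give estimates of equal magnitude and opposite sign.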

Full Text
Peer Reviewed
Étude de nouveaux indices de détection de la réponse au hasard et de l’inattention selon différentes valeurs de l’habileté dans le contexte de la modélisation de Rasch [A study of new indexes for detecting random responding and inattention across different ability values in the context of Rasch modeling]
Béland, Sébastien; Raîche, Gilles; Magis, David ULg et al

in Mesure et Evaluation en Education (in press)

Some students may respond at random or be inattentive in a testing situation. Several approaches have already been developed to detect this type of response. Among these, the use of person-fit indexes to detect inappropriate response patterns is the most studied and seemingly the most promising approach. In this study, we focus on three popular person-fit indexes whose characteristics make them easy to interpret: lz, ZU and ZW. Previous studies have shown that these three indexes are strongly affected by the fact that a student's ability is estimated rather than known. Snijders (2001) proposed a corrected version of the lz index (called lz*) to account for this difficulty. Magis, Béland and Raîche (2014) have already corrected two other indexes following Snijders' approach: U* and W*. However, the behavior of the corrected indexes lz*, U* and W* and of the standardized indexes lz, ZU and ZW remains to be analyzed in more detail. To this end, we conduct two studies across different ability values: an analysis of the Type I errors of the indexes (the probability of wrongly flagging a response pattern as inappropriate) and an analysis of their detection power. These analyses show that the corrected indexes lz* and W* are generally the most useful, since their scores approximately follow the normal distribution and they detect random responding and inattention well.

Full Text
Peer Reviewed
Computerized adaptive testing with R: Recent updates of the package catR
Magis, David ULg; Barrada, Juan Ramon

in Journal of Statistical Software (in press)

The purpose of this paper is to list the recent updates of the R package catR. This package allows for generating response patterns under a computerized adaptive testing (CAT) framework with underlying item response theory (IRT) models. Among the most important updates, well-known polytomous IRT models are now supported by catR; several item selection rules have been added; and it is now possible to perform post-hoc simulations. Some functions were also rewritten or withdrawn to improve the usefulness and performance of the package.

Full Text
Peer Reviewed
A cross-sectional analysis of developmental trajectories of vocabulary comprehension among children and adolescents with Down syndrome or intellectual disability of undifferentiated aetiology
Facon, Bruno; Courbois, Yannick; Magis, David ULg

in Journal of Intellectual & Developmental Disability (in press)

Background: This work seeks to expand our knowledge of developmental trajectories of subcomponents of the language systems of individuals with intellectual disability (ID). It aims to explore how general and relational vocabularies evolve as a function of cognitive level. Method: Developmental trajectories of general and relational vocabulary comprehension were compared among typically developing children (TYP) and children and adolescents with ID of undifferentiated aetiology (UND) or Down syndrome (DS). Results: Comparisons between TYP and UND participants showed no interaction between cognitive level and diagnostic status for general vocabulary, and only a very weak interaction for relational vocabulary. Comparisons between TYP and DS participants failed to reveal group-specific trajectories. Performance in general vocabulary was higher than in relational vocabulary for both UND and DS participants. Conclusion: The developmental trajectories of vocabulary appear to be globally comparable for participants with or without ID.

Full Text
Peer Reviewed
Passage de l’administration fixe d’un test à une administration adaptative : application au TCALS-II [Moving from fixed to adaptive test administration: an application to the TCALS-II]
Magis, David ULg; Raîche, Gilles

in Raîche, Gilles; Ndinga, Pascal; Meunier, Hélène (Eds.) L'interdisciplinarité de la mesure et de l’évaluation (in press)

The problem of moving from fixed (paper-and-pencil) administration of a test to adaptive administration is studied. A two-step method is presented. First, response patterns are generated under fixed administration in order to determine admissible values of the standard error of the ability estimate. These values are then used as stopping criteria in an adaptive administration of the same test. Test length is then used to evaluate the quality of the adaptive test relative to its fixed version. The college-level English as a second language placement test (TCALS-II) is used as an illustration. It is established that an adaptive administration of the TCALS-II would substantially reduce test length without loss of quality in the estimation of ability levels. However, this improvement is limited to examinees whose ability level is neither too low nor too high.

Peer Reviewed
Les indices de détection de patron de réponse inapproprié de type carré-moyens non-pondérés: à manipuler avec précaution! [Unweighted mean-square person-fit indexes for detecting inappropriate response patterns: handle with care!]
Beland, Sebastien; Raiche, Gilles; Magis, David ULg et al

Conference (2016, November 17)

The unweighted mean square statistic U (Wright & Stone, 1979) and its standardized version ZU (Wright, 1980) are among the best-known parametric person-fit indexes for detecting inappropriate response patterns. One or both of these indexes are available in commercial or free software such as RUMM2030, FACETS or the R package eRm, and they are mentioned in the most cited articles on the topic (Karabatsos, 2003; Meijer & Sijtsma, 2001). Over the past decades, a few voices have been raised to alert users to the problems with U and ZU (Karabatsos, 2000; Smith, 1991). These indexes appear to produce extreme scores that would make their use inadequate in some situations. This presentation aims to study the limits of the U and ZU indexes. We will use a simulation study and several dichotomous item response models to support our argument. The authors will conclude by discussing the importance of handling these indexes with caution.

Full Text
Peer Reviewed
Open source programming: a new hope for psychometric research
Magis, David ULg

Conference (2016, July 14)

Current psychometric research is most often supported by computer software. New research perspectives often imply intensive simulation studies to validate the tested theories or hypotheses, and therefore require accurate, fast and stable implementation. In this regard, open source programming (such as in the R language) is a promising approach allowing for flexible implementation, data generation, replication of studies, and worldwide dissemination. The purpose of this talk is to illustrate how psychometrics and open source programming (with special emphasis on the R language) can interact and contribute to each other, by means of some selected examples. Several topics will be illustrated, among others: why open source programming is (in my opinion) as important as psychometric research; why we need stable and complete implementations of psychometric and statistical routines for research purposes (e.g., for CAT); how accurate implementation of IRT routines can lead to unexpected theoretical results; why (and how) open source software can be valued as research output. Most examples will arise from the CAT framework and the R package catR for simulating CAT patterns.

Peer Reviewed
Examine the effects of two adjustments to the lz statistic
Riley, Barth; Magis, David ULg

Conference (2016, July 14)

Conformity to a known distribution and sensitivity to response aberrance are desirable properties of person-fit statistics. This simulation study examined the joint and independent effects of two adjustments to the standardized log-likelihood statistic (lz): (1) correction of the negatively skewed distribution of lz (Snijders, 2001), and (2) improving the sensitivity of the statistic by employing more accurate estimates of item response probability using symmetric functions (Dimitrov and Smith, 2006). Data were simulated using three test lengths (10, 20, 30 items). Data containing misfitting response patterns were simulated using three aberrant response patterns (cheating, guessing, and inattentiveness), and three levels of aberrance (i.e., proportion of item responses affected by misfit; 10%, 30% and 50%). Data containing no simulated misfitting response patterns were also generated for each test length. Non-misfitting responses were generated using the dichotomous Rasch measurement model. For each combination of independent variables, a dataset was generated consisting of 5,000 simulees. Four fit statistics were compared: lz, lz* (Snijders adjustment), lzSYM (Dimitrov and Smith adjustment), and lzSYM* (both adjustments). Mean Type I error rates were ≤ 0.1 across all conditions. The lz* statistic produced the best control of Type I error, which was often below the nominal Type I error rate, whereas the empirical Type I error rate for the unadjusted lz statistic most closely approximated the nominal rate. In contrast, lzSYM and lzSYM* yielded empirical Type I error rates larger than the nominal rate, with the discrepancy being particularly pronounced as the length of the test decreased. As might be expected, power to detect misfitting response patterns increased with test length and with the percentage of misfitting response patterns in the sample. Both lzSYM and lzSYM* evidenced improved power in detecting misfitting response patterns compared to lz and lz*, particularly for guessing response patterns and/or on shorter (i.e., 10 item) tests.
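
For readers unfamiliar with the unadjusted statistic, here is a minimal Python sketch of lz under the Rasch model: the observed log-likelihood of a response pattern, centered by its expectation and scaled by its standard deviation. It assumes the ability value is known, which is exactly the simplification that the Snijders adjustment addresses; neither adjustment studied above is implemented. The difficulties and response patterns are hypothetical.

```python
import math

def lz_statistic(responses, difficulties, theta):
    """Standardized log-likelihood person-fit statistic (unadjusted lz)."""
    log_lik = expected = variance = 0.0
    for x, b in zip(responses, difficulties):
        p = 1.0 / (1.0 + math.exp(-(theta - b)))
        q = 1.0 - p
        log_lik += x * math.log(p) + (1 - x) * math.log(q)
        expected += p * math.log(p) + q * math.log(q)
        variance += p * q * math.log(p / q) ** 2
    return (log_lik - expected) / math.sqrt(variance)

difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]   # hypothetical items, easy to hard
consistent = [1, 1, 1, 0, 0]                 # easy items right, hard items wrong
aberrant = [0, 0, 1, 1, 1]                   # easy items wrong, hard items right
lz_consistent = lz_statistic(consistent, difficulties, theta=0.0)
lz_aberrant = lz_statistic(aberrant, difficulties, theta=0.0)
print(lz_consistent, lz_aberrant)
```

Large negative values indicate a response pattern that is unlikely under the model, so the reversed pattern receives a much lower lz than the consistent one.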

Full Text
Peer Reviewed
On the use of ROC curves in DIF simulation studies
Magis, David ULg; Tuerlinckx, Francis

Conference (2016, July 14)

Simulation studies are often used to compare methods to detect differential item functioning (DIF). However, comparing the performance of such methods can become complicated when the identification of DIF items relies on statistics based on a pre-defined significance level or on pre-established cutoff values. DIF methods based on conceptually different approaches may therefore become incomparable in terms of summary DIF statistics such as false alarm rate or hit rate. The purpose of this talk is to overcome this analytic issue by introducing receiver operating characteristic (ROC) curves in this context. ROC curves allow for global comparison of methods’ performances by computing pairs of (false alarm, hit) rates and representing them on a common scatter plot. Several summary ROC statistics can be considered for further analysis. The application of the ROC curve methodology, together with its limitations and possible extensions, is illustrated by a simple simulation study that compares three score-based DIF methods (Mantel-Haenszel, standardization and Delta plot).
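
The pairs of (false alarm, hit) rates described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' procedure: it sweeps a cutoff over arbitrary DIF statistics, where larger values are assumed to indicate stronger evidence of DIF, and summarizes the curve by the area under it. The statistics and the true DIF status of each item are invented for the example.

```python
def roc_points(statistics, is_dif):
    """Return (false alarm rate, hit rate) pairs for every possible cutoff."""
    n_dif = sum(1 for d in is_dif if d)
    n_clean = len(is_dif) - n_dif
    points = [(0.0, 0.0)]
    # Lower the cutoff step by step so that more items get flagged each time.
    for cutoff in sorted(set(statistics), reverse=True):
        hits = sum(1 for s, d in zip(statistics, is_dif) if d and s >= cutoff)
        false_alarms = sum(1 for s, d in zip(statistics, is_dif) if not d and s >= cutoff)
        points.append((false_alarms / n_clean, hits / n_dif))
    return points

def area_under_curve(points):
    """Trapezoidal area under the ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Hypothetical DIF statistics for eight simulated items and their true status.
statistics = [0.2, 0.5, 3.0, 3.9, 0.8, 4.5, 2.7, 0.3]
is_dif = [False, False, False, True, False, True, True, False]
points = roc_points(statistics, is_dif)
auc = area_under_curve(points)
print(points, auc)
```

Because the curve is built from the statistics themselves rather than from a fixed significance level or cutoff, methods with different decision rules can be compared on the same plot.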

Peer Reviewed
Computerized adaptive testing and multistage testing with R
Magis, David ULg; Yan, Duanli; von Davier, Alina

Conference (2016, July 11)

The goal of this workshop is to provide a practical (and brief) overview of the theory of computerized adaptive testing (CAT) and multistage testing (MST), and to illustrate the methodologies and applications using the open-source R language and several data examples. The implementations rely on the R packages catR and mstR, which have already been developed or are under development and include some of the newest research algorithms developed by the authors. This workshop will cover several topics: the basics of R, a theoretical overview of CAT and MST, CAT and MST designs, assembly methodologies, the catR and mstR packages, simulations and applications. The intended audience for the workshop is undergraduate and graduate students, faculty, researchers, practitioners at testing institutions, and anyone in psychometrics, measurement, education, psychology and other fields who is interested in computerized adaptive and multistage testing, especially in practical implementations of simulations using R.

Full Text
Finiteness of the weighted likelihood estimator and applications to CAT
Magis, David ULg

Scientific conference (2016, July 07)

The purpose of this talk is to present some recent research on the weighted likelihood estimator (WLE) of ability in item response theory (IRT). This estimator is quite commonly used as an alternative to the usual maximum likelihood and Bayesian estimators. However, the question of whether it provides finite ability estimates was left unsolved and led to some controversy. Recently, Magis and Verhelst (in press) established that the WLE always returns finite values, independently of the IRT model, the number of items, and the item responses. This general result will be briefly outlined. The finiteness of the WLE has a straightforward impact within the field of computerized adaptive testing (CAT). One technical and crucial issue in CAT is to accurately estimate the latent ability at the early stages of the adaptive process, when only a few items are available. Currently, heuristic adjustments are advised to avoid infinite estimates with only a few item responses. This talk will highlight how using the WLE throughout the CAT can be a promising and efficient approach to solving this issue.

Full Text
Computerized Adaptive Testing
Braeken, Johan; Magis, David ULg; Stillwell, David

Scientific conference (2016, April 25)

Why ask a person to answer an item when you know a priori that they won’t be able to solve it? It is a waste of time and resources, and you won’t gain any new information; this is both inefficient and ineffective. In contrast, computerized adaptive testing (CAT) is based on the principle that more information can be gained when one tailors the test towards the level of the person being tested. Computational and statistical techniques from item response theory (IRT) and decision theory are combined to implement a test that can behave interactively during the test process and adapts towards the level of the person being tested. The implementation of such a CAT relies on an iterative sequential algorithm that searches the pool of available items (a so-called item bank) for the optimal item to administer based on the current estimate of the person’s level (and optional external constraints). The subsequent response to this item provides new information to update the person’s proficiency estimate. This selection-responding-updating process continues until specified stopping criteria have been reached. The consequence of such an adaptive test administration is that you get an individualized tailored test that is more efficient and more effective. Because there is less of a mismatch between the level of the test and the level of the test taker, there is a lesser burden for the latter and a higher precision for the former, and this with fewer items than a traditional fixed item-set test format. Furthermore, because it is computerized and sequential, test performance can be continuously monitored and reported directly after test completion. Item response models come into play to ensure comparable scores of these individual tailored tests by putting them on the same measurement scale and to precalibrate the psychometric parameters of the items that are part of the item bank on which the sequential iterative algorithm operates.
The workshop intends to tackle issues encountered during the setup of a computerized adaptive test, starting from the design towards the actual delivery of a CAT.
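
The selection-responding-updating loop described above can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the workshop material and not the catR API: items follow the Rasch model, the next item is the one with maximum Fisher information at the current estimate, ability is updated with an expected a posteriori (EAP) estimate under a standard normal prior, and the test stops at a target standard error or a maximum length. The item bank, `true_theta`, the seed and all function names are hypothetical.

```python
import math
import random

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def eap_estimate(responses, difficulties, grid):
    """Posterior mean and standard deviation of ability on a grid (N(0, 1) prior)."""
    weights = []
    for t in grid:
        w = math.exp(-0.5 * t * t)
        for x, b in zip(responses, difficulties):
            p = rasch_prob(t, b)
            w *= p if x == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(var)

def run_cat(bank, true_theta, se_target=0.5, max_items=20, seed=1):
    """Simulate one adaptive test and return (estimate, standard error, items used)."""
    rng = random.Random(seed)
    grid = [-4.0 + 0.1 * k for k in range(81)]
    available = list(range(len(bank)))
    administered, responses = [], []
    theta, se = 0.0, float("inf")
    while available and len(administered) < max_items and se > se_target:
        # Selection: item with maximum Fisher information p * (1 - p) at theta.
        item = max(available, key=lambda i: rasch_prob(theta, bank[i]) * (1.0 - rasch_prob(theta, bank[i])))
        available.remove(item)
        # Responding: simulate the answer from the (hypothetical) true ability.
        x = 1 if rng.random() < rasch_prob(true_theta, bank[item]) else 0
        administered.append(item)
        responses.append(x)
        # Updating: re-estimate ability from all responses so far.
        theta, se = eap_estimate(responses, [bank[i] for i in administered], grid)
    return theta, se, administered

bank = [-3.0 + 0.2 * k for k in range(31)]   # hypothetical item difficulties
theta_hat, se_hat, items_used = run_cat(bank, true_theta=0.5)
print(theta_hat, se_hat, len(items_used))
```

Operational CATs typically add content constraints and exposure control to the selection step; these are omitted here to keep the loop readable.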

Full Text
Adaptive versus linear testing: selected examples from psychology, education and medicine
Magis, David ULg

Scientific conference (2016, March 08)

Most often, data from psychological, educational or medical research are collected by administering questionnaires to the participants. Such questionnaires are usually made of the same set of items (i.e., questions) and are designed to precisely target the studied latent trait: this is referred to as "linear testing". However, linear testing can be counterproductive in some specific situations. For instance, not all items can accurately target the latent trait (e.g., "easy" items are not informative for "highly able" participants), so that the duration of the test can be needlessly extended. Adaptive testing is an emerging paradigm that aims at selecting and administering each item on the basis of previously selected items and the responses of the participants. The selection of the item is made optimally so that the most informative item for the respondent is chosen. This allows, among other things, better targeting of the true latent trait to be estimated, thereby shortening the test duration. In this talk, linear and adaptive testing will be sketched from an educational testing approach. Then, selected examples from the psychological, educational and medical literature will be briefly reviewed to illustrate the potential usefulness of adaptive testing. Pros and cons of this method will also be outlined.

Peer Reviewed
Computerized adaptive and multi-stage testing with R
Yan, Duanli; Magis, David ULg

Conference (2016, February 19)

Computerized Adaptive Testing (CAT) has greatly improved the accuracy and efficiency of psychological testing for decades. Multistage Testing (MST) has received much attention recently. MST is similar to CAT in that it allows the difficulty of the test to be adapted to the ability level of a test taker. Specifically, in MST, items are interactively selected for each test taker, but rather than selecting individual items, groups of items are selected and the test is built in stages. Over the last decade, researchers have investigated ways for an MST to incorporate most of the advantages of CAT and linear testing while minimizing their disadvantages. These features include testing efficiency and accuracy, greater control of test content, more robust item review, as well as simplified test assembly and administration. Therefore, MST can be an effective compromise between CAT and linear testing, embedding features and benefits from both designs. Thus, MST is becoming of more and more interest to researchers and practitioners as technology advances. This presentation will first provide a general overview of the MST design and its important concepts and processes. It will then present the latest developments on CAT and MST using R, namely the mstR package. The presentation will also illustrate how to simulate MST administrations using the mstR package, and discuss some practical issues and considerations for MST from design to applications.

Full Text
Peer Reviewed
Efficient standard error formulas of ability estimators with dichotomous item response models
Magis, David ULg

in Psychometrika (2016), 81

This paper focuses on the computation of asymptotic standard errors (ASE) of ability estimators with dichotomous item response models. A general framework is considered and ability estimators are defined from a very restricted set of assumptions and formulas. This approach encompasses most standard methods such as maximum likelihood, weighted likelihood, maximum a posteriori and robust estimators. A general formula for the ASE is derived from the theory of M-estimation. Well-known results are recovered as particular cases for the maximum likelihood and robust estimators, while new ASE proposals for the weighted likelihood and maximum a posteriori estimators are presented. These new formulas are compared to traditional ones by means of a simulation study under Rasch modeling.
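
As background for the comparison mentioned above, the Python sketch below computes two traditional ASE formulas under the Rasch model: the inverse square root of the test information for the maximum likelihood estimator, and the same quantity with the prior precision added for the maximum a posteriori estimator with a normal prior. These are the textbook formulas only; the new proposals derived in the paper are not reproduced here, and the item difficulties and function names are assumptions.

```python
import math

def rasch_information(theta, difficulties):
    """Test information at theta under the Rasch model: sum of p * (1 - p)."""
    total = 0.0
    for b in difficulties:
        p = 1.0 / (1.0 + math.exp(-(theta - b)))
        total += p * (1.0 - p)
    return total

def ase_ml(theta, difficulties):
    """Traditional ASE of the maximum likelihood estimator."""
    return 1.0 / math.sqrt(rasch_information(theta, difficulties))

def ase_map(theta, difficulties, prior_sd=1.0):
    """Traditional ASE of the MAP estimator with a normal prior of given SD."""
    precision = rasch_information(theta, difficulties) + 1.0 / prior_sd ** 2
    return 1.0 / math.sqrt(precision)

difficulties = [0.0, 0.0, 0.0, 0.0]   # hypothetical items located at theta = 0
se_ml = ase_ml(0.0, difficulties)
se_map = ase_map(0.0, difficulties)
print(se_ml, se_map)
```

With four items located exactly at the ability level, the information is 4 × 0.25 = 1, so the maximum likelihood ASE equals 1 and the prior shrinks the MAP value below it.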

Full Text
Peer Reviewed
Étude comparative de nouveaux indices de détection de la réponse qui s’apparentent au hasard et à l’inattention [A comparative study of new indices for detecting responses that resemble random responding and inattention]
Béland, Sébastien; Raîche, Gilles; Magis, David ULg et al

Conference (2015, November 19)

Some students may respond at random or be inattentive in a testing situation. Several approaches have already been developed to detect this type of response (Zickar and Drasgow, 1996). Among these, the use of person-fit indices to detect inappropriate response patterns is the most studied and seemingly the most promising approach (Karabatsos, 2003; Meijer and Sijtsma, 2001). In this study, we focus on three popular person-fit indices whose characteristics make them easy to interpret: lz (Drasgow, Levine and Williams, 1985), ZU (Wright and Masters, 1982) and ZW (Wright and Masters, 1982). However, all three have been shown to be strongly affected by the fact that a student's ability is estimated rather than known (Li and Olejnik, 1997; Molenaar and Hoijtink, 1990). This is why Snijders (2001) proposed a corrected version of the lz index (called lz*) that takes this important problem into account. We have already applied Snijders' correction to the U and W indices, creating the ZU* and ZW* indices (Magis, Béland and Raîche, 2014). The aim of this study is to examine the behavior of the corrected indices lz*, ZU* and ZW* and of their standardized versions. To do so, we carry out three different studies: a descriptive analysis of the index scores, an analysis of Type I errors, and an analysis of detection power. The analyses showed that the corrected indices lz* and ZW* are the most useful, since their scores approximately follow the N(0,1) distribution and they detect responses resembling random responding and inattention well.

Full Text
Receiver operating characteristic (ROC) curves and their use in psychometric simulation studies
Magis, David ULg

Scientific conference (2015, October 27)

Simulation studies are commonly used in psychometric research to compare existing methods or to show that a newly developed approach outperforms standard techniques. In several specific situations, the output of performance evaluations can be summarized by pairs of statistics such as false alarm and hit rates (or Type I error and power). Adequate analysis of these rates, however, is often subject to discussion. The purpose of this ongoing work (jointly with Francis Tuerlinckx) is to advocate the usefulness of receiver operating characteristic (ROC) curves to analyze the output of simulation studies in terms of pairs of summary statistics. Two particular psychometric applications will be considered and illustrated: differential item functioning (DIF) and person fit identification. By means of simple examples, ROC curves will be shown to be efficient in capturing more output than standard analyses, thus allowing for a more refined and precise discussion of the study results. Limitations and other extensions will also be outlined.

Full Text
Peer Reviewed
Efficient standard error formulas of ability estimators in item response theory
Magis, David ULg

Conference (2015, October 15)

This talk focuses on the computation of asymptotic standard errors (ASE) of ability estimators with dichotomous item response models. A general framework is considered, and ability estimators are defined from a very restricted set of assumptions and formulas. This approach encompasses most standard methods such as maximum likelihood, weighted likelihood, maximum a posteriori, and robust estimators. A general formula for the ASE is derived from the theory of M-estimation. Well-known results are recovered as particular cases for the maximum likelihood and robust estimators, while new ASE proposals for the weighted likelihood and maximum a posteriori estimators are presented. These new formulas are compared to traditional ones by means of a simulation study under Rasch modeling.
