References of "Magis, David"
Peer Reviewed
Étude comparative de nouveaux indices de détection de la réponse qui s’apparentent au hasard et à l’inattention [Comparative study of new indices for detecting responses that resemble random responding and inattention]
Béland, Sébastien; Raîche, Gilles; Magis, David ULg et al.

Conference (2015, November 19)

Some students may respond randomly or be inattentive in a testing situation. Several approaches have already been developed to detect this type of responding (Zickar and Drasgow, 1996). Among these, the use of person-fit indices, which detect inappropriate response patterns, is the most studied approach and appears to be the most promising (Karabatsos, 2003; Meijer and Sijtsma, 2001). In this study, we focus on three popular detection indices whose characteristics make them easier to interpret: lz (Drasgow, Levine and Williams, 1985), ZU (Wright and Masters, 1982) and ZW (Wright and Masters, 1982). However, all of them have been shown to be strongly affected by the fact that a student's ability is estimated rather than known (Li and Olejnik, 1997; Molenaar and Hoijtink, 1990). This is why Snijders (2001) proposed a corrected version of the lz index (named lz*) that takes this important problem into account. We have already applied Snijders' correction to the U and W indices, creating the ZU* and ZW* indices (Magis, Béland and Raîche, 2014). The objective of this study is to examine the behavior of the corrected indices lz*, ZU* and ZW* and of their standardized versions. To do so, we carry out three different studies: a descriptive analysis of the index scores, an analysis of Type I errors, and an analysis of their detection power. The analyses showed that the corrected indices lz* and ZW* are the most attractive to use, since their scores approximately follow the N(0,1) distribution and since they are effective at detecting responses that resemble random responding and inattention.

Receiver operating characteristic (ROC) curves and their use in psychometric simulation studies
Magis, David ULg

Scientific conference (2015, October 27)

Simulation studies are commonly used in psychometric research to compare existing methods or to show that a newly developed approach outperforms standard techniques. In several specific situations, the output of performance evaluations can be summarized by pairs of statistics such as false alarm and hit rates (or Type I error and power). Adequate analysis of these rates, however, is often subject to discussion. The purpose of this ongoing work (jointly with Francis Tuerlinckx) is to advocate the usefulness of receiver operating characteristic (ROC) curves to analyze the output of simulation studies in terms of pairs of summary statistics. Two particular psychometric applications will be considered and illustrated: differential item functioning (DIF) and person fit identification. By means of simple examples, ROC curves will be shown to be efficient in capturing more output than standard analyses, thus allowing for a more refined and precise discussion of the study results. Limitations and other extensions will also be outlined.

Peer Reviewed
Efficient standard error formulas of ability estimators in item response theory
Magis, David ULg

Conference (2015, October 15)

This talk focuses on the computation of asymptotic standard errors (ASE) of ability estimators with dichotomous item response models. A general framework is considered, and ability estimators are defined from a very restricted set of assumptions and formulas. This approach encompasses most standard methods such as maximum likelihood, weighted likelihood, maximum a posteriori, and robust estimators. A general formula for the ASE is derived from the theory of M-estimation. Well-known results are recovered as particular cases for the maximum likelihood and robust estimators, while new ASE proposals for the weighted likelihood and maximum a posteriori estimators are presented. These new formulas are compared to traditional ones by means of a simulation study under Rasch modeling.

Peer Reviewed
Empirical comparison of scoring rules at early stages of CAT
Magis, David ULg

Conference (2015, September 15)

Usual scoring rules in CATs include maximum likelihood (ML), weighted likelihood (WL) and Bayesian approaches. However, at early stages of adaptive testing, only a few item responses are available, so the amount of information is very limited. In addition, constant patterns (i.e. only correct or only incorrect responses) are often observed, making ML scoring intractable. Specific scoring rules (such as fixed- or variable-stepsize adjustments) were developed for that purpose. However, recent research highlighted that both Bayesian and WL scoring rules may provide finite values even with small sets of items. The purpose of this presentation is twofold: (a) to make a quick review of available scoring rules at early stages of CAT, and (b) to present empirical results from a simulation study that compares those scoring rules. More precisely, three scoring scenarios will be investigated: stepsize adjustment followed by ML, Bayes or WL followed by ML, and a constant scoring rule throughout the CAT. These methods will be compared by means of simulated item banks and under various CAT scenarios for next item selection and stopping rules. Empirical results will be presented and practical guidelines for early stage scoring will be outlined.

Peer Reviewed
From psychometric research to implementation and back: selected examples
Magis, David ULg

Conference (2015, February 13)

Current psychometric research is most often supported by computer software. New research perspectives often imply intensive simulation studies to validate the tested theories or hypotheses, and therefore require accurate implementation, e.g., as R packages. However, it may happen that unexpected psychometric phenomena are detected almost accidentally, through such implementations designed for entirely different purposes. This talk will illustrate this phenomenon by means of two recent examples from item response theory (IRT) with polytomous models: (a) the equivalence between the weighted likelihood estimator (WLE) and the Bayes modal estimator with Jeffreys prior, and (b) the relationships between observed and expected information functions. Rather than focusing on the technical details, the purpose of this talk is to highlight how the results were identified first through R implementation, then confirmed by theoretical derivations. The talk concludes by advocating for flexible and stable open-source implementations (such as R packages) to support current and ongoing psychometric research.

Peer Reviewed
Le testing adaptatif informatisé : une brève introduction [Computerized adaptive testing: a brief introduction]
Magis, David ULg

Conference (2015, January 27)

The purpose of this talk is to present the main principles and concepts of computerized adaptive testing (CAT). The topics covered are: CAT versus fixed (paper-and-pencil) tests, the general principles of CAT (item bank, provisional and final ability estimation, selection of the items to administer, stopping rules), and the principles specific to CAT (item exposure control, content balancing). The talk is intended to be didactic and general, so as to sketch the outlines of CAT. It concludes with a state of the art of current research on CAT.

Introduction to item response theory (IRT) and computerized adaptive testing (CAT) with the R software
Magis, David ULg

Scientific conference (2015, January 13)

Item response theory (IRT) has become an important field of research for psychology and educational assessment. Recently, with the increase of computational power, several IRT-related topics have emerged, among them computerized adaptive testing (CAT). The main aim of CAT is to provide a framework for individualized assessment by means of optimal item selection and administration to the test takers. CAT has several advantages over linear (non-adaptive) testing: individualized assessment, limited risk of cheating or fraud, shorter tests providing the same amount of information as longer linear tests, and automatic scoring and reporting at the end of the test. Practical use of CAT, however, remains limited so far due to several factors (lack of available large item banks, content validity and security, lack of suitable software for practical CAT assessment, ethical issues in administering different tests to estimate the same ability, etc.). The purpose of this workshop is threefold: (a) to provide a general overview of IRT and CAT, (b) to introduce the R software in a user-oriented way, as well as several IRT tools (including the package catR for CAT simulations), and (c) to perform practical training sessions with the participants. The workshop will be a mix of oral presentations, demonstrations related to the R software, and practical sessions where participants will be invited to train with R and catR. The R software is an open-source platform for statistical inference and testing, graphical display and data visualization. It also holds several add-on packages for specific IRT purposes (item calibration, ability estimation, multidimensional scaling, equating, differential item functioning, etc.). The R community is worldwide and freely shares R packages through CRAN (the Comprehensive R Archive Network). In this workshop, the R package catR will be examined and used in the practical sessions.

Peer Reviewed
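Repérer les enfants à risque de développer un trouble langagier en moins de 5 questions : mise au point d'un outil de dépistage rapide destiné aux enfants de 12 à 24 mois [Identifying children at risk of developing a language disorder in fewer than 5 questions: development of a rapid screening tool for children aged 12 to 24 months]
Leclercq, Anne-Lise ULg; Kern, Sophie; Magis, David ULg et al.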

in ANAE : Approche Neuropsychologique des Apprentissages chez l'Enfant (2015), 135

Early screening of language delay is a challenge, especially in multilingual populations. The present research has led to the development of a rapid, large-scale screening tool for the Belgian Birth and Childhood Agency, based on the performance on communicative development inventories of 683 children aged 12 to 24 months.

Peer Reviewed
Audiometric results after stapedotomy operations in patients with otosclerosis and preoperative small air-bone gaps
SALMON, Caroline ULg; BARRIAT, Sébastien ULg; DEMANEZ, Laurent CH P ULg et al.

in Audiology & Neuro-otology (2015), 20

Objectives: The efficacy of stapedotomies performed on patients with small air-bone gaps (<25 dB, sABG) was compared to the efficacy of the operation in patients who had otosclerosis with high air-bone gaps (≥25 dB, hABG). Methods: This retrospective study evaluates the short-term postoperative air and bone conduction thresholds and air-bone gaps after 181 CO2 laser stapedotomies. Results: A significantly smaller air-bone gap (ABG) and lower air conduction thresholds after surgery were observed in the group of patients who underwent surgery with preoperative ABGs of less than 25 dB. Bone conduction thresholds improved in the sABG group after surgery. Conclusions: The results after stapedotomies are good even if the preoperative air-bone gap is small, and the overall risk of hearing deterioration due to stapes surgery remains low.

Peer Reviewed
Layman versus Professional Musician: Who Makes the Better Judge?
Larrouy, Pauline ULg; Magis, David ULg; Grabenhorst, Matthias et al.

in PLoS ONE (2015)

The increasing number of casting shows and talent contests in the media over the past years suggests a public interest in rating the quality of vocal performances. In many of these formats, laymen alongside music experts act as judges. Whereas experts' judgments are considered objective and reliable when it comes to evaluating singing voice, little is known about laymen's ability to evaluate peers. On the one hand, layman listeners, who by definition did not have any formal training or regular musical practice, are known to have internalized the musical rules on which singing accuracy is based. On the other hand, layman listeners' judgment of their own vocal skills is highly inaccurate. Also, when compared with that of music experts, their level of competence in pitch perception has proven limited. The present study investigates laypersons' ability to objectively evaluate melodies performed by untrained singers. For this purpose, layman listeners were asked to judge sung melodies. The results were compared with those of music experts who had performed the same task in a previous study. Interestingly, the findings show a high objectivity and reliability in layman listeners. Whereas both the laymen's and experts' definition of pitch accuracy overlap, differences regarding the musical criteria employed in the rating task were evident. The findings suggest that the effect of expertise is circumscribed and limited and support the view that laypersons make trustworthy judges when evaluating the pitch accuracy of untrained singers.

Peer Reviewed
L’utilisation du facteur de Bayes pour identifier les étudiants qui répondent au hasard [Using the Bayes factor to identify students who respond randomly]
Béland, Sébastien; Raîche, Gilles; Magis, David ULg

in Revue des Sciences de l'Education (2015), 41

Methods for detecting random responding in learning assessment have some limitations. For example, person-fit indices, which detect inappropriate response patterns, generally require large data sets and can only indicate whether or not a student responds in accordance with a measurement model (for example, the Rasch model). In this article, we present a new approach for identifying students who respond randomly in learning assessment tests. After discussing the limitations of the main existing approaches, we set out the technical details of using the Bayes factor to evaluate a finite number of informative hypotheses. We then apply the Bayes factor to simulated data and to real data, for illustration purposes. The results show that the Bayes factor is a promising method for detecting random response behavior.

Peer Reviewed
Computerized adaptive testing
Magis, David ULg; Mahalingam, Vaishali

in Da Silva, Marjorie Cristina Rocha; Bartholomeu, Daniel; Vendramini, Claudette Maria Medeiros et al. (Eds.), Aplicações de Métodos Estatísticos Avançados à Avaliação Psicológica e Educacional (2015)

Peer Reviewed
Computer adaptive testing using Concerto
Mahalingam, Vaishali; Magis, David ULg

in Vendramini, Claudette Maria Medeiros; Montiel, José Maria; Da Silva, Marjorie Cristina Rocha et al. (Eds.), Aplicações de Métodos Estatísticos Avançados à Avaliação Psicológica e Educacional (2015)

Peer Reviewed
Detection of differential item functioning using the lasso approach
Magis, David ULg; Tuerlinckx, Francis; De Boeck, Paul

in Journal of Educational & Behavioral Statistics (2015), 40

This paper proposes a novel approach to detect differential item functioning (DIF) among dichotomously scored items. Unlike standard DIF methods that perform an item-by-item analysis, we propose the “LR lasso DIF” method: a logistic regression (LR) model is formulated for all item responses. The model contains item-specific intercepts, an effect of the sum score and item-group interaction (i.e. DIF) effects, with a lasso penalty on all DIF parameters. Optimal penalty parameter selection is investigated through several known information criteria (AIC, BIC and cross-validation) as well as through a newly developed alternative. A simulation study was conducted to compare the global performance of the suggested “LR lasso DIF” method to the logistic regression and Mantel-Haenszel methods (in terms of false alarm and hit rates). It is concluded that for small samples the LR lasso DIF approach globally outperforms the logistic regression method, and also the Mantel-Haenszel method, especially in the presence of item impact, while it yields similar results with larger samples.

Peer Reviewed
A note on weighted likelihood and Jeffreys modal estimation of proficiency levels in polytomous item response models
Magis, David ULg

in Psychometrika (2015), 80

Warm (1989) established the equivalence between the so-called Jeffreys modal and the weighted likelihood estimators of proficiency level with some dichotomous item response models. The purpose of this note is to extend this result to polytomous item response models. First, a general condition is derived to ensure the perfect equivalence between these two estimators. Second, it is shown that this condition is fulfilled by two broad classes of polytomous models including, among others, the partial credit, rating scale, graded response and nominal response models.

Peer Reviewed
A note on the equivalence between observed and expected information functions with polytomous IRT models
Magis, David ULg

in Journal of Educational & Behavioral Statistics (2015), 40

The purpose of this note is to study the equivalence of observed and expected (Fisher) information functions with polytomous item response theory (IRT) models. It is established that observed and expected information functions are equivalent for the class of divide-by-total models (including partial credit, generalized partial credit, rating scale and nominal response models), but not for the class of difference models (including the graded response and modified graded response models). Yet the observed information function remains positive in both classes. Straightforward connections with dichotomous IRT models and further implications are outlined.

Psychometrics and bibliometrics: Overview of future possible interactions
Magis, David ULg

Scientific conference (2014, December 02)

Bibliometrics is a growing research field with the main aim of collecting bibliometric information (citations, publications, awards...) to infer valuable rankings or to predict future academic performance. However, most bibliometric approaches are based either on pairwise comparisons of "objects" (authors, funding applications, academics...) with respect to selected "criteria" (publications, citation rate, h-index...), or on modeling observable academic outcomes using such observed criteria. The purpose of this talk is twofold. First, an improvement of the pairwise comparison approach will be outlined on the basis of so-called multicriteria decision aid (MCDA) routines, which will imply an increased flexibility and added freedom for the decision process. Second, we will start a deeper reflection on possible connections between bibliometrics and psychometrics. Among others, the possibility of modeling latent "academic ability" as outcome of interest, given manifest criteria, will be put to an open discussion and exchange with the listeners.

Peer Reviewed
The sentence repetition task: A powerful diagnostic tool for French children with specific language impairment
Leclercq, Anne-Lise ULg; Quémart, Pauline; Magis, David ULg et al.

in Research in Developmental Disabilities (2014), 35

This study assesses the diagnostic accuracy and construct validity of a sentence repetition task that is commonly used for the identification of French children with specific language impairment (SLI). Thirty-four school-aged children with a confirmed, diagnostically based diagnosis of SLI, and 34 control children matched on age and nonverbal abilities performed the sentence repetition task. Two general scoring measures took into account the verbatim repetition of the sentence and the number of words accurately repeated. Moreover, five other scoring measures were applied to their answers in order to separately take into account their respect of lexical items, functional items, syntax, verb morphology, and the general meaning of the sentence. Results show good to high levels of sensitivity and specificity at the three cut-off points for all scoring measures. A principal component analysis revealed two factors. Scoring measures for the respect of functional words, syntax and verb morphology provided the largest loadings to the first factor, while scoring measures for the respect of lexical words and general semantics provided the largest loadings to the second factor. Sentence repetition appears to be a valuable tool to identify SLI in French children, and the ability to repeat sentences correctly is supported by two factors: a morphosyntactic factor and a lexical factor.

A lasso penalization approach to differential item functioning
Magis, David ULg

Conference (2014, November 24)

Peer Reviewed
Les habitudes de sommeil chez l'enfant: indices de psychopathologie? [Sleep habits in children: indicators of psychopathology?]
SCHOLL, Jean-Marc ULg; PHILIPPE, Paule ULg; Magis, David ULg

Poster (2014, November 21)

Objective: To investigate children's sleep-onset habits as a function of age and of the presence or absence of psychopathology, in order to test two hypotheses: (1) the developmental course of sleep habits is different and slower in children with a psychopathology than in "typical" children; (2) difficulties falling asleep are more frequent in children with a psychopathology. Methods: Fifteen simple questions about sleep habits were put to the parents of two groups of children aged 2.6 to 13 years: 827 "typical" children and 298 "atypical" children followed in outpatient psychological consultations. The statistical analysis of the data allows a developmental study that compares sleep habits between the two groups, as well as their change with the child's age within each group. Percentile curves for each nominal response (always, often, sometimes, rarely, never) were computed as a function of age and group. Results: Of the analyses of the responses to the 15 questions, 14 show statistically significant differences between the two groups (late bedtime / time taken to fall asleep / calls out, leaves the bedroom, seeks company / plays in the bedroom before sleeping / expresses the wish to fall asleep with a brother, a sister, a parent...), and 12 show significant variation in the responses with the child's age. For 5 questions, the results show that the effect of age differs between the groups, whereas for 2 questions the effect of age is identical in the two groups. Conclusion: The results obtained validate our two hypotheses in a highly significant way: (1) the developmental course of sleep habits is different and slower in children with a psychopathology; (2) difficulties falling asleep are more frequent in this same group of children.
We can conclude that investigating children's sleep habits by means of simple questions can provide indicators of psychopathology and proves to be of great value in child psychiatric practice.
