References of "Magis, David"
Full Text
Peer Reviewed
A note on weighted likelihood and Jeffreys modal estimation of proficiency levels in polytomous item response models
Magis, David ULg

in Psychometrika (in press)

Warm (1989) established the equivalence between the so-called Jeffreys modal and weighted likelihood estimators of proficiency level under some dichotomous item response models. The purpose of this note is to extend this result to polytomous item response models. First, a general condition is derived that ensures the exact equivalence of the two estimators. Second, it is shown that this condition is fulfilled by two broad classes of polytomous models including, among others, the partial credit, rating scale, graded response and nominal response models.

Full Text
Peer Reviewed
Accuracy of asymptotic standard errors of the maximum and weighted likelihood estimators of proficiency levels with short tests
Magis, David ULg

in Applied Psychological Measurement (in press)

The maximum likelihood (ML) and weighted likelihood (WL) estimators are commonly used to obtain proficiency level estimates with pre-calibrated item parameters. Both estimators have the same asymptotic standard error (ASE), which is easily derived from the expected information function of the test. However, the accuracy of this asymptotic formula is uncertain for short tests, when only a few items are administered. The purpose of this paper is to compare the ASE of these estimators to their exact values, evaluated at the proficiency level estimates. The exact SE is computed by generating the full exact sampling distribution of the estimators, so its practical feasibility is limited to short tests (except under the Rasch model). A simulation study was conducted to compare the ASE and the exact SE of the ML and WL estimators to the “true” SE (i.e., the exact SE computed at the true proficiency levels). It is concluded that, with short tests, the exact SEs are less biased and return smaller root mean squared error values than the asymptotic SEs while, as expected, both SE estimators return similar results with longer tests.

Full Text
Peer Reviewed
The effect of melody and technique on the singing voice accuracy of trained singers
Larrouy, Pauline ULg; Magis, David ULg; Morsomme, Dominique ULg

in Logopedics, Phoniatrics, Vocology (in press)

A previous study highlighted the effect of vocal technique on the singing voice accuracy of trained singers (1). The precision of the intervals between the notes of the tune was altered when the singers used the Western operatic singing technique. To better understand these results, we recorded two different melodies sung with two different vocal techniques. A large panel of trained singers (N = 50) participated in the study. The analytical method described in the reference paper (1) was applied. The results confirm the effect of vocal technique on the vocal accuracy of trained singers. In addition, they provide an answer about the melodic effect and guide future work on the perception of operatic voices.

Full Text
Peer Reviewed
Snijders’ correction of Infit and Outfit indexes with estimated ability level: an analysis with the Rasch model
Magis, David ULg; Béland, Sébastien; Raîche, Gilles

in Journal of Applied Measurement (in press)

The Infit mean square W and the Outfit mean square U are commonly used person fit indexes under Rasch measurement. However, they suffer from two major weaknesses. First, their asymptotic distribution is usually derived by assuming that the true ability levels are known. Second, such distributions are not even clearly stated for indexes U and W. Both issues can seriously affect the selection of an appropriate cut-score for person fit identification. Snijders (2001) proposed a general approach to correct some person fit indexes when specific ability estimators are used. The purpose of this paper is to adapt this approach to the U and W indexes. First, a brief sketch of the methodology and its application to U and W is given. Then, the corrected indexes are compared to their classical versions through a simulation study. The suggested correction keeps Type I error rates under control, avoiding both conservatism and inflation, while the power to detect specific misfitting response patterns increases significantly.
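
For readers unfamiliar with the two indexes, the sketch below computes their classical (uncorrected) forms under the Rasch model, using the usual definitions: U is the unweighted mean of squared standardized residuals and W is the information-weighted mean square. The function names, difficulties and response patterns are illustrative assumptions; Snijders' correction itself is not reproduced here.

```python
import math

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def infit_outfit(responses, theta, difficulties):
    """Return (W, U): infit and outfit mean squares for one response pattern."""
    probs = [rasch_prob(theta, b) for b in difficulties]
    variances = [p * (1.0 - p) for p in probs]
    sq_resid = [(x - p) ** 2 for x, p in zip(responses, probs)]
    # Outfit U: unweighted mean of squared standardized residuals.
    outfit = sum(r / v for r, v in zip(sq_resid, variances)) / len(responses)
    # Infit W: squared residuals weighted by the item variances.
    infit = sum(sq_resid) / sum(variances)
    return infit, outfit

# Hypothetical five-item test and an ability level treated as known.
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]
consistent = [1, 1, 1, 0, 0]  # easy items right, hard items wrong
aberrant = [0, 0, 1, 1, 1]    # easy items wrong, hard items right

w_ok, u_ok = infit_outfit(consistent, 0.0, difficulties)
w_bad, u_bad = infit_outfit(aberrant, 0.0, difficulties)
print(w_ok, u_ok, w_bad, u_bad)
```

The aberrant pattern produces larger values of both indexes than the consistent one. In practice the ability level is estimated rather than known, which is the situation the paper's correction addresses.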

Full Text
Peer Reviewed
Estimation des paramètres d’item et de sujet à partir du modèle de Rasch : une étude comparative des logiciels BILOG-MG, ICL et R
Béland, Sébastien; Magis, David ULg; Raîche, Gilles

in Mesure et Evaluation en Education [=MEE] (in press)

Item response theory (IRT) is a class of measurement models widely used in education. To date, many software packages, such as BILOG-MG, are available to estimate item and person parameters. Among them, ICL and R should not be overlooked: both are free and support a wide range of analyses. This study aims to compare the quality of parameter estimation under one IRT model, the Rasch model. To do so, we compare the estimators of the difficulty and person parameters across three programs: BILOG-MG, ICL, and the ltm package available in R. We first carry out a computer simulation study and then analyze an English as a second language placement test. The results show that the programs studied yield similar parameter estimates, the main difference between them being the execution time of the estimation procedures.

Full Text
Peer Reviewed
Passage de l’administration fixe d’un test à une administration adaptative : application au TCALS-II
Magis, David ULg; Raîche, Gilles

in Raîche, Gilles; Ndinga, Pascal; Meunier, Hélène (Eds.) L'interdisciplinarité de la mesure et de l’évaluation (in press)

The transition from fixed (paper-and-pencil) administration of a test to adaptive administration is studied. A two-step method is presented. First, response patterns are generated under fixed administration in order to determine admissible values of the standard error of the ability estimate. These values are then used as stopping criteria in an adaptive administration of the same test. Test length is then used to evaluate the adaptive test against its fixed version. The college-level English as a second language placement test (TCALS-II) serves as an illustration. It is established that an adaptive administration of the TCALS-II would substantially reduce test length without loss of quality in the estimation of ability levels. However, this improvement is limited to examinees whose ability level is neither too low nor too high.

Peer Reviewed
A small overview of available computer software to support computerized adaptive testing
Magis, David ULg

Conference (2013, August 27)

Computerized adaptive testing (CAT) is becoming a central tool for testing and assessment. It offers many advantages over fixed (“paper-and-pencil”) methods, such as individualized assessment, reduction of fraud, and straightforward estimation of proficiency levels. CAT has been studied for decades and remains an active research field in psychometrics and educational science. Practical CAT administration, however, is less frequently considered in such studies. Administering a CAT to respondents requires both a sufficient number of computers and powerful, easy-to-use CAT software. With the fast increase of computer resources at moderate cost, the availability of computers is becoming a less central, yet still important, issue in the practical administration of CAT tests. The choice of suitable CAT software, on the other hand, should be guided by its flexibility, its underlying statistical modeling, and its user-friendliness. Depending on the type of research or data analysis, one CAT program might be preferred to another. It is therefore important for the researcher or the clinician to know which software is currently available, in line with current research and practice in the CAT framework. Moreover, such software should be flexible enough to incorporate updates and new theoretical developments, such as new rules for next item selection.

This talk offers a simple, user-oriented presentation of several CAT programs that are currently available: the Firestar software (Choi, 2009), the R package catR (Magis & Raîche, 2012), the R package catIrt (Nydick, 2012) and the CAT web platform Concerto (Kosinski & Rust, 2011). The first three are non-commercial software, while Concerto is a web interface between end users (willing to develop computerized assessment tests) and catR (as the underlying routine software). Both R packages are written to be most useful for researchers, without an end-user interface, and are therefore less appealing for applied researchers who are not familiar with R. Yet they offer flexible solutions through many options to optimize the design of the test and to generate many response patterns for further analyses. They can also easily be integrated as sub-routines in more sophisticated CAT software. Firestar provides a user interface and performs all necessary computations with underlying R code.

This talk focuses on freely available CAT software. For this reason, only the four aforementioned programs will be presented, although other, commercial CAT software exists, such as the CATSim software (Assessment Systems Corporation, 2012). The different CAT programs are briefly presented and their advantages and drawbacks, flexibility and usefulness are compared, mostly from the point of view of the applied researcher and clinician. The following criteria were retained for objective comparison: (a) their main goal of application; (b) the type of data and IRT modeling they can deal with; (c) the type of users they target; (d) their operating options; (e) their availability and flexibility for further improvements. A small demonstration of the R package catR will optionally be given, time permitting.

References
Assessment Systems Corporation (2012). CATSim: Comprehensive simulation of computerized adaptive testing. St. Paul, MN. URL: http://www.assess.com/.
Choi, S. W. (2009). Firestar: Computerized adaptive testing simulation program for polytomous item response theory models. Applied Psychological Measurement, 33, 644-645.
Kosinski, M., & Rust, J. (2011). The development of Concerto: An open source online adaptive testing platform. Paper presented at the International Association for Computerized and Adaptive Testing (IACAT), Pacific Grove, CA.
Magis, D., & Raîche, G. (2012). Random generation of response patterns under computerized adaptive testing with the R package catR. Journal of Statistical Software, 48, 1-31.
Nydick, S. W. (2012). catIrt: An R package for simulating IRT-based computerized adaptive tests. R package version 0.3-0.

Peer Reviewed
Equivalence of weighted likelihood and Jeffreys modal estimation of proficiency under polytomous item response models
Magis, David ULg

Conference (2013, July 23)

This talk focuses on two proficiency level estimators in the item response theory (IRT) framework: the weighted likelihood estimator (WLE) and the Jeffreys modal estimator (JME), that is, the usual Bayes modal estimator with Jeffreys’ non-informative prior. With dichotomously scored items, the WLE and the JME are completely equivalent under the two-parameter logistic model, while remarkable relationships were established under the three-parameter logistic model. The purpose of this talk is to extend this comparison to polytomously scored items. It is shown that the WLE and the JME are also equivalent for two broad classes of polytomous IRT models, including, among others, the (modified) graded response model, the (generalized) partial credit model, the rating scale model and the nominal response model. Parallels with dichotomously scored items are drawn. An example from a real data set is used to illustrate this finding.
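
The dichotomous equivalence recalled in this abstract can be checked numerically. The sketch below, a minimal illustration under the two-parameter logistic model only, solves Warm's weighted likelihood equation and the first-order condition of the posterior under Jeffreys' prior (proportional to the square root of the test information) and compares the two roots. The item parameters, the response pattern and all function names are hypothetical, and a simple bisection on [-6, 6] is assumed to be adequate for this pattern.

```python
import math

def p2pl(theta, a, b):
    """2PL response probability with discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def score(theta, x, items):
    """Derivative of the log-likelihood: sum of a * (x - P)."""
    return sum(a * (xi - p2pl(theta, a, b)) for xi, (a, b) in zip(x, items))

def information(theta, items):
    """Test information: sum of a^2 * P * (1 - P)."""
    total = 0.0
    for a, b in items:
        p = p2pl(theta, a, b)
        total += a * a * p * (1.0 - p)
    return total

def warm_j(theta, items):
    """Warm's J term, which equals sum of a^3 * P * (1 - P) * (1 - 2P) under the 2PL."""
    total = 0.0
    for a, b in items:
        p = p2pl(theta, a, b)
        total += a ** 3 * p * (1.0 - p) * (1.0 - 2.0 * p)
    return total

def wle_equation(theta, x, items):
    """Weighted likelihood equation: score + J / (2 I)."""
    return score(theta, x, items) + warm_j(theta, items) / (2.0 * information(theta, items))

def jme_equation(theta, x, items, h=1e-5):
    """Derivative of log L + 0.5 * log I, with d(log I)/d(theta) by central difference."""
    d_log_info = (math.log(information(theta + h, items))
                  - math.log(information(theta - h, items))) / (2.0 * h)
    return score(theta, x, items) + 0.5 * d_log_info

def solve(f, lo=-6.0, hi=6.0, iterations=80):
    """Bisection, assuming f is positive at lo and negative at hi."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical (a, b) item parameters and one response pattern.
items = [(1.0, -1.0), (1.5, 0.0), (0.8, 0.5), (1.2, 1.0)]
x = [1, 1, 0, 0]

wle = solve(lambda t: wle_equation(t, x, items))
jme = solve(lambda t: jme_equation(t, x, items))
print(wle, jme)
```

The two estimates agree up to numerical error because, under the 2PL, Warm's J term is the derivative of the test information, so both estimating equations coincide.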

Peer Reviewed
Proposition de nouveaux indices de détection de patrons de réponses inappropriés dans le contexte des enquêtes et des épreuves d’évaluation des apprentissages
Béland, Sébastien; Raîche, Gilles; Magis, David ULg

Conference (2013, June 07)

It is not uncommon for students to respond inappropriately to an assessment made up of selected-response items. For example, some individuals may cheat, while others may deliberately try to be placed below their level on an exam. Several approaches have been developed to detect such students. To date, the most promising approach is the use of person-fit detection indexes (Meijer and Sijtsma, 2001). The lz index (Drasgow, Levine and Williams, 1985) is most probably the most widely used and best known of them. Unfortunately, this index is strongly affected by the fact that students' ability is estimated rather than known, which can bias its computation (Molenaar and Hoijtink, 1990). To overcome this problem, Snijders (2001) proposed a correction that considerably reduces the bias in the mean and variance of the lz index. In our doctoral project, we will draw on Snijders' (2001) suggestion to correct two other indexes for detecting inappropriate response patterns: the infit mean square (u) and the outfit mean square (w). To this end, we will use a Monte Carlo approach to investigate in more detail the Type I error and the power of these indexes. Our preliminary results, which we will present in this talk, show that the corrected indexes also appear to be more effective than their traditional uncorrected versions.

Full Text
Introduction to the R software
Magis, David ULg

Scientific conference (2013, May 08)

The R software (R Core Development Team, 2012) is open-source statistical software that allows data handling, statistical analyses and model fitting, and graphical representations, among others. It is very flexible and has many pre-installed statistical methods. It runs under all major operating systems, including Windows, Linux/UNIX and MacOS. The R community is worldwide and freely exchanges shared R packages through the CRAN (Comprehensive R Archive Network). However, the user needs some practice to become familiar with the software, as it does not yet have an easy-to-use interface. The purpose of this workshop is to illustrate some aspects of this software for applied purposes. A data set from the field of clinical psychology will be considered throughout the workshop as an illustrative example. Data loading in R, data manipulation, summary and simple statistics, graphics, basic (t-tests, ANOVA, …) and advanced (factor analysis, generalized linear modeling, item response theory, …) statistical analyses will be described and illustrated. Live demonstrations will be run and participants will be encouraged to practice during the workshop. Participants are required to bring their own laptops, preferably with R already installed (technical assistance will be provided before the workshop to help participants install R if necessary). The workshop will be mostly oriented toward Windows users. The illustrative data set will also be available to participants.

Useful references and links:
1) The R software website: http://www.r-project.org
2) The Use R! series of Springer books in general, and more precisely: Zuur, A. F., Ieno, E. N., & Meesters, E. H. W. G. (2009). A beginner’s guide to R. New York: Springer.

Full Text
Peer Reviewed
Modèles polytomiques issus de la théorie de la réponse à l’item
Raîche, Gilles; Béland, Sébastien; Magis, David ULg

Conference (2013, May 06)

The items that make up measurement scales in education and the human sciences often have more than two response options. These are polytomous items. Several item response theory models have been proposed to calibrate such scales. When the response options are unordered, nominal response models can be used. When the options are ordered, graded response, partial credit or rating scale models can be applied. This talk aims to present these models and to identify software solutions for carrying out the calibration. The presentation relies on an example to which the different models are applied, which makes it possible to compare the item and person parameters obtained under each of the selected models.

Full Text
Application of lasso penalization to differential item functioning detection
Magis, David ULg

Scientific conference (2013, February 26)

Identification of differential item functioning (DIF) in dichotomously scored items is often performed item by item. This approach increases the risk of false discoveries (inflated Type I error rate), as all items other than the tested one are assumed to be free of DIF. Some ad hoc procedures, such as item purification and alpha level adjustment for multiple comparisons, have been studied in this context. The purpose of this talk is to focus on a different approach based on penalized likelihood estimation of an IRT-like model. Specifically, a Rasch model is introduced with item-group interaction terms (i.e., DIF effects). Rather than obtaining pointwise estimates of the interaction parameters, which may be impossible because of high collinearity, the DIF effects are estimated with a lasso penalty term. Several criteria for optimally selecting the lasso tuning parameter are discussed, including cross-validation, AIC, BIC, and variants of these criteria. Preliminary results of a simulation study are presented and discussed.

Full Text
Peer Reviewed
Random generation of dichotomous CAT response patterns with the R package catR
Magis, David ULg

Conference (2013, February 14)

The purpose of this talk is to briefly introduce the R package catR, which permits random generation of response patterns under a computerized adaptive testing (CAT) framework. First, an outline of CAT is given, with emphasis on the main concepts (item bank, ability estimation, next item selection, stopping rule, item exposure and content balancing). Then, the performance of the catR package is described by making connections between the general CAT framework and the functionalities of the R functions within catR. An example will be displayed, either as a “live” demonstration of catR or as part of the talk. Potential extensions of catR will also be discussed. The catR package was jointly developed with Gilles Raîche (Université du Québec à Montréal, Canada).
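
To make the CAT concepts listed in this abstract concrete, here is a minimal Python sketch of one simulated adaptive administration under the Rasch model. It is not catR and does not reproduce its functions: the item bank, the function names, the EAP estimator with a standard normal prior, the maximum-information selection rule and the stopping thresholds are all assumptions chosen for illustration, and item exposure control and content balancing are left out.

```python
import math
import random

def prob(theta, b):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def eap(responses, difficulties, grid):
    """EAP ability estimate and posterior SD under a standard normal prior."""
    weights = []
    for t in grid:
        w = math.exp(-0.5 * t * t)
        for x, b in zip(responses, difficulties):
            p = prob(t, b)
            w *= p if x == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    est = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - est) ** 2 * w for t, w in zip(grid, weights)) / total
    return est, math.sqrt(var)

def simulate_cat(true_theta, bank, max_items=10, se_target=0.4, seed=1):
    """Simulate one CAT: select, administer, re-estimate, check the stopping rule."""
    rng = random.Random(seed)
    grid = [-4.0 + 0.1 * k for k in range(81)]
    available = list(range(len(bank)))
    used, responses = [], []
    theta, se = 0.0, float("inf")
    # Stopping rule: test length reached or target precision achieved.
    while available and len(used) < max_items and se > se_target:
        # Next item selection: maximum information P(1 - P) at the current estimate.
        nxt = max(available,
                  key=lambda i: prob(theta, bank[i]) * (1.0 - prob(theta, bank[i])))
        available.remove(nxt)
        # Random generation of the response from the true ability level.
        x = 1 if rng.random() < prob(true_theta, bank[nxt]) else 0
        used.append(nxt)
        responses.append(x)
        # Ability estimation from all responses given so far.
        theta, se = eap(responses, [bank[i] for i in used], grid)
    return used, responses, theta, se

# Hypothetical bank of 21 item difficulties from -2 to 2.
bank = [-2.0 + 0.2 * k for k in range(21)]
used, responses, theta_hat, se_hat = simulate_cat(0.5, bank)
print(used, responses, theta_hat, se_hat)
```

Each pass through the loop maps onto one of the concepts named in the abstract: the bank supplies candidate items, the selection rule picks the next one, a response is generated, the ability estimate is updated, and the stopping rule is checked.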

Full Text
An overview of statistical methods to assess differential item functioning and differential test functioning
Magis, David ULg; Monseur, Christian ULg

Scientific conference (2013, February 12)

This talk broadly focuses on the identification of differential item functioning (DIF) and differential test functioning (DTF). After a short introduction of the key concepts, the best-known methods to detect DIF and DTF with dichotomously or polytomously scored items, and between two or more groups, are presented. Both parametric (i.e., IRT) and nonparametric (i.e., score-based) methods are described in a non-technical way. Several potential applications to PISA surveys are discussed.

On the asymptotic standard error of a class of robust estimators of ability in dichotomous item response models
Magis, David ULg

Report (2013)

In item response theory, the classical estimators of ability are highly sensitive to response disturbances and can return strongly biased estimates of the true underlying ability level. Robust methods were introduced to lessen the impact of such aberrant responses on the estimation process. The computation of asymptotic (i.e., large sample) standard errors (ASE) for these robust estimators, however, has not yet been fully considered. This paper focuses on a broad class of robust ability estimators, defined by an appropriate selection of the weight function and the residual measure, for which the ASE is derived from the theory of estimating equations. The maximum likelihood (ML) and the robust estimators, together with their estimated ASE, are then compared through a simulation study. It is concluded that both the estimators and their ASE perform similarly in the absence of response disturbances, while the robust estimator and its estimated ASE are less biased and outperform their ML counterparts in the presence of response disturbances with a large impact on the item response process.

Full Text
Peer Reviewed
A note on the item information function of the four-parameter logistic model
Magis, David ULg

in Applied Psychological Measurement (2013), 37

This paper focuses on the four-parameter logistic (4PL) model (Barton & Lord, 1981) as an extension of the usual three-parameter logistic (3PL) model with an upper asymptote possibly different from one. For a given item with fixed item parameters, Lord (1980) derived the value of the latent ability level that maximizes the item information function under the 3PL model. The purpose of this paper is to extend this result to the 4PL model. A generic algebraic method is developed for that purpose. The result is illustrated with a practical example, and several potential applications are outlined.
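
The quantity discussed in this abstract can be explored numerically. The sketch below evaluates the 4PL item information function and locates its maximizer by a plain grid search; as a sanity check it compares the 3PL case (upper asymptote equal to one) with the well-known closed form for the 3PL under the logistic metric without scaling constant. The paper's algebraic 4PL result is not reproduced, and the function names, item parameters and grid settings are illustrative assumptions.

```python
import math

def p4pl(theta, a, b, c, d):
    """4PL response probability with lower asymptote c and upper asymptote d."""
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b, c, d):
    """Item information P'(theta)^2 / (P(theta) * (1 - P(theta)))."""
    p = p4pl(theta, a, b, c, d)
    dp = a * (p - c) * (d - p) / (d - c)  # first derivative of P
    return dp * dp / (p * (1.0 - p))

def argmax_information(a, b, c, d, lo=-6.0, hi=6.0, step=0.001):
    """Ability level with maximum information, found by grid search."""
    n = int(round((hi - lo) / step))
    grid = (lo + k * step for k in range(n + 1))
    return max(grid, key=lambda t: item_information(t, a, b, c, d))

def argmax_3pl_closed_form(a, b, c):
    """Closed-form maximizer for the 3PL: b + ln((1 + sqrt(1 + 8c)) / 2) / a."""
    return b + math.log((1.0 + math.sqrt(1.0 + 8.0 * c)) / 2.0) / a

# Hypothetical item parameters.
a, b, c = 1.2, 0.5, 0.2
theta_3pl_numeric = argmax_information(a, b, c, 1.0)
theta_3pl_closed = argmax_3pl_closed_form(a, b, c)
theta_4pl_numeric = argmax_information(a, b, c, 0.95)
print(theta_3pl_numeric, theta_3pl_closed, theta_4pl_numeric)
```

Lowering the upper asymptote below one changes both the height and the location of the information peak, which is why a separate result for the 4PL is of interest.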

Full Text
Peer Reviewed
Évaluation d’un test de lecture en anglais par deux méthodes de détection du fonctionnement différentiel d’items
Pichette, François; Raîche, Gilles; Béland, Sébastien et al

in Revue des Sciences de l'Education (2013), 37

This study examines the presence of differential item functioning by respondent gender in an English reading comprehension test administered to 171 French-speaking university students. Two nonparametric methods are used: the Mantel-Haenszel test and the logistic regression model. Out of a total of 64 items, two show differential functioning according to the Mantel-Haenszel test, while five additional items are flagged by logistic regression. This small number of items suggests that the test is reasonably fair, but the observed differences underline the need for additional analyses to clarify the status of these items.
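
As background for the first of the two methods named in this abstract, the sketch below computes the Mantel-Haenszel common odds ratio for one item across total-score strata, together with the usual delta-scale transformation (-2.35 times the natural log of the odds ratio). The counts are invented for illustration and have no connection to the study's data; the function names are assumptions, and the associated chi-square test is not shown.

```python
import math

def mantel_haenszel_odds_ratio(tables):
    """Common odds ratio over strata.

    Each table is (A, B, C, D): reference correct, reference incorrect,
    focal correct, focal incorrect, within one total-score stratum.
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

def mh_delta(alpha):
    """Delta-scale effect size: -2.35 * ln(alpha)."""
    return -2.35 * math.log(alpha)

# Hypothetical counts for one item in three score strata.
no_dif_tables = [(10, 10, 10, 10), (20, 5, 20, 5)]
dif_tables = [(12, 8, 6, 14), (15, 5, 10, 10), (18, 2, 14, 6)]

alpha_no_dif = mantel_haenszel_odds_ratio(no_dif_tables)
alpha_dif = mantel_haenszel_odds_ratio(dif_tables)
print(alpha_no_dif, mh_delta(alpha_no_dif))
print(alpha_dif, mh_delta(alpha_dif))
```

An odds ratio of one (delta of zero) indicates no group difference once total score is controlled; in the second invented example the reference group has higher odds of success in every stratum, so the odds ratio exceeds one and delta is negative.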

Full Text
Peer Reviewed
Item purification does not always improve DIF detection: a counter-example with Angoff’s Delta plot
Magis, David ULg; Facon, Bruno

in Educational & Psychological Measurement (2013), 73

Item purification is an iterative process that is often advocated as improving the identification of items affected by differential item functioning (DIF). With test-score based DIF detection methods, item purification iteratively removes the items currently flagged as DIF from the test scores in order to get purified sets of items, unaffected by DIF. The purpose of this paper is to highlight that item purification is not always useful and that a single run of the DIF method may return equally suitable results. Angoff’s Delta plot is considered as a counter-example DIF method, with a recent improvement to the derivation of the classification threshold. Several possible item purification processes may be defined with this method, and all of them are compared through a simulation study and a real data set analysis. It appears that none of these purification processes clearly improves the Delta plot performance. A tentative explanation is drawn from the conceptual difference between the modified Delta plot and the other traditional DIF methods.
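
For orientation, the classical Delta plot can be sketched in a few lines: proportions of correct responses in each group are transformed to delta scores, the major axis of the resulting scatter is fitted, and items far from that axis are flagged. The sketch below uses the classical fixed threshold of 1.5; the improved threshold derivation and the purification schemes studied in the paper are not reproduced. The proportions and function names are invented for illustration, and the code assumes a non-zero covariance between the two sets of delta scores.

```python
import math
from statistics import NormalDist, mean

def delta_score(p):
    """Delta transformation of a proportion correct: 4 * z(1 - p) + 13."""
    return 4.0 * NormalDist().inv_cdf(1.0 - p) + 13.0

def delta_plot_distances(p_ref, p_foc):
    """Signed perpendicular distances of the items to the major axis."""
    x = [delta_score(p) for p in p_ref]
    y = [delta_score(p) for p in p_foc]
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((v - mx) ** 2 for v in x) / (n - 1)
    syy = sum((v - my) ** 2 for v in y) / (n - 1)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y)) / (n - 1)
    # Major axis (principal axis) of the scatter of delta points.
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    intercept = my - slope * mx
    return [(slope * xi + intercept - yi) / math.sqrt(slope ** 2 + 1.0)
            for xi, yi in zip(x, y)]

# Hypothetical proportions correct for six items in each group.
p_reference = [0.20, 0.40, 0.50, 0.60, 0.80, 0.70]
p_focal = [0.18, 0.42, 0.47, 0.61, 0.79, 0.30]

distances = delta_plot_distances(p_reference, p_focal)
flagged = [i for i, dist in enumerate(distances) if abs(dist) > 1.5]
print(distances, flagged)
```

A purification step would refit the major axis after dropping the flagged items and repeat until the flagged set stabilizes; the abstract reports that such steps did not clearly improve performance for this method.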

Full Text
Peer Reviewed
Non-graphical solutions to Cattell's scree test
Raîche, Gilles; Walls, Ted; Magis, David ULg et al

in Methodology: European Journal of Research Methods for the Behavioral and Social Sciences (2013), 9

Most of the strategies that have been proposed to determine the number of components that account for the most variation in a principal components analysis of a correlation matrix rely on the analysis of the eigenvalues and on numerical solutions. Cattell’s scree test is a graphical strategy with a nonnumerical solution to determine the number of components to retain. Like Kaiser’s rule, this test is one of the most frequently used strategies for determining the number of components to retain. However, the graphical nature of the scree test does not definitively establish the number of components to retain. To circumvent this issue, some numerical solutions are proposed, one in the spirit of Cattell’s work and dealing with the scree part of the eigenvalues plot, and one focusing on the elbow part of this plot. A simulation study compares the efficiency of these solutions to those of other previously proposed methods. Extensions to factor analysis are possible and may be particularly useful with many low-dimensional components.
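
To give a feel for what a numerical reading of the scree plot can look like, the sketch below contrasts Kaiser's rule (eigenvalues greater than one) with a simple elbow locator based on the largest second difference of the eigenvalues. This is one plausible reading of the "elbow" idea mentioned in the abstract and is stated here as an assumption; it should not be taken as the authors' exact procedures. The eigenvalues and function names are invented.

```python
def kaiser_rule(eigenvalues):
    """Number of components with an eigenvalue greater than 1."""
    return sum(1 for ev in eigenvalues if ev > 1.0)

def elbow_by_second_difference(eigenvalues):
    """Number of components located before the sharpest elbow.

    The elbow is taken as the position with the largest second difference
    lambda[i+1] - 2 * lambda[i] + lambda[i-1] (eigenvalues sorted in
    decreasing order); its 0-based index equals the number of components
    that precede it.
    """
    second_diff = {
        i: eigenvalues[i + 1] - 2.0 * eigenvalues[i] + eigenvalues[i - 1]
        for i in range(1, len(eigenvalues) - 1)
    }
    return max(second_diff, key=second_diff.get)

# Hypothetical eigenvalues of a 5 x 5 correlation matrix (they sum to 8 only
# for illustration and need not match a real matrix).
eigenvalues = [4.0, 2.5, 0.6, 0.5, 0.4]

n_kaiser = kaiser_rule(eigenvalues)
n_elbow = elbow_by_second_difference(eigenvalues)
print(n_kaiser, n_elbow)
```

In this invented example both rules point to two components; with real data the two criteria can disagree, which is part of the motivation for comparing strategies by simulation.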

Full Text
Peer Reviewed
Impact de la méthode d'estimation du niveau d'habileté et du choix des premiers items sur l'efficacité de l'administration adaptative du TCALS II
Magis, David ULg; Raîche, Gilles

in Raîche, Gilles; Ndinga, Pascal; Meunier, Hélène (Eds.) Des mécanismes pour assurer la validité de l'interprétation de la mesure en éducation. Tome 3 : aspects pratiques (2013)

The TCALS II (test de classement en anglais, langue seconde, au collégial, 2nd version; the college-level English as a second language placement test) is a questionnaire with a fixed number of 85 items administered to students in the province of Quebec who are beginning college-level studies. A computerized adaptive version of this test is considered for the first time in this study. Two issues are examined more closely: the choice of an optimal ability estimation method and the selection of the first items of the test. Both issues are studied simultaneously through Monte Carlo simulation under several stopping rules related to test length. It is concluded that the choice of the first items has little effect on ability estimation, whereas notable differences appear between the four estimation methods compared. Some conclusions and recommendations are drawn for further work.
