Reference : Breakdown points of the TCLUST procedure
Scientific conferences in universities or research centers : Scientific conference in universities or research centers
Physical, chemical, mathematical & earth Sciences : Mathematics
http://hdl.handle.net/2268/104212
Breakdown points of the TCLUST procedure
English
Ruwet, Christel [Université de Liège - ULg > Département de mathématique > Statistique mathématique]
22-Sep-2011
Seminario del departamento de Estadística e Investigación operativa de la UVa
Valladolid
Spain
[en] The TCLUST procedure is a new robust clustering method introduced by García-Escudero et al. (2008). It aims to find clusters with different scatters and weights. Because the corresponding objective function can be unbounded, a restriction is imposed on the eigenvalue ratio of the scatter matrices. Robustness is obtained by trimming a given proportion of the observations. This trimming level, like the number of clusters, has to be chosen by the practitioner. Suitable values for both parameters can be obtained through careful examination of classification trimmed likelihood curves (García-Escudero et al., 2010). The first part of this talk will consist of a brief presentation of this clustering procedure and the related R package (tclust).
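The two ingredients named above, the eigenvalue-ratio restriction and the trimming of a fixed proportion of observations, can be illustrated with a minimal Python sketch. It is not the tclust implementation. The function names are hypothetical, the scatter matrices are assumed to be 2x2, symmetric and positive definite, the per-observation scores stand in for whatever contribution to the objective function the procedure uses, and trimming ceil(n * alpha) observations is an assumption about the rounding convention.

```python
import math

def eigenvalues_2x2(s):
    """Closed-form eigenvalues of a symmetric 2x2 matrix [[a, b], [b, d]]."""
    a, b, d = s[0][0], s[0][1], s[1][1]
    mean = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + b ** 2)
    return mean - radius, mean + radius

def eigenvalue_ratio(scatters):
    """Largest eigenvalue over smallest eigenvalue, taken across all clusters."""
    eigs = [lam for s in scatters for lam in eigenvalues_2x2(s)]
    return max(eigs) / min(eigs)  # assumes positive definite scatters

def satisfies_constraint(scatters, c):
    """True when the eigenvalue ratio does not exceed the constant c >= 1."""
    return eigenvalue_ratio(scatters) <= c

def kept_after_trimming(scores, alpha):
    """Indices kept after discarding the ceil(n * alpha) lowest-scoring points."""
    n_trim = math.ceil(len(scores) * alpha)
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return sorted(order[n_trim:])
```

For example, the scatters `[[1, 0], [0, 1]]` and `[[4, 0], [0, 2]]` have eigenvalues 1, 1, 2 and 4, so they satisfy the restriction for any constant of at least 4 and violate it for smaller constants.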
In the second part of the talk, the robustness of the TCLUST procedure, and more precisely its breakdown behavior, will be studied. We will see that the estimator of the scatter matrices can resist more outliers than the number of trimmed observations. However, the breakdown point of the estimator of the centers is very poor: two observations are sufficient to make the centers break down. This is due to the stringency of the classical breakdown point, which requires the estimator to behave well even on samples that can hardly be clustered. For this reason, Gallegos and Ritter (2005) introduced the restricted breakdown point. The idea is to restrict the analysis to the class of “well-separated” data sets. On this class, the estimator of the centers has a breakdown point of α, the level of trimming.
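For orientation, the classical notion referred to above is usually formalized as the finite-sample replacement breakdown point. The form below is the standard one for a location-type estimator and is given only as a sketch. The symbols T (the estimator), X (a sample of size n) and X'_m (the sample with m observations replaced by arbitrary values) are notation assumed here, and the exact definitions used in the talk, in particular for the scatter matrices and for the restricted version, may differ.

```latex
\varepsilon^{*}_{n}(T, X) \;=\; \min_{1 \le m \le n}
\left\{ \frac{m}{n} \;:\; \sup_{X'_{m}} \bigl\lVert T(X'_{m}) - T(X) \bigr\rVert = \infty \right\}
```

In this notation, the statement that two observations suffice to break the centers corresponds to a breakdown point of 2/n, while the restricted version takes X only from the class of well-separated data sets.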

File(s) associated with this reference

Additional material(s):

File: Valladolid_September2011.pdf
Size: 1.3 MB
Access: Restricted access
