TCLUST: Trimming Approach of Robust Clustering Method
DOI:
https://doi.org/10.11113/mjfas.v8n4.154
Keywords:
TCLUST, Trimmed k-means, Number of Group, Strength of Group-assignments
Abstract
TCLUST is a statistical clustering method based on a modification of the trimmed k-means clustering algorithm. It is called a “crisp” clustering approach because each observation is either eliminated (trimmed) or assigned to a group. TCLUST strengthens the group assignments by placing constraints on the cluster scatter matrices. This paper focuses on restricting the eigenvalues, λ, of the scatter matrices. The constraints are imposed in order to maximize the log-likelihood function of the spurious-outlier model. Different robust clustering approaches are reviewed for comparison with TCLUST. The paper discusses the nature of the TCLUST algorithm, how to determine the number of clusters or groups properly, and how to measure the strength of the group assignments. Finally, the paper describes the R package for TCLUST, which implements the different types of scatter restriction and makes the algorithm more flexible in choosing the number of clusters and the trimming proportion.
References
Fritz et al. A Fast Algorithm for Robust Constrained Clustering, University of Valladolid, Spain, preprint available at http://www.eio.uva.es/infor/personas/tclust_algorithm.pdf, 2011.
Fritz et al. TCLUST: An R Package for a Trimming Approach to Cluster Analysis, preprint available at http://cran.r-project.org/web/packages/tclust/vignettes/tclust.pdf, May 4, 2011.
Garcia et al. A Review of Robust Clustering Methods, Advances in Data Analysis and Classification, 4(2-3), 89-109.
Garcia et al. Exploring the Number of Groups in Robust Model-Based Clustering, University of Valladolid, Spain, preprint available at http://www.eio.uva.es/infor/personas/langel.html, 2011.
Garcia et al. Robust Properties of k-means and Trimmed k-means, J Am Stat Assoc, 1999, 94:956-969.
Garcia et al. Trimming Tools in Exploratory Data Analysis, J Comput Graph Stat, 2003, 12:434-449.
Hathaway, R.J. A Constrained Formulation of Maximum Likelihood Estimator for Normal Mixture Distributions, Ann. Statist., 13, 795-800, 1985.
Scott et al. Clustering Based on Likelihood Ratio Criteria, Biometrics, 27, 387-397, 1971.