TCLUST: Trimming Approach of Robust Clustering Method

Authors

  • Muhamad Alias Md. Jedi
  • Robiah Adnan

DOI:

https://doi.org/10.11113/mjfas.v8n4.154

Keywords:

TCLUST, Trimmed k-means, Number of Groups, Strength of Group-assignments

Abstract

TCLUST is a statistical clustering method based on a modification of the trimmed k-means algorithm. It is called a “crisp” clustering approach because each observation is either eliminated (trimmed) or assigned to a group. TCLUST strengthens the group assignment by placing constraints on the cluster scatter matrices. The emphasis in this paper is on restricting the eigenvalues, λ, of the scatter matrices. The purpose of imposing constraints is to maximize the log-likelihood function of the spurious-outlier model. A review of different robust clustering approaches is presented for comparison with TCLUST. This paper discusses the nature of the TCLUST algorithm, how to determine the number of clusters (groups) properly, and how to measure the strength of the group assignments. Finally, the R package for TCLUST, which implements several types of scatter restriction, is presented; it makes the algorithm more flexible when choosing the number of clusters and the trimming proportion.
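The “crisp” trimming idea described in the abstract can be illustrated with a minimal sketch of plain trimmed k-means, the method TCLUST builds on. This is not the TCLUST algorithm itself: it uses Euclidean distances only and omits the eigenvalue restriction on the scatter matrices and the likelihood-based objective. The function name `trimmed_kmeans`, its parameters, the starting centers, and the toy data are all hypothetical and chosen for illustration.

```python
import math


def trimmed_kmeans(points, centers, alpha, n_iter=20):
    """Illustrative trimmed k-means (NOT the full TCLUST algorithm).

    points  : list of (x, y) tuples
    centers : list of initial (x, y) centers (hypothetical starting values)
    alpha   : trimming proportion, 0 <= alpha < 1
    Returns (centers, labels); label -1 marks a trimmed observation.
    """
    n = len(points)
    n_trim = math.floor(alpha * n)  # number of observations to discard
    labels = [-1] * n
    for _ in range(n_iter):
        # Squared distance from each point to its nearest center.
        nearest = []
        for i, p in enumerate(points):
            d2 = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centers]
            j = min(range(len(centers)), key=d2.__getitem__)
            nearest.append((d2[j], i, j))
        # Crisp assignment: keep the n - n_trim closest points, trim the rest.
        nearest.sort()
        labels = [-1] * n
        for _, i, j in nearest[: n - n_trim]:
            labels[i] = j
        # Recompute each center from its untrimmed members only.
        new_centers = []
        for j, c in enumerate(centers):
            members = [points[i] for i in range(n) if labels[i] == j]
            if members:
                new_centers.append((sum(m[0] for m in members) / len(members),
                                    sum(m[1] for m in members) / len(members)))
            else:
                new_centers.append(c)  # leave an empty cluster's center as is
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return centers, labels


# Toy data (made up): two tight groups plus one gross outlier at (50, 50).
data = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5),
        (50, 50)]
centers, labels = trimmed_kmeans(data, [(0, 0), (10, 10)], alpha=0.1)
print(centers)
print(labels)
```

With a trimming proportion of 0.1 and ten observations, one point is discarded at each iteration. In this toy example that point is the outlier, so the two centers are computed from the remaining nine observations only. TCLUST replaces the distance criterion with a likelihood criterion and adds the scatter-matrix constraints described in the abstract.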

Published

16-07-2014