Downloads: 137 | Views: 391
Research Paper | Computer Science & Engineering | India | Volume 3 Issue 3, March 2014 | Popularity: 6.7 / 10
An Efficient Divergence and Distribution Based Similarity Measure for Clustering Of Uncertain Data
Geetha, Mary Shyla
Abstract: Data Mining is the extraction of hidden predictive information from large databases. Clustering is one of the popular data mining techniques. Clustering on uncertain data, one of the essential tasks in mining uncertain data, posts significant challenges on both modeling similarity between uncertain objects and developing efficient computational methods. The previous methods extend traditional partitioning clustering methods. Such methods cannot handle uncertain objects that are geometrically indistinguishable, such as products with the same mean but very different variances in customer ratings. Surprisingly, probability distributions, which are essential characteristics of uncertain objects, have not been considered in measuring similarity between uncertain objects. In Existing method to use the well-known Kullback-Leibler divergence to measure similarity between uncertain objects in both the continuous and discrete cases, and integrate it into partitioning and density-based clustering methods to cluster uncertain objects. It is very costly or even infeasible. The proposed work introduces the well-known Kernel skew divergence to measure similarity between uncertain objects in both the continuous and discrete cases. Measuring the cluster similarity with Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time and/or space and to further speed up the computation.
Keywords: Clustering, uncertain data, Kernel skew Divergence and distribution
Edition: Volume 3 Issue 3, March 2014
Pages: 333 - 339
Please Disable the Pop-Up Blocker of Web Browser
Verification Code will appear in 2 Seconds ... Wait