Survey Paper | Computer Science & Engineering | India | Volume 5 Issue 12, December 2016
A Survey on Different Duplicate Detection Methods
Tanvee Meshram, Nivedita Kadam
Abstract: Duplicate records for real-world entities are a common problem. Such duplicates enter a database through multiple entries for the same data, incomplete data entries, and errors during transactions. Today's data sets are very complex, which makes removing duplicates a difficult task. Duplicate detection methods identify cases where multiple entries refer to the same real-world entity. In most cases, duplicate entries cause transactional errors, which in turn affect operational and strategic decision making in an organization and result in monetary losses and damage to its brand image. One example is multiple Aadhaar cards (government identification cards in India) created for the same person at different locations, with the data then used in different systems for identification purposes across industries and locations. This paper compares traditional duplicate detection methods: the Incremental Sorted Neighborhood Method (ISNM), the Duplicate Count Strategy (DCS++), the Progressive Sorted Neighborhood Method (PSNM), and the Parallel Progressive Sorted Neighborhood Method (PPSNM).
Keywords: PPSNM, Duplicate Detection, MapReduce, Parallel Progressive Sorted Neighborhood Method
Edition: Volume 5 Issue 12, December 2016
Pages: 1222 - 1224