Design and Implementation of K-Means and Hierarchical Document Clustering on Hadoop
International Journal of Science and Research (IJSR)


ISSN: 2319-7064



M.Tech / M.E / PhD Thesis | Computer Science & Engineering | India | Volume 3 Issue 10, October 2014


     


Y. K. Patil, Prof. V. S. Nandedkar


Abstract: Document clustering is an important area of data mining. Companies such as Yahoo, Google, Facebook and Twitter use Hadoop to implement real-time applications. Typical inputs for document clustering include emails, social media blogs, movie review comments and books. This paper focuses on document clustering using Hadoop, a newer technology for processing documents in parallel. The computing time for document clustering on Hadoop is lower than that of Java-based implementations. The authors propose the design and implementation of the Tf-Idf, K-means and hierarchical clustering algorithms on Hadoop.


Keywords: Hadoop, Tf-Idf, Cosine Similarity, K-means and Hierarchical clustering
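
The abstract names Tf-Idf weighting and cosine similarity as building blocks for the clustering algorithms. As a rough illustration only, the single-machine Python sketch below shows how the two fit together. It is not the authors' Hadoop implementation: the function names `tf_idf` and `cosine_similarity`, the whitespace tokenizer, the sample documents and the specific weighting (tf = count / document length, idf = log(N / df)) are all assumptions made for this example, since the abstract does not specify them.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting: tf = count / document length, idf = log(N / df).
    The abstract does not state which Tf-Idf variant the paper uses.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample documents, chosen only for illustration.
docs = [
    "hadoop cluster job",
    "hadoop cluster node",
    "movie review comment",
]
vectors = tf_idf(docs)
sim_related = cosine_similarity(vectors[0], vectors[1])
sim_unrelated = cosine_similarity(vectors[0], vectors[2])
```

Here the first two documents share terms and so receive a positive similarity, while the first and third share none and score zero. A clustering step such as K-means or hierarchical clustering could then group documents using these similarity scores; in a Hadoop setting the term counting and document-frequency steps would presumably be split across map and reduce tasks, which this sketch does not attempt to show.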


Edition: Volume 3 Issue 10, October 2014


Pages: 1566 - 1570



Y. K. Patil, Prof. V. S. Nandedkar, "Design and Implementation of K-Means and Hierarchical Document Clustering on Hadoop", International Journal of Science and Research (IJSR), Volume 3 Issue 10, October 2014, pp. 1566-1570, https://www.ijsr.net/getabstract.php?paperid=OCT14526, DOI: https://www.doi.org/10.21275/OCT14526
