International Journal of Science and Research (IJSR)

Fully Refereed | Open Access | Double Blind Peer Reviewed

ISSN: 2319-7064





Research Paper | Computer Science & Engineering | India | Volume 5 Issue 5, May 2016 | Rating: 7.1 / 10


Text Categorization using Jaccard Coefficient for Text Messages

Ankita Jadhao | Dr. A. J. Agrawal [3]


Abstract: The rapid day-to-day growth of web applications and electronic documents creates a need for automatic text classification. Appropriate classification methods yield good experimental results and give direction to further processing of the text. The text may take the form of e-documents, news reports, blogs, messages, social media comments, e-books, web content, and so on, all of which require text mining to extract meaningful knowledge. Some natural language processing techniques and machine learning algorithms are well suited to capturing the meaning of such documents and classifying them. Many techniques exist for classifying text documents; this paper surveys them, highlights the important methodologies, and aims to help in selecting a technique appropriate to a given text-classification process. It also details the implementation of one of these methods, which classifies an incoming text message into one of two categories, suspicious or not suspicious, according to the terms found in it. In this case, the Jaccard coefficient method gives the best result for classifying messages according to the words they contain. The text classification process includes several steps, such as feature selection, vector representation, and the learning algorithm.
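
As a rough illustration of the idea described in the abstract, the Jaccard coefficient of two term sets A and B is |A ∩ B| / |A ∪ B|. The sketch below scores a message against a list of suspicious terms and labels it using a cutoff. The term list, the tokenizer, the 0.1 threshold, and all function names are assumptions made for illustration; this page does not give the paper's actual term list, preprocessing, or decision rule.

```python
import re

# Hypothetical term list for illustration only; not taken from the paper.
SUSPICIOUS_TERMS = {"bomb", "attack", "kill", "weapon", "explosive"}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B|; returns 0.0 if both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def classify(message, terms=SUSPICIOUS_TERMS, threshold=0.1):
    """Label a message by comparing its Jaccard score to an assumed threshold."""
    score = jaccard(tokenize(message), terms)
    label = "suspicious" if score >= threshold else "not suspicious"
    return label, score


if __name__ == "__main__":
    for msg in ("Plan the attack with a bomb tonight", "See you at lunch tomorrow"):
        print(msg, "->", classify(msg))
```

Because the score is divided by the size of the union, long messages that contain only one or two listed terms score low. A real system would likely need stop-word removal, stemming, and a threshold tuned on labeled messages.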


Keywords: Document Classification, Natural Language processing, Information retrieval, Text mining


Edition: Volume 5 Issue 5, May 2016


Pages: 2046 - 2050

