Research Paper | Computational Linguistics | India | Volume 13 Issue 11, November 2024
Parts of Speech (POS) Tagging in Telugu Corpora Using CRF Algorithm
Rajula Valaraju
Abstract: Natural Language Processing (NLP), a branch of computer science and Artificial Intelligence (AI), enables machines to comprehend human language and assist with linguistic tasks. Parts of Speech (POS) tagging, which assigns a tag to each word based on its meaning and context, is a foundational early step in many NLP tasks. This paper discusses POS tagging in Telugu using Conditional Random Fields (CRF), a sequence modelling algorithm that is particularly effective at identifying entities or text patterns, such as POS tags, in highly inflectional and agglutinative languages. Telugu is one such language, widely spoken in southern India (mainly Andhra Pradesh and Telangana); it belongs to the Dravidian family and follows Subject-Object-Verb (S-O-V) word order. Compared with other machine learning algorithms, CRF has proven more effective at overcoming the label-bias problem. To learn the language features and tag the test corpus, the study uses an annotated corpus of 62,996 words and a tag set of 18 tags. The study achieves an accuracy of 80.17%.
Keywords: POS tagging, CRF Model, BIS Tag set, Telugu Language
Edition: Volume 13 Issue 11, November 2024
Pages: 188 - 190
DOI: https://www.doi.org/10.21275/SR241102123024
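The abstract describes tagging each word from its own form and its context, then reporting accuracy on a test corpus. A minimal sketch of those two steps follows. The abstract does not list the paper's features or toolkit, so every feature here (suffixes, neighbouring words, boundary flags) and every function name is an assumption for illustration. The romanized sentence and the PRON/NOUN/VERB labels are toy placeholders, not the paper's data or its 18-tag set.

```python
from typing import Dict, List


def token_features(sentence: List[str], i: int) -> Dict[str, object]:
    """Build an illustrative feature dict for the token at position i."""
    word = sentence[i]
    features: Dict[str, object] = {
        "bias": 1.0,
        "word": word,
        "length": len(word),
        "is_digit": word.isdigit(),
    }
    # Suffixes of 1 to 4 characters (assumed features): in an agglutinative
    # language, case and tense markers tend to appear at the end of the word.
    for n in range(1, 5):
        if len(word) > n:
            features[f"suffix{n}"] = word[-n:]
    # Context features: the neighbouring words, or sentence-boundary flags.
    if i > 0:
        features["prev_word"] = sentence[i - 1]
    else:
        features["BOS"] = True
    if i < len(sentence) - 1:
        features["next_word"] = sentence[i + 1]
    else:
        features["EOS"] = True
    return features


def sentence_features(sentence: List[str]) -> List[Dict[str, object]]:
    """One feature dict per token, in sentence order."""
    return [token_features(sentence, i) for i in range(len(sentence))]


def token_accuracy(gold: List[List[str]], predicted: List[List[str]]) -> float:
    """Fraction of tokens whose predicted tag equals the gold tag."""
    total = 0
    correct = 0
    for gold_sent, pred_sent in zip(gold, predicted):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("gold and predicted sentences differ in length")
        for g, p in zip(gold_sent, pred_sent):
            total += 1
            correct += int(g == p)
    return correct / total if total else 0.0


# Toy usage with a romanized S-O-V sentence (hypothetical example data).
toy_sentence = ["nenu", "pustakam", "chaduvutunnanu"]
toy_features = sentence_features(toy_sentence)
toy_gold = [["PRON", "NOUN", "VERB"]]
toy_pred = [["PRON", "NOUN", "NOUN"]]
toy_score = token_accuracy(toy_gold, toy_pred)
```

A list of per-token feature dicts for each sentence is the input shape that CRF toolkits such as sklearn-crfsuite accept, so output from `sentence_features` could be passed to a trainer of that kind. Text in Telugu script would likely need suffixes taken over syllable or grapheme units rather than raw code points.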
Downloads: 129 | Views: 197
Computational Linguistics, India, Volume 1 Issue 3, December 2012
Pages: 163 - 167
Isolated Spoken Word Identification in Malayalam using Mel-frequency Cepstral Coefficients and K-means clustering
Sreejith C, Reghuraj P C
Downloads: 67 | Views: 196
Computational Linguistics, India, Volume 9 Issue 10, October 2020
Pages: 1664 - 1669
Aspect Based Sentiment Analysis for Users Review Dataset Using Deep Learning and BERT
Karan Arora, Sarthak Arora
Downloads: 45 | Views: 150
Computational Linguistics, India, Volume 10 Issue 3, March 2021
Pages: 185 - 188
Comparison of Various Models in the Context of Language Identification (Indo Aryan Languages)
Salman Alam
Downloads: 2 | Views: 53
Computational Linguistics, India, Volume 13 Issue 11, November 2024
Pages: 367 - 371
A Comprehensive Review of Sentiment Analysis: From Rule-Based Methods to Deep Learning and Future Directions
N John Kuotsu