International Journal of Science and Research (IJSR)


ISSN: 2319-7064



Survey Paper | Computer Engineering | India | Volume 10 Issue 11, November 2021 | Popularity: 4.5 / 10


     

Investigation of Automatic Data Extraction Method from Complex Web Pages

Nitin More, Rupali A. Mangrule


Abstract: The Internet offers a great deal of useful information, but much of it is formatted for human readers, which makes it laborious to extract relevant data from numerous sources. There is therefore a strong need for robust, flexible Information Extraction systems that transform web pages into program-friendly structures such as a relational database. The proposed system focuses on information extraction from websites. We cluster web documents based on their common template structures so that the template for each cluster is extracted at the same time. The World Wide Web is a huge and rapidly growing source of useful information, used to publish and access information on the Internet. Websites use different templates, filled with content, to give readers quick access. The proposed approach is used to extract information from such template-based websites.


Keywords: Information Extraction, Clustering, Minimum Description Length Principle, MinHash, Template extraction, Clustering web pages
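As a rough illustration of the clustering idea named in the abstract and keywords, the following minimal Python sketch groups pages whose HTML structure looks alike. It is not the authors' method: it assumes each page can be represented by its set of root-to-element tag paths, estimates Jaccard similarity between pages with MinHash signatures, and swaps in a fixed similarity threshold where the paper's keywords suggest a Minimum Description Length criterion. All function names, the threshold, and the number of hash functions are hypothetical choices made for this example.

```python
import hashlib
from html.parser import HTMLParser

# Elements that never have a closing tag (assumed subset, for illustration).
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class PathCollector(HTMLParser):
    """Collects the set of root-to-element tag paths seen in a page."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.paths = set()

    def handle_starttag(self, tag, attrs):
        self.paths.add("/".join(self.stack + [tag]))
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop back to (and including) the matching open tag.
            while self.stack and self.stack.pop() != tag:
                pass


def tag_paths(html):
    """Return the set of tag paths for one HTML document."""
    collector = PathCollector()
    collector.feed(html)
    collector.close()
    return collector.paths


def _hash(seed, item):
    """Deterministic 64-bit hash of an item, varied by seed."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=64):
    """MinHash signature: the minimum hash value per seeded hash function."""
    return tuple(
        min((_hash(seed, x) for x in items), default=0)
        for seed in range(num_hashes)
    )


def estimated_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def cluster_pages(pages, threshold=0.5, num_hashes=64):
    """Greedy clustering: join the first cluster whose representative
    signature is similar enough, otherwise start a new cluster.
    The fixed threshold stands in for an MDL-based decision (assumption)."""
    clusters = []  # list of (representative signature, member indices)
    for idx, page in enumerate(pages):
        sig = minhash_signature(tag_paths(page), num_hashes)
        for rep, members in clusters:
            if estimated_jaccard(rep, sig) >= threshold:
                members.append(idx)
                break
        else:
            clusters.append((sig, [idx]))
    return [members for _, members in clusters]
```

Calling `cluster_pages` on a list of HTML strings returns lists of page indices, one list per presumed template. Pages with identical tag-path sets always produce identical signatures, so they land in the same cluster regardless of their text content.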


Edition: Volume 10 Issue 11, November 2021


Pages: 668 - 671



Nitin More, Rupali A. Mangrule, "Investigation of Automatic Data Extraction Method from Complex Web Pages", International Journal of Science and Research (IJSR), Volume 10 Issue 11, November 2021, pp. 668-671, URL: https://www.ijsr.net/getabstract.php?paperid=SR211112161356, DOI: https://www.doi.org/10.21275/SR211112161356



Downloads: 354 | Views: 2002

Computer Engineering, India, Volume 9 Issue 1, January 2020

Pages: 381 - 386

Machine Learning Algorithms - A Review

Batta Mahesh


Downloads: 329 | Views: 584

Computer Engineering, India, Volume 9 Issue 5, May 2020

Pages: 597 - 602

Python Tools for Big Data Analytics

Lt Col Rahul Dutt Sharma


Downloads: 217 | Views: 411

Computer Engineering, India, Volume 9 Issue 3, March 2020

Pages: 488 - 491

Dog Breed Identification Using Convolution Neural Network and Web Scraping

Mohamed Sultan M, Naveen S, Praveen Kumar C, Arun Manicka Raja M


Downloads: 207 | Views: 395

Computer Engineering, Iraq, Volume 9 Issue 3, March 2020

Pages: 529 - 532

Implementation of Run Length Encoding Using Verilog HDL

Hayder Waleed Shnain, Mohammed Najm Abdullah, Hassan Awheed Jeiad


Downloads: 165 | Views: 356

Computer Engineering, Iraq, Volume 9 Issue 5, May 2020

Pages: 288 - 292

Deep Learning-based Deaf & Mute Gesture Translation System

Azher Atallah Fahad, Hassan Jaleel Hassan, Salma Hameedi Abdullah

