International Journal of Science and Research (IJSR)

Call for Papers | Fully Refereed | Open Access | Double Blind Peer Reviewed

ISSN: 2319-7064


Downloads: 112 | Views: 261

Review Papers | Computer Science & Engineering | India | Volume 4 Issue 4, April 2015 | Popularity: 6.3 / 10


     

A Review on Identifying the Main Content From Web Pages

Madhura R. Kaddu, Dr. R. B. Kulkarni


Abstract: A web page is a web document that can hold a large amount of information, and with the rapid growth of the World Wide Web, users can easily access web pages from anywhere over the internet. A web page contains the main content together with noisy information such as menus, footers, unnecessary links, and logos. Most users are interested only in the main content. Because a web page mixes noisy and main content, the noise degrades the performance of applications such as web summarization, question answering systems, and information retrieval. We therefore propose an extraction process for identifying the main content of web pages. The extraction process consists of automatic extraction techniques and hand-crafted rules. In the automatic extraction techniques, the first step segments the web page into blocks, and the second step differentiates the main content from irrelevant or noisy content. The hand-crafted rule process extracts the main content from web pages using rules that have already been generated.
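
The two automatic steps named in the abstract (segment the page into blocks, then separate main content from noise) can be illustrated with a minimal sketch. This is not the method reviewed in the paper: the choice of block-level tags, the link-density heuristic, the thresholds `min_words` and `max_link_density`, and the names `BlockSegmenter`, `segment_blocks`, and `extract_main_content` are all assumptions made for illustration. It uses only Python's standard `html.parser` module.

```python
from html.parser import HTMLParser

# Assumption: these tags are treated as block boundaries for segmentation.
BLOCK_TAGS = {"div", "p", "td", "li", "nav", "header", "footer",
              "article", "section"}


class BlockSegmenter(HTMLParser):
    """Step 1: split a page into text blocks at block-level tag boundaries."""

    def __init__(self):
        super().__init__()
        self.blocks = []      # finished blocks as (text, link_density)
        self._words = []      # words collected for the current block
        self._link_words = 0  # how many of those words sit inside <a> tags
        self._in_link = 0     # nesting depth of open <a> tags

    def flush(self):
        """Close the current block and start a new one."""
        if self._words:
            density = self._link_words / len(self._words)
            self.blocks.append((" ".join(self._words), density))
        self._words, self._link_words = [], 0

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.flush()
        if tag == "a":
            self._in_link += 1

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            self._in_link -= 1
        if tag in BLOCK_TAGS:
            self.flush()

    def handle_data(self, data):
        words = data.split()
        self._words.extend(words)
        if self._in_link:
            self._link_words += len(words)


def segment_blocks(html):
    """Return a list of (text, link_density) blocks for an HTML string."""
    parser = BlockSegmenter()
    parser.feed(html)
    parser.close()   # emits any buffered trailing text
    parser.flush()   # closes the last open block
    return parser.blocks


def extract_main_content(html, min_words=10, max_link_density=0.3):
    """Step 2: keep blocks that are long enough and not dominated by links.

    Both thresholds are illustrative guesses, not values from the paper.
    """
    return [text for text, density in segment_blocks(html)
            if len(text.split()) >= min_words and density <= max_link_density]
```

Calling `extract_main_content(page_html)` on a page string returns the text of the blocks that pass both thresholds. Short, link-heavy blocks such as menus and footers tend to be filtered out, while long paragraphs with few links are kept. A hand-crafted rule approach would instead hard-code which elements to keep for a known site layout.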


Keywords: DOM Tree, Content extraction, Web mining, Machine learning method, Web page Segmentation


Edition: Volume 4 Issue 4, April 2015


Pages: 2630 - 2634







Madhura R. Kaddu, Dr. R. B. Kulkarni, "A Review on Identifying the Main Content From Web Pages", International Journal of Science and Research (IJSR), Volume 4 Issue 4, April 2015, pp. 2630-2634, https://www.ijsr.net/getabstract.php?paperid=SUB153719, DOI: https://www.doi.org/10.21275/SUB153719



Similar Articles

Downloads: 2 | Weekly Hits: ⮙1 | Monthly Hits: ⮙1

Research Paper, Computer Science & Engineering, India, Volume 12 Issue 6, June 2023

Pages: 1168 - 1174

A Machine Learning Approach for the Diagnosis of Chronic Kidney Disease

Divya Pogaku, Sneha Bohra


Downloads: 4

Comparative Studies, Computer Science & Engineering, India, Volume 10 Issue 6, June 2021

Pages: 1560 - 1562

A Comparative Study on Different Training Model in Machine Learning

Priyanka S Jigalur, Dr. B. G. Prasad


Downloads: 4 | Weekly Hits: ⮙2 | Monthly Hits: ⮙4

Research Paper, Computer Science & Engineering, India, Volume 13 Issue 8, August 2024

Pages: 779 - 789

Machine Learning-Based Detection of Synonymous IP Flood Attacks on Server Infrastructure

Surbhi Batra, Chandra Sekhar Dash


Downloads: 112

Review Papers, Computer Science & Engineering, India, Volume 4 Issue 4, April 2015

Pages: 1060 - 1064

Review on Cost Estimation Prediction Using ANN

Anshul, Nitin Jain


Downloads: 114

Research Paper, Computer Science & Engineering, India, Volume 3 Issue 12, December 2014

Pages: 1141 - 1146

Traffic Allocation Technique in Computer Networks

Malgireddy Saidi Reddy



