International Journal of Science and Research (IJSR)


ISSN: 2319-7064



Research Paper | Computer Science & Engineering | India | Volume 6 Issue 3, March 2017


     

Data Balancing Scheme for Multi-node Heterogeneous Hadoop Cluster

Indresh B. Rajwade, Er. Prateek Singh


Abstract: Big data encompasses huge amounts of information from multiple internal and external sources such as transactions, social media, enterprise content, sensors and mobile devices. It is characterized by volume, velocity, variety and veracity. MapReduce is a parallel computing framework that meets the demands of large-scale data processing. Owing to its simplicity, robustness and scalability, MapReduce is widely used by companies such as Amazon, Facebook and Yahoo! to process large volumes of data on a daily basis. The MapReduce framework hides much of the complexity of running distributed data processing functions across multiple nodes in a cluster: it automatically gathers the results from the nodes and returns a single result or a set of results. Hadoop is an open source implementation of MapReduce that balances the load in a cluster by distributing data to multiple nodes based on disk space availability and processing efficiency. In this work, the data placement mechanism in a heterogeneous Hadoop cluster is evaluated using the Grep tool and the WordCount program, two MapReduce applications that run on Hadoop clusters. Grep and WordCount are compared on a three-node Hadoop cluster running Ubuntu 14.04 LTS, and it is observed that the computing ratios of a Hadoop cluster are application-dependent and size-independent. This means that if the configuration of a cluster is updated, the computing ratios must be determined again.
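
As a rough illustration of the map, shuffle and reduce phases the abstract refers to, the following single-process Python sketch runs a WordCount-style job and a Grep-style job over a few in-memory lines. It is not the authors' code and does not use Hadoop; the names `run_mapreduce`, `wordcount_mapper`, `grep_mapper` and `sum_reducer`, as well as the sample input, are hypothetical and chosen only for this example.

```python
import re
from collections import defaultdict


def run_mapreduce(records, mapper, reducer):
    """Run map, shuffle and reduce in one process (illustration only)."""
    groups = defaultdict(list)
    # Map phase: each record yields zero or more (key, value) pairs.
    for record in records:
        for key, value in mapper(record):
            # Shuffle: collect all values that share a key, which a real
            # framework would do across the nodes of the cluster.
            groups[key].append(value)
    # Reduce phase: collapse the values of each key into a single result.
    return {key: reducer(key, values) for key, values in groups.items()}


def wordcount_mapper(line):
    """Emit (word, 1) for every whitespace-separated word."""
    for word in line.lower().split():
        yield word, 1


def grep_mapper(pattern):
    """Build a mapper that emits (match, 1) for every regex match."""
    regex = re.compile(pattern)

    def mapper(line):
        for match in regex.findall(line):
            yield match, 1

    return mapper


def sum_reducer(key, values):
    """Add up the counts emitted for one key."""
    return sum(values)


# Hypothetical sample input standing in for lines of an HDFS file.
lines = [
    "big data on hadoop",
    "hadoop runs mapreduce",
    "mapreduce on big clusters",
]

word_counts = run_mapreduce(lines, wordcount_mapper, sum_reducer)
grep_counts = run_mapreduce(lines, grep_mapper(r"hadoop|mapreduce"), sum_reducer)
```

Both jobs share the same reducer and differ only in the mapper, which is one plausible reason why two applications can stress a node differently and so yield different computing ratios on the same cluster.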


Keywords: Big Data, Hadoop, MapReduce, HDFS, Grep, WordCount, Heterogeneous cluster


Edition: Volume 6 Issue 3, March 2017


Pages: 2088 - 2094







Indresh B. Rajwade, Er. Prateek Singh, "Data Balancing Scheme for Multi-node Heterogeneous Hadoop Cluster", International Journal of Science and Research (IJSR), Volume 6 Issue 3, March 2017, pp. 2088-2094, https://www.ijsr.net/getabstract.php?paperid=25031702, DOI: https://www.doi.org/10.21275/25031702


