Survey Paper | Computer Science & Engineering | India | Volume 3 Issue 10, October 2014

Survey Paper on Big Data Processing and Hadoop Components

Poonam S. Patil, Rajesh N. Phursule

Abstract: As big data continues to grow, a major challenge is how to handle the resulting explosion of data and how to analyze it. For such data-intensive applications, the Apache Hadoop framework has recently attracted a lot of attention. The framework adopts MapReduce, a programming model and an associated implementation for processing and generating large data sets. Hadoop provides a distributed file system, job scheduling, resource management capabilities, and a Java API for writing applications, e.g. Java MapReduce, Streaming MapReduce, Crunch, Pig Latin, Hive, Oozie, etc. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. Many real-world tasks are expressible in this model, and Hadoop gives the flexibility to write algorithms in any language. In this paper we briefly introduce the MapReduce framework based on Hadoop and the current state of the art in MapReduce algorithms for big data analysis.
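The map and reduce roles described in the abstract can be illustrated with a minimal, single-process sketch in Python. This is not Hadoop code and does not use the Hadoop API. The names `map_words`, `reduce_counts`, and `run_mapreduce`, and the two sample documents, are hypothetical and chosen only for illustration. A real Hadoop job would distribute the same three phases (map, group by key, reduce) across a cluster.

```python
from collections import defaultdict


def map_words(doc_id, text):
    """Map: take one (key, value) input pair and emit intermediate (word, 1) pairs."""
    for word in text.lower().split():
        yield word, 1


def reduce_counts(word, counts):
    """Reduce: merge all intermediate values that share the same intermediate key."""
    return word, sum(counts)


def run_mapreduce(records, mapper, reducer):
    """Hypothetical in-memory driver: map, group by intermediate key, then reduce."""
    groups = defaultdict(list)
    for key, value in records:
        for inter_key, inter_value in mapper(key, value):
            groups[inter_key].append(inter_value)  # stands in for the shuffle step
    return dict(reducer(k, vs) for k, vs in sorted(groups.items()))


# Sample input, assumed for illustration only.
documents = [
    ("d1", "big data needs big tools"),
    ("d2", "hadoop processes big data"),
]

word_counts = run_mapreduce(documents, map_words, reduce_counts)
print(word_counts)
```

Word counting is used here only because it is the smallest task that exercises both functions. Any computation that can be phrased as "emit keyed values, then merge per key" fits the same driver.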

Keywords: Big data, Hadoop, MapReduce, Hive, HBase, Distributed Data, Relational Database, NoSQL

Edition: Volume 3 Issue 10, October 2014

Pages: 585 - 590
