Explain how the Map Reduce frameworks like Hadoop exploit locality to achieve scalable parallelism.

Answer 1

MapReduce is a general framework for writing applications that process large amounts of data in parallel. It is a distributed programming model. Hadoop's implementation is written in Java, but Hadoop can also run MapReduce programs written in other languages such as Ruby, Python and C++ (for example through Hadoop Streaming or Hadoop Pipes).

The MapReduce model includes two data processing primitives, called mappers and reducers. Its main advantage is that an application written in terms of these primitives can have its processing spread over many computing nodes.
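As a rough illustration of the two primitives, here is a word count written in plain Python. This is only a sketch of the model, not Hadoop's API: the function names (map_words, shuffle, reduce_counts, word_count) are made up for this example, and everything runs in one process, whereas a real framework would run the map and reduce calls on different nodes.

```python
from collections import defaultdict


def map_words(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(word, counts):
    """Reducer: collapse all values for one key into a single pair."""
    return (word, sum(counts))


def word_count(lines):
    """Run map, shuffle and reduce in sequence over a list of lines."""
    mapped = [pair for line in lines for pair in map_words(line)]
    grouped = shuffle(mapped)
    return dict(reduce_counts(word, counts) for word, counts in grouped.items())
```

Each call to map_words depends only on its own line, and each call to reduce_counts depends only on its own key, which is why both phases can be split across machines.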

How does MapReduce work?
1. It divides the work into two tasks:
a) Map function: it takes a set of input data and converts it into another set of data in which the individual elements are broken down into tuples (key/value pairs).
b) Reduce function: this task always runs after the map task has completed. It takes the map output as its input and combines those tuples into a smaller set of tuples.
2. The execution of a MapReduce job is controlled by:
a) A JobTracker, also known as the master.
b) Multiple TaskTrackers, also known as slaves.
3. A single job is divided into multiple tasks that run on multiple compute nodes. The JobTracker schedules these tasks across the nodes and coordinates them. Where possible, it assigns each map task to a TaskTracker on a node that already stores that task's input block in HDFS, so the computation moves to the data instead of the data moving across the network. This locality is what lets the job keep scaling as nodes are added. The TaskTrackers execute their tasks and report progress to the JobTracker, which tracks the overall progress of the job. If a task fails, the JobTracker can reschedule a new task on a different TaskTracker.
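To connect the scheduling step above to the locality the question asks about: a Hadoop scheduler tries to run each map task on a node that already holds a replica of the task's input block, and falls back to another node only when no such node has a free slot. The sketch below is a simplified, hypothetical model of that preference, not Hadoop's actual scheduler code. The function name assign_map_tasks and its data structures are assumptions for illustration, and real Hadoop also considers rack locality and assigns work in response to TaskTracker heartbeats.

```python
def assign_map_tasks(block_locations, free_slots):
    """Assign each input block to a node, preferring nodes that store it.

    block_locations: dict mapping block id -> list of nodes holding a replica
    free_slots: dict mapping node -> number of free map slots
    Returns a dict mapping block id -> (node, is_local).
    """
    slots = dict(free_slots)  # copy so the caller's dict is not modified
    assignment = {}
    for block, replicas in block_locations.items():
        # First choice: a node that already stores the block (data-local task).
        local = [node for node in replicas if slots.get(node, 0) > 0]
        if local:
            node, is_local = local[0], True
        else:
            # Fallback: any node with a free slot; the block must be read
            # over the network, which is what locality tries to avoid.
            remote = [node for node, free in slots.items() if free > 0]
            if not remote:
                raise RuntimeError("no free map slots left")
            node, is_local = remote[0], False
        slots[node] -= 1
        assignment[block] = (node, is_local)
    return assignment
```

For example, with blocks {"b1": ["n1", "n2"], "b2": ["n1"]} and one free slot each on n1 and n2, block b1 runs locally on n1 and block b2 has to run remotely on n2. The more tasks that end up data-local, the less network traffic the job generates as the cluster grows.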
