15 Facts About MapReduce

1.

MapReduce is a programming model and an associated implementation for processing and generating big data sets with a parallel, distributed algorithm on a cluster.

2.

A MapReduce program is composed of a map procedure, which performs filtering and sorting, and a reduce procedure, which performs a summary operation.
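As a minimal sketch of that split, the classic word-count job can be written in plain Python. The function names (`map_words`, `reduce_counts`, `run_word_count`) and the sample documents are illustrative assumptions, and the in-memory grouping only stands in for what a real framework does across machines.

```python
from collections import defaultdict


def map_words(doc_id, text):
    # Map step: emit a (word, 1) pair for every word in the document.
    for word in text.lower().split():
        yield word, 1


def reduce_counts(word, counts):
    # Reduce step: summarize all values collected for one key.
    return word, sum(counts)


def run_word_count(documents):
    # Group intermediate pairs by key; a real framework does this in its
    # shuffle phase, here a dictionary is enough (assumption for the sketch).
    grouped = defaultdict(list)
    for doc_id, text in documents.items():
        for word, one in map_words(doc_id, text):
            grouped[word].append(one)
    return dict(reduce_counts(w, c) for w, c in sorted(grouped.items()))


# Hypothetical sample input.
documents = {"d1": "the quick fox", "d2": "the lazy dog"}
word_counts = run_word_count(documents)
print(word_counts)
```

The map function never sees more than one document and the reduce function never sees more than one key, which is what lets a framework run many copies of each side by side.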

3.

The key contributions of the MapReduce framework are not the actual map and reduce functions, but the scalability and fault-tolerance achieved for a variety of applications by optimizing the execution engine.

4.

MapReduce libraries have been written in many programming languages, with different levels of optimization.

5.

The name MapReduce originally referred to the proprietary Google technology, but has since been genericized.

6.

MapReduce is a framework for processing parallelizable problems across large datasets using a large number of computers, collectively referred to as a cluster or a grid.

7.

MapReduce can take advantage of the locality of data, processing it near the place it is stored in order to minimize communication overhead.
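One way to picture data locality is a scheduler that first tries to place each map task on a node that already stores the input block. The sketch below is a simplified, greedy illustration and not the scheduler of any particular implementation; `assign_map_tasks`, the block names, and the node names are all hypothetical.

```python
def assign_map_tasks(block_locations, free_slots):
    # block_locations: block id -> nodes holding a replica of that block.
    # free_slots: node -> number of map tasks it can still accept.
    slots = dict(free_slots)
    assignment = {}
    for block, replicas in block_locations.items():
        # Prefer a node that already stores the block (no network transfer).
        local = [node for node in replicas if slots.get(node, 0) > 0]
        if local:
            node = local[0]
        else:
            # Fall back to any node with capacity; the block must be copied.
            remote = [n for n, free in slots.items() if free > 0]
            if not remote:
                raise RuntimeError("no free slot for block " + block)
            node = remote[0]
        slots[node] -= 1
        assignment[block] = (node, node in replicas)
    return assignment


# Hypothetical cluster state.
block_locations = {"b1": ["n1", "n2"], "b2": ["n1"], "b3": ["n3"]}
free_slots = {"n1": 2, "n2": 1, "n3": 0}
assignment = assign_map_tasks(block_locations, free_slots)
print(assignment)
```

Here `b1` and `b2` run where their data lives, while `b3` has to run remotely because its only host has no free slot.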

8.

MapReduce allows for the distributed processing of the map and reduction operations.
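A rough single-machine analogy, assuming a thread pool in place of cluster nodes: map tasks run independently on slices of the input, and reduce tasks run independently on disjoint sets of keys. The helper names and the word-length example are assumptions made for this sketch.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_chunk(chunk):
    # One map task: emit (word length, 1) for each word in its own slice.
    return [(len(word), 1) for word in chunk]


def reduce_partition(partition):
    # One reduce task: sum the values of every key assigned to it.
    return {key: sum(values) for key, values in partition.items()}


def run_parallel(words, num_map_tasks=3, num_reduce_tasks=2):
    chunks = [words[i::num_map_tasks] for i in range(num_map_tasks)]
    with ThreadPoolExecutor() as pool:
        # Map tasks share no state, so they can run concurrently.
        mapped = list(pool.map(map_chunk, chunks))
        # Route each key to exactly one reduce partition.
        partitions = [defaultdict(list) for _ in range(num_reduce_tasks)]
        for pairs in mapped:
            for key, value in pairs:
                partitions[key % num_reduce_tasks][key].append(value)
        # Reduce tasks own disjoint keys, so they can run concurrently too.
        reduced = list(pool.map(reduce_partition, partitions))
    result = {}
    for part in reduced:
        result.update(part)
    return result


length_histogram = run_parallel(["a", "bb", "cc", "ddd", "e", "ff"])
print(length_histogram)
```

The result does not depend on how many map or reduce tasks are used, only on the map and reduce functions themselves.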

9.

The Map and Reduce functions of MapReduce are both defined with respect to data structured in (key, value) pairs.
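The pair-based shape of the two functions can be written down as type aliases: Map takes one (k1, v1) pair and yields (k2, v2) pairs, and Reduce takes a key k2 with all of its v2 values and yields results. The aliases `Mapper` and `Reducer` and the temperature example are illustrative assumptions, not part of any library.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

K1 = TypeVar("K1")
V1 = TypeVar("V1")
K2 = TypeVar("K2")
V2 = TypeVar("V2")
V3 = TypeVar("V3")

# Map: (k1, v1) -> list of (k2, v2)
Mapper = Callable[[K1, V1], Iterable[Tuple[K2, V2]]]
# Reduce: (k2, list of v2) -> list of v3
Reducer = Callable[[K2, List[V2]], Iterable[V3]]


def map_reading(line_no: int, line: str) -> Iterable[Tuple[str, float]]:
    # Input pair: (line number, "city,temperature").
    city, temp = line.split(",")
    yield city, float(temp)


def reduce_max(city: str, temps: List[float]) -> Iterable[float]:
    # All values for one intermediate key arrive together.
    yield max(temps)


# Hypothetical sample input.
lines = ["oslo,3.5", "rome,18.0", "oslo,7.0"]
pairs = [pair for i, line in enumerate(lines) for pair in map_reading(i, line)]
oslo_max = list(reduce_max("oslo", [t for c, t in pairs if c == "oslo"]))
print(pairs, oslo_max)
```

Note that the input key type (a line number) and the intermediate key type (a city name) are unrelated; only the intermediate keys decide how values are grouped for Reduce.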

10.

The frozen spot of the MapReduce framework, the part that applications do not customize, is a large distributed sort.
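The sort sits between the map and reduce steps: intermediate pairs are routed to partitions, each partition is sorted by key, and equal keys are then grouped for the reducer. The following is a small in-memory sketch of that idea; the partitioning rule (first character of the key) is a hypothetical choice made only to keep the example deterministic.

```python
from itertools import groupby
from operator import itemgetter


def shuffle_and_sort(pairs, num_partitions):
    partitions = [[] for _ in range(num_partitions)]
    for key, value in pairs:
        # Hypothetical partitioner: route by the key's first character.
        partitions[ord(key[0]) % num_partitions].append((key, value))
    grouped = []
    for part in partitions:
        # Sorting brings equal keys next to each other...
        part.sort(key=itemgetter(0))
        # ...so one pass can collect each key's values for the reducer.
        grouped.append(
            [(key, [v for _, v in group])
             for key, group in groupby(part, key=itemgetter(0))]
        )
    return grouped


pairs = [("b", 1), ("a", 2), ("b", 3), ("c", 4), ("a", 5)]
shuffled = shuffle_and_sort(pairs, 2)
print(shuffled)
```

Each inner list is what one reduce task would receive: its keys in sorted order, each with the full list of values emitted for it.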

11.

Communication cost often dominates the computation cost, and many MapReduce implementations are designed to write all communication to distributed storage for crash recovery.

12.

MapReduce achieves reliability by parceling out a number of operations on the set of data to each node in the network.
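Because work is handed out as small, independent tasks, a task whose node is unavailable can simply be given to another node. The sketch below simulates that idea in a single process; `run_tasks`, the node names, and the liveness check are hypothetical, and real implementations typically detect failures through periodic status reports rather than a function call.

```python
def run_tasks(tasks, workers, is_alive, work):
    results = {}
    log = []  # every (task, worker) assignment that was attempted
    for index, task in enumerate(tasks):
        # Try workers round-robin, starting at a different one per task.
        for offset in range(len(workers)):
            worker = workers[(index + offset) % len(workers)]
            log.append((task, worker))
            if is_alive(worker):
                results[task] = work(task)
                break
            # Worker is down: fall through and reassign the same task.
        else:
            raise RuntimeError("no live worker for task %r" % (task,))
    return results, log


# Hypothetical cluster in which node-b has failed.
workers = ["node-a", "node-b", "node-c"]
results, log = run_tasks(
    [1, 2, 3, 4],
    workers,
    is_alive=lambda worker: worker != "node-b",
    work=lambda x: x * x,
)
print(results)
print(log)
```

Task `2` is first offered to the failed `node-b` and then reassigned to `node-c`; the final results are the same as if no node had failed.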

13.

MapReduce is useful in a wide range of applications, including distributed pattern-based searching, distributed sorting, web link-graph reversal, singular value decomposition, web access log statistics, inverted index construction, document clustering, machine learning, and statistical machine translation.
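Taking one item from that list, inverted index construction fits the model directly: the map step emits a (word, document id) pair for each distinct word, and the reduce step collects the document ids for each word. This is a minimal in-memory sketch; the function names and the two sample documents are assumptions.

```python
from collections import defaultdict


def map_postings(doc_id, text):
    # Emit each distinct word once per document, tagged with the document id.
    for word in set(text.lower().split()):
        yield word, doc_id


def reduce_postings(word, doc_ids):
    # Collect the sorted list of documents that contain the word.
    return word, sorted(doc_ids)


def build_inverted_index(documents):
    grouped = defaultdict(list)
    for doc_id, text in documents.items():
        for word, posting in map_postings(doc_id, text):
            grouped[word].append(posting)
    return dict(reduce_postings(w, ids) for w, ids in grouped.items())


# Hypothetical sample input.
inverted_index = build_inverted_index(
    {"d1": "map reduce map", "d2": "reduce sort"}
)
print(inverted_index)
```

Web link-graph reversal has the same shape: emit (target, source) for every link, then collect the sources for each target.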

14.

At Google, MapReduce was used to completely regenerate Google's index of the World Wide Web.

15.

Jorgensen asserts that DeWitt and Stonebraker's entire analysis of MapReduce is groundless, as MapReduce was never designed or intended to be used as a database.
