15 Facts About MapReduce

1.

MapReduce is a programming model and an associated implementation for processing and generating big data sets with a parallel, distributed algorithm on a cluster.

2.

A MapReduce program is composed of a map procedure, which performs filtering and sorting, and a reduce procedure, which performs a summary operation.
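
As a rough illustration of the two procedures, here is a minimal single-machine word count sketch. The function names (map_words, reduce_counts, run_word_count) are made up for this example and do not come from any particular MapReduce library; a real framework would run the same two steps across many machines.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_words(document: str) -> Iterator[tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: Iterable[int]) -> tuple[str, int]:
    """Reduce step: summarize all counts emitted for one word."""
    return word, sum(counts)


def run_word_count(documents: list[str]) -> dict[str, int]:
    # Group intermediate pairs by key. In this sketch the grouping happens
    # in memory; a real framework does it across machines.
    grouped: dict[str, list[int]] = defaultdict(list)
    for document in documents:
        for word, count in map_words(document):
            grouped[word].append(count)
    # Visit keys in sorted order and apply the summary operation to each.
    return dict(reduce_counts(word, grouped[word]) for word in sorted(grouped))


if __name__ == "__main__":
    print(run_word_count(["the quick fox", "the lazy dog", "the end"]))
```

The map step only looks at one document at a time and the reduce step only looks at one word at a time, which is what lets both steps be spread over many workers.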

3.

The key contributions of the MapReduce framework are not the actual map and reduce functions, but the scalability and fault tolerance achieved for a variety of applications by optimizing the execution engine.

4.

MapReduce libraries have been written in many programming languages, with different levels of optimization.

5.

The name MapReduce originally referred to the proprietary Google technology, but has since been genericized.

6.

MapReduce is a framework for processing parallelizable problems across large datasets using a large number of computers, collectively referred to as a cluster or a grid.

7.

MapReduce can take advantage of the locality of data, processing it near the place it is stored in order to minimize communication overhead.
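
To make the locality idea concrete, here is a small sketch of how a scheduler might prefer a worker that already stores the input block. The function pick_worker and the node names are hypothetical, and real schedulers weigh more factors (for example, network distance to the nearest replica).

```python
def pick_worker(replica_hosts: set[str], idle_workers: list[str]) -> tuple[str, bool]:
    """Choose a worker for one map task.

    replica_hosts: machines that store a copy of the task's input block.
    idle_workers:  machines currently free to run a task.
    Returns the chosen worker and whether the read will be local.
    """
    if not idle_workers:
        raise ValueError("no idle workers available")
    # Prefer a worker that already holds the data, so no network copy is needed.
    for worker in idle_workers:
        if worker in replica_hosts:
            return worker, True
    # Otherwise fall back to any idle worker and accept a remote read.
    return idle_workers[0], False


if __name__ == "__main__":
    print(pick_worker({"node-b", "node-c"}, ["node-a", "node-b"]))
    print(pick_worker({"node-c"}, ["node-a", "node-b"]))
```

When the first call finds node-b among the replica hosts, the task reads its input from local disk; the second call has no such option and falls back to a remote read.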

8.

MapReduce allows for the distributed processing of the map and reduction operations.

9.

The Map and Reduce functions of MapReduce are both defined with respect to data structured in (key, value) pairs.
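
The pair structure can be written down as type signatures. The sketch below follows the commonly cited shape, Map(k1, v1) -> list of (k2, v2) and Reduce(k2, list of v2) -> list of results; the driver map_reduce, the helpers parse_reading and hottest, and the temperature readings are all invented for illustration.

```python
from collections import defaultdict
from typing import Callable, Iterable, TypeVar

K1 = TypeVar("K1")
V1 = TypeVar("V1")
K2 = TypeVar("K2")
V2 = TypeVar("V2")
V3 = TypeVar("V3")


def map_reduce(
    records: Iterable[tuple[K1, V1]],
    mapper: Callable[[K1, V1], Iterable[tuple[K2, V2]]],
    reducer: Callable[[K2, list[V2]], Iterable[V3]],
) -> list[V3]:
    """Apply mapper to every input pair, group by output key, then reduce."""
    groups: dict[K2, list[V2]] = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)
    results: list[V3] = []
    for out_key in sorted(groups):
        results.extend(reducer(out_key, groups[out_key]))
    return results


def parse_reading(line_number: int, line: str) -> list[tuple[str, float]]:
    # Input pair: (line number, "city, temperature"). Output pair: (city, temperature).
    city, temperature = line.split(",")
    return [(city.strip(), float(temperature))]


def hottest(city: str, temperatures: list[float]) -> list[tuple[str, float]]:
    # One output pair per city: its maximum reading.
    return [(city, max(temperatures))]


if __name__ == "__main__":
    readings = [(1, "Oslo, 4.5"), (2, "Cairo, 31.0"), (3, "Oslo, 7.0")]
    print(map_reduce(readings, parse_reading, hottest))
```

Note that the input key type (a line number here) and the intermediate key type (a city name) are different, which is typical: the map step chooses the key that the data should be grouped by.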

10.

The "frozen spot" of the MapReduce framework, meaning the part that applications do not customize, is a large distributed sort.
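
The sort in question sits between the map and reduce steps. A simplified single-process sketch is shown below, assuming a hash partitioner: each intermediate pair is routed to one of several reducer partitions, and each partition is sorted and grouped by key. The names partition_for and shuffle are hypothetical, and zlib.crc32 is used only because it gives the same result on every run (Python's built-in hash() for strings does not).

```python
import zlib
from itertools import groupby
from operator import itemgetter


def partition_for(key: str, num_reducers: int) -> int:
    """Route a key to a reducer; every pair with the same key lands together."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle(pairs: list[tuple[str, int]], num_reducers: int) -> list[list[tuple[str, list[int]]]]:
    """Partition intermediate pairs, then sort and group each partition by key."""
    partitions: list[list[tuple[str, int]]] = [[] for _ in range(num_reducers)]
    for key, value in pairs:
        partitions[partition_for(key, num_reducers)].append((key, value))

    shuffled = []
    for partition in partitions:
        # The sort is stable, so values for one key keep their arrival order.
        partition.sort(key=itemgetter(0))
        shuffled.append(
            [(key, [value for _, value in group])
             for key, group in groupby(partition, key=itemgetter(0))]
        )
    return shuffled


if __name__ == "__main__":
    for index, partition in enumerate(shuffle([("b", 1), ("a", 2), ("b", 3), ("c", 4)], 2)):
        print(index, partition)
```

Each inner list is what one reducer would receive: keys in sorted order, each with the full list of its values.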

11.

Communication cost often dominates the computation cost, and many MapReduce implementations are designed to write all communication to distributed storage for crash recovery.

12.

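MapReduce achieves reliability by parceling out a number of operations on the set of data to each node in the network. Each node is expected to report back periodically, and the work of a node that falls silent is reassigned to other nodes.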
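
One way such reliability might be implemented is sketched below, assuming workers send periodic heartbeats to a master that reassigns the tasks of any worker it has not heard from in time. The Master class, its method names, and the timeout value are all hypothetical; time is passed in explicitly so the example stays deterministic.

```python
class Master:
    """Tracks which worker holds which tasks and reclaims work from silent workers."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.last_seen: dict[str, float] = {}       # worker -> time of last heartbeat
        self.assigned: dict[str, list[str]] = {}    # worker -> tasks it is running
        self.pending: list[str] = []                # tasks waiting to be (re)assigned

    def heartbeat(self, worker: str, now: float) -> None:
        self.last_seen[worker] = now
        self.assigned.setdefault(worker, [])

    def assign(self, task: str, worker: str) -> None:
        self.assigned.setdefault(worker, []).append(task)

    def reap_dead_workers(self, now: float) -> list[str]:
        """Mark silent workers as dead and put their tasks back in the queue."""
        dead = [w for w, seen in self.last_seen.items() if now - seen > self.timeout]
        for worker in dead:
            self.pending.extend(self.assigned.pop(worker, []))
            del self.last_seen[worker]
        return dead


if __name__ == "__main__":
    master = Master(timeout=10.0)
    master.heartbeat("w1", now=0.0)
    master.heartbeat("w2", now=0.0)
    master.assign("map-0", "w1")
    master.assign("map-1", "w2")
    master.heartbeat("w2", now=8.0)            # w1 stays silent
    print(master.reap_dead_workers(now=12.0))  # w1 has been silent for 12 > 10
    print(master.pending)
```

Because a map or reduce task depends only on its input, rerunning a lost task on another node gives the same result, which is what makes this simple recovery scheme workable.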

13.

MapReduce is useful in a wide range of applications, including distributed pattern-based searching, distributed sorting, web link-graph reversal, singular value decomposition, web access log statistics, inverted index construction, document clustering, machine learning, and statistical machine translation.
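
As one example from this list, web link-graph reversal fits the model naturally: the map step emits a (target, source) pair for every link, and the reduce step collects all sources that point to a target. The sketch below runs in a single process; the function names and the example.com-style page names are made up.

```python
from collections import defaultdict
from typing import Iterator


def map_links(source: str, targets: list[str]) -> Iterator[tuple[str, str]]:
    """Map step: for each outgoing link, emit (target, source)."""
    for target in targets:
        yield target, source


def reduce_sources(target: str, sources: list[str]) -> tuple[str, list[str]]:
    """Reduce step: list every distinct page that links to the target."""
    return target, sorted(set(sources))


def reverse_link_graph(pages: dict[str, list[str]]) -> dict[str, list[str]]:
    grouped: dict[str, list[str]] = defaultdict(list)
    for source, targets in pages.items():
        for target, linking_page in map_links(source, targets):
            grouped[target].append(linking_page)
    return dict(reduce_sources(target, grouped[target]) for target in sorted(grouped))


if __name__ == "__main__":
    pages = {
        "a.example": ["b.example", "c.example"],
        "b.example": ["c.example"],
    }
    print(reverse_link_graph(pages))
```

The input maps each page to the pages it links to; the output maps each page to the pages that link to it.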

14.

At Google, MapReduce was used to completely regenerate Google's index of the World Wide Web.

15.

In response to criticism of MapReduce by database researchers David DeWitt and Michael Stonebraker, Greg Jorgensen asserts that their entire analysis is groundless, as MapReduce was neither designed nor intended to be used as a database.
