Map-Reduce
Map-Reduce is a data processing paradigm for condensing large volumes of data into aggregated results.
Note
The map-reduce operation is deprecated. The aggregation framework provides better performance and usability than map-reduce operations, and should be preferred for new development.
A map-reduce operation is issued on a collection view, as obtained from
Collection#find
method, by calling the map_reduce
method on the
view. The map_reduce
method takes three arguments: the mapper, the
reducer and map-reduce options. The mapper and the reducer must be provided
as strings containing JavaScript functions.
For example, given the following collection with values 1 through 10:
coll = client['foo'] 10.times do |i| coll.insert_one(v: i) end
The following invocation will sum up the values less than 6:
coll.find(v: {'$lt' => 6}).map_reduce( 'function() { emit(null, this.v) }', 'function(key, values) { return Array.sum(values) }', ).first['value'] # => 15.0
The map_reduce
method returns an instance of
Mongo::Collection::View::MapReduce
- a map-reduce view which holds
the parameters to be used for the operation. To execute the operation, either
iterate the results (by using e.g. each
, first
or to_a
on the
view object) or invoke the execute
method. The execute
method issues
the map-reduce operation but does not return the result set from the server,
and is primarily useful for when the output of the operation is directed to
a collection as follows:
coll.find(...).map_reduce(...).out('destination_collection').execute
Note that:
If the results of map-reduce are not directed to a collection, they are said to be retrieved inline. In this case the entire result set must fit in the 16 MiB BSON document size limit.
If the results of map-reduce are directed to a collection, and the map-reduce view is iterated, the driver automatically retrieves the entire collection and returns its contents as the result set. The collection is retrieved without sorting. If map-reduce is performed into a collection that is not empty, the driver will return the documents as they exist in the collection after the map-reduce operation completes, which may include the documents that were in the collection prior to the map-reduce operation.
coll.find(...).map_reduce(...).out('destination_collection').each do |doc| # ... end coll.find(...).map_reduce(...).out(replace: 'destination_collection', db: 'db_name').each do |doc| # ... end
Given a map-reduce view, it can be configured using the following methods:
Method | Description |
---|---|
js_mode | Sets the jsMode flag for the operation. |
out | Directs the output to the specified collection, instead of returning
the result set. |
scope | Sets the scope for the operation. |
verbose | Sets whether to include the timing information in the result. |
The following accessor methods are defined on the view object:
Method | Description |
---|---|
js_mode | Returns the current jsMode flag value. |
map_function | Returns the map function as a string. |
out | Returns the current output location for the operation. |
reduce_function | Returns the reduce function as a string. |
scope | Returns the current scope for the operation. |
verbose | Returns whether to include the timing information in the result. |