
Backing Aggregation Pipeline

On this page

  • Example
  • Encoding
  • Adding Queries
  • Adding Calculated Fields
  • Adding Filters

To get the data needed to render a chart, Charts creates a MongoDB Aggregation Pipeline and runs the pipeline on the MongoDB database server. The pipeline consists of multiple stages, each of which is generated based on different settings specified by the chart's author.

This document explains how the various Chart Builder settings are used to construct the Aggregation Pipeline. You can view the pipeline used to create a chart by choosing the View Aggregation Pipeline option in the Chart Builder's ellipsis dropdown on the top right.

Atlas Charts constructs a pipeline consisting of the following segments, in the following order:

  1. Charts views

  2. Pre-filters in embedded charts

  3. Dashboard filters

  4. Chart query

  5. Lookup fields

  6. Calculated fields

  7. Embedding filters, including injected filters

  8. Chart filters

  9. Encoded fields

  10. Sorting

  11. Maximum document limit

Note

When creating a chart, you can configure some, but not all, of the preceding pipeline segments. When Charts generates the aggregation pipeline, it skips any unspecified segments.
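The assembly order above can be illustrated with a small sketch. The helper and segment contents below are hypothetical, not actual Charts internals: each configured segment contributes a list of stages, and unspecified segments contribute nothing.

```python
# Hypothetical sketch of how pipeline segments combine in order.
# Segment names and stage contents are illustrative only.

def build_pipeline(segments):
    """Concatenate stage lists in segment order, skipping unspecified ones."""
    pipeline = []
    for stages in segments:
        if stages:  # unspecified segments are skipped entirely
            pipeline.extend(stages)
    return pipeline

pipeline = build_pipeline([
    None,                                                 # dashboard filters (not configured)
    [{"$match": {"saleDate": {"$gte": "2017-01-01"}}}],   # chart query
    [{"$addFields": {"revenue": "..."}}],                 # calculated fields
    [{"$limit": 5000}],                                   # maximum document limit
])
# pipeline contains only the three configured segments, in order
```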

The following chart shows the total sale amounts from an office supply company, categorized by purchase method. Each document in the data collection represents a single sale.

Using this chart as an example, we will explore how the specifications for each of the above settings change the aggregation pipeline generated by Atlas Charts.

Aggregation pipeline example without any filters.

Without a Data Source pipeline, Query bar queries, calculated fields, or filters added in the Filter pane, Atlas Charts generates the following aggregation pipeline:

{
  "$addFields": { // Encoding
    "__alias_0": {
      "$sum": "$items.price"
    }
  }
},
{
  "$group": {
    "_id": {
      "__alias_1": "$purchaseMethod"
    },
    "__alias_0": {
      "$sum": "$__alias_0"
    }
  }
},
{
  "$project": {
    "_id": 0,
    "__alias_1": "$_id.__alias_1",
    "__alias_0": 1
  }
},
{
  "$project": {
    "x": "$__alias_1",
    "y": "$__alias_0",
    "_id": 0
  }
},
{
  "$addFields": { // Sorting
    "__agg_sum": {
      "$sum": [
        "$y"
      ]
    }
  }
},
{
  "$sort": {
    "__agg_sum": -1
  }
},
{
  "$project": {
    "__agg_sum": 0
  }
},
{
  "$limit": 5000
}

The pipeline at this point consists of the encoding and grouping stages generated from the Encode panel, the stages for the default sort order, and the maximum document limit, which Atlas Charts sets to 5000.
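To see what the encoding stages compute, here is a plain-Python simulation of the `$addFields` / `$group` / `$sort` / `$limit` sequence over a few hypothetical sale documents (the sample values are made up for illustration):

```python
from collections import defaultdict

# Hypothetical sample documents; each represents one sale.
sales = [
    {"purchaseMethod": "Online",   "items": [{"price": 10.0}, {"price": 5.0}]},
    {"purchaseMethod": "In store", "items": [{"price": 20.0}]},
    {"purchaseMethod": "Online",   "items": [{"price": 2.5}]},
]

# $addFields: __alias_0 holds the per-document sum of items.price
for doc in sales:
    doc["__alias_0"] = sum(item["price"] for item in doc["items"])

# $group: accumulate __alias_0 per purchaseMethod, then $project to {x, y}
totals = defaultdict(float)
for doc in sales:
    totals[doc["purchaseMethod"]] += doc["__alias_0"]

# $sort on the aggregated sum, descending, then the 5000-document $limit
points = sorted(
    ({"x": method, "y": total} for method, total in totals.items()),
    key=lambda point: point["y"],
    reverse=True,
)[:5000]
```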

The query below restricts the documents shown to those with a saleDate on or after January 1, 2017, and at least 5 elements in the items array. items is an array in which each element represents an item purchased during a sale.

Query:

{
  $and: [
    {
      saleDate: { $gte: new Date("2017-01-01") }
    },
    {
      'items.4': { $exists: true }
    }
  ]
}

Applying the above query in the Query bar generates the following chart and aggregation pipeline:

Aggregation pipeline example with query.

Aggregation Pipeline:

{
  "$match": { // Query
    "$and": [
      {
        "saleDate": {
          "$gte": {
            "$date": "2017-01-01T00:00:00Z"
          }
        }
      },
      {
        "items.4": {
          "$exists": true
        }
      }
    ]
  }
},
{
  "$addFields": {
    "__alias_0": {
      "$sum": "$items.price"
    }
  }
},
{
  "$group": {
    "_id": {
      "__alias_1": "$purchaseMethod"
    },
    "__alias_0": {
      "$sum": "$__alias_0"
    }
  }
},
{
  "$project": {
    "_id": 0,
    "__alias_1": "$_id.__alias_1",
    "__alias_0": 1
  }
},
{
  "$project": {
    "x": "$__alias_1",
    "y": "$__alias_0",
    "_id": 0
  }
},
{
  "$addFields": {
    "__agg_sum": {
      "$sum": [
        "$y"
      ]
    }
  }
},
{
  "$sort": {
    "__agg_sum": -1
  }
},
{
  "$project": {
    "__agg_sum": 0
  }
},
{
  "$limit": 5000
}

The aggregation pipeline now starts with a $match stage for the applied query, followed by the encoding and grouping stages from the Encode panel, the sort stages, and the maximum document limit.

We can also change the chart to show the total revenue generated, categorized by purchase method. To accomplish this, we create a calculated field that computes the total revenue of each sale by multiplying each item's price by its quantity. Adding this calculated field, in addition to the query above, produces the following chart and pipeline:

Calculated Field Expression:

{
  $reduce: {
    input: "$items",
    initialValue: 0,
    in: {
      $sum: [
        "$$value",
        { $multiply: ["$$this.price", "$$this.quantity"] }
      ]
    }
  }
}

Aggregation pipeline example with calculated field.

Aggregation Pipeline:

{
  "$match": {
    "$and": [
      {
        "saleDate": {
          "$gte": {
            "$date": "2017-01-01T00:00:00Z"
          }
        }
      },
      {
        "items.4": {
          "$exists": true
        }
      }
    ]
  }
},
{
  "$addFields": { // Calculated Field
    "revenue": {
      "$reduce": {
        "input": "$items",
        "initialValue": 0,
        "in": {
          "$sum": [
            "$$value",
            {
              "$multiply": [
                "$$this.price",
                "$$this.quantity"
              ]
            }
          ]
        }
      }
    }
  }
},
{
  "$group": {
    "_id": {
      "__alias_0": "$purchaseMethod"
    },
    "__alias_1": {
      "$sum": "$revenue"
    }
  }
},
{
  "$project": {
    "_id": 0,
    "__alias_0": "$_id.__alias_0",
    "__alias_1": 1
  }
},
{
  "$project": {
    "x": "$__alias_0",
    "y": "$__alias_1",
    "_id": 0
  }
},
{
  "$addFields": {
    "__agg_sum": {
      "$sum": [
        "$y"
      ]
    }
  }
},
{
  "$sort": {
    "__agg_sum": -1
  }
},
{
  "$project": {
    "__agg_sum": 0
  }
},
{
  "$limit": 5000
}

The updated pipeline now includes the calculated field right below the query applied in the Query bar while the order of the rest of the components remains unchanged.
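The $reduce expression walks the items array, starting the accumulator at 0 and adding each element's price times quantity. An equivalent plain-Python computation over a hypothetical document makes this concrete:

```python
# Plain-Python equivalent of the $reduce calculated field, for illustration.
# The sample document is hypothetical.
doc = {"items": [
    {"price": 10.0, "quantity": 2},
    {"price": 3.5,  "quantity": 4},
]}

revenue = 0  # "initialValue": 0
for item in doc["items"]:                        # $reduce over "input": "$items"
    revenue += item["price"] * item["quantity"]  # $sum of $$value and the $multiply

# The loop is equivalent to a one-step sum over the array:
assert revenue == sum(i["price"] * i["quantity"] for i in doc["items"])
```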

This chart can be further refined by adding a filter in the Filter pane to select only sales made in the New York store location. Adding this filter produces the following chart and aggregation pipeline:

Aggregation pipeline example with filter.

Aggregation Pipeline:

{
  "$match": {
    "$and": [
      {
        "saleDate": {
          "$gte": {
            "$date": "2017-01-01T00:00:00Z"
          }
        }
      },
      {
        "items.4": {
          "$exists": true
        }
      }
    ]
  }
},
{
  "$addFields": {
    "revenue": {
      "$reduce": {
        "input": "$items",
        "initialValue": 0,
        "in": {
          "$sum": [
            "$$value",
            {
              "$multiply": [
                "$$this.price",
                "$$this.quantity"
              ]
            }
          ]
        }
      }
    }
  }
},
{
  "$match": { // Filter
    "storeLocation": {
      "$in": [
        "New York"
      ]
    }
  }
},
{
  "$group": {
    "_id": {
      "__alias_0": "$purchaseMethod"
    },
    "__alias_1": {
      "$sum": "$revenue"
    }
  }
},
{
  "$project": {
    "_id": 0,
    "__alias_0": "$_id.__alias_0",
    "__alias_1": 1
  }
},
{
  "$project": {
    "x": "$__alias_0",
    "y": "$__alias_1",
    "_id": 0
  }
},
{
  "$addFields": {
    "__agg_sum": {
      "$sum": [
        "$y"
      ]
    }
  }
},
{
  "$sort": {
    "__agg_sum": -1
  }
},
{
  "$project": {
    "__agg_sum": 0
  }
},
{
  "$limit": 5000
}

The pipeline now includes the storeLocation filter right below the calculated field while the order of the rest of the components remains unchanged.
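The filter's $match with $in admits only documents whose storeLocation appears in the given list. In Python terms, over hypothetical documents:

```python
def store_filter(doc, allowed=("New York",)):
    """Mirror of the Filter pane stage: {"storeLocation": {"$in": ["New York"]}}."""
    return doc.get("storeLocation") in allowed

# Hypothetical documents for illustration
docs = [
    {"storeLocation": "New York", "purchaseMethod": "In store"},
    {"storeLocation": "Denver",   "purchaseMethod": "Online"},
]
kept = [d for d in docs if store_filter(d)]  # only the New York sale survives
```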
