ATLAS
Atlas Stream Processing
Simplify integrating MongoDB with Apache Kafka to build event-driven applications.
A data model built for streaming data
Schema management is critical to data correctness and developer productivity when working with streaming data. The document model gives developers a flexible, natural data model for building apps with real-time data.
A unified developer experience
Developers can use one platform—across API, query language, and data model—to continuously process streaming data from Apache Kafka alongside the critical application data stored in their databases.
Fully managed in Atlas
With a few lines of code, developers can quickly integrate streaming data from Apache Kafka with their database to build reactive, responsive applications—all fully managed with Atlas.
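As an illustration, a minimal mongosh sketch of this pattern, assuming a stream processing instance whose connection registry contains a Kafka connection named "kafkaProd" and an Atlas cluster connection named "atlasCluster" (both names, and the "orders" topic, are hypothetical):

    // Read order events from a Kafka topic and write them continuously
    // into an Atlas collection. Connection names are placeholders defined
    // in the stream processing instance's connection registry.
    sp.createStreamProcessor("kafkaToAtlas", [
      { $source: { connectionName: "kafkaProd", topic: "orders" } },
      { $merge: {
          into: { connectionName: "atlasCluster", db: "sales", coll: "orders" }
      } }
    ]);

    // Start the processor; it runs continuously until stopped.
    sp.kafkaToAtlas.start();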
Integrate with Apache Kafka data streams
Learning hub
Find white papers, tutorials, and videos about how to handle streaming data.
FAQ
What is streaming data?
Streaming data is generated continuously from a wide range of sources. IoT sensors, microservices, and mobile devices are all common sources of high-volume data streams. The continuous nature of streaming data, as well as its immutability, distinguishes it from static data at rest in a database.
What is stream processing?
Stream processing is the continuous ingestion and transformation of event data from an event messaging platform (such as Apache Kafka) to perform various functions. This could mean applying simple filters to remove unneeded data, performing aggregations to count or sum data as needed, creating stateful windows, and more. Stream processing can be a differentiating capability in event-driven applications, enabling more reactive, responsive customer experiences.
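For example, a simple filtering pipeline can be tried out interactively with sp.process, which runs a pipeline in the shell and prints its output; the connection, topic, and field names below are hypothetical:

    // Ad-hoc test of a stream pipeline: keep only high-value orders
    // and reshape each event before inspecting the output interactively.
    sp.process([
      { $source: { connectionName: "kafkaProd", topic: "orders" } },
      { $match: { totalAmount: { $gte: 100 } } },
      { $project: { _id: 0, orderId: 1, customerId: 1, totalAmount: 1 } }
    ]);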
How is event streaming different from stream processing?
Streaming data lives inside event streaming platforms (like Apache Kafka), and these systems are essentially immutable, distributed logs. Event data is published to and consumed from event streaming platforms using APIs.
Developers need a stream processor to perform more advanced processing, such as stateful aggregations, window operations, mutations, and creating materialized views. These are similar to the operations one performs when running queries on a database, except that stream processing continuously queries an endless stream of data. This area of streaming is an emerging technology, with solutions such as Apache Flink and Spark Streaming quickly gaining traction.
With Atlas Stream Processing, MongoDB provides developers with a better way to process streams for use in their applications while leveraging the aggregation framework.
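A sketch of what a continuously updated, materialized-view-style processor could look like, using a tumbling window alongside familiar aggregation stages; the connection, topic, and collection names are assumptions for illustration:

    // Continuously maintain per-minute order counts as a materialized view:
    // a tumbling window groups events into fixed one-minute intervals, and
    // the results are merged into an Atlas collection that apps can query.
    sp.createStreamProcessor("ordersPerMinute", [
      { $source: { connectionName: "kafkaProd", topic: "orders" } },
      { $tumblingWindow: {
          interval: { size: NumberInt(1), unit: "minute" },
          pipeline: [
            { $group: { _id: "$customerId", orderCount: { $sum: 1 } } }
          ]
      } },
      { $merge: {
          into: { connectionName: "atlasCluster", db: "sales", coll: "orders_per_minute" }
      } }
    ]);

    sp.ordersPerMinute.start();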
Why did MongoDB build Atlas Stream Processing?
Stream processing is an increasingly critical component for building responsive, event-driven applications. By adding stream processing functionality as a native capability in MongoDB Atlas, we're helping more developers build innovative applications leveraging our multi-cloud developer data platform.
How is stream processing different from batch processing?
Stream processing happens continuously. In the context of building event-driven applications, stream processing enables reactive and compelling experiences like real-time notifications, personalization, route planning, and predictive maintenance.
Batch processing does not work on continuously produced data. Instead, batch processing gathers data over a specified period of time and then processes that static data as needed. An example of batch processing is a retail business collecting sales data at the close of business each day for reporting or to update inventory levels.
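As a contrast, the retail example above maps naturally to a standard aggregation run once against data at rest (the database, collection, and field names are illustrative):

    // Batch-style reporting: summarize one day's sales after the day closes.
    db.sales.aggregate([
      { $match: { saleDate: { $gte: ISODate("2024-01-01"), $lt: ISODate("2024-01-02") } } },
      { $group: { _id: "$storeId", totalRevenue: { $sum: "$amount" }, orders: { $sum: 1 } } }
    ]);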
What’s the difference between a stream processing pipeline and an aggregation pipeline?
Atlas Stream Processing extends the aggregation pipeline with stages for processing continuous data streams. These stages combine with existing aggregation stages built into the default mongod process, enabling developers to perform many of the same operations on continuous data as they can perform on data at rest.
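For instance, a processor might combine a familiar $match stage with streaming-only stages such as $source and $emit, reading from one Kafka topic and publishing the filtered events to another; the names below are hypothetical:

    // Familiar aggregation stages run alongside streaming-only stages:
    // $source reads from Kafka and $emit publishes processed events
    // back to another Kafka topic.
    sp.createStreamProcessor("routeLargeOrders", [
      { $source: { connectionName: "kafkaProd", topic: "orders" } },
      { $match: { totalAmount: { $gte: 1000 } } },
      { $emit: { connectionName: "kafkaProd", topic: "large-orders" } }
    ]);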
Does Atlas Stream Processing support checkpointing?
How does Atlas Stream Processing help me handle data errors during stream processing?
Ready to get started?
- Easily integrate Kafka & MongoDB
- Process data continuously
- Native MongoDB experience
- Available globally