Listening to changes in nested objects

Arvid_Norstrom · 2024-09-23T18:51:24.266Z

Let us assume I have a document that looks like this:
{
“_id”: {
“$oid”: “123”
},
“stock_prices”: {
“AAPL”: {
“last_updated”: {
“$date”: “2024-09-23T11:58:58.480Z”
},
“price”: 227.3
},
“TSLA”: {
“last_updated”: {
“$date”: “2024-09-23T11:58:58.480Z”
},
“price”: 247.16
}
}
}

Now my goal is to listen to changes to the price field of any stock. I assume this is possible by specifying a regex in the pipeline. My current pipeline which consumes all changes looks like this:

pipeline = [
{
‘$match’: {
‘$or’: [
{‘updateDescription.updatedFields’:{
‘$exists’: True
}
}
]
}
}
]

How can I modify it to listen to changes in the price field of any stock? If it is not possible to listen to the changes of a generic stock, I would like to know if it is possible to listen to changes of a hardcoded stock.

steevej · 2024-09-24T14:36:11.965Z

Arvid_Norstrom:

How can I modify it to listen to changes in the price field of any stock?

There is a major issue with your model that would make a lot of things complicated. The problem is that you use data value as the field name for the tickers. This means that there is no common way to access the stock prices or last_updated sub-fields. For example, for Apple you need to access the field stock_prices.AAPL.price and for Tesla you need stock_prices.TSLA.price.

Indexing the stock prices or last_updated fields is impossible or very hard to do.

Look at the attribute pattern and update your schema to something like:

{
    "stock_prices" :
    [
        {
            "ticker" : "AAPL"  ,
            "last_updated" : date ,
            "price" : 227.3
        } ,
        {
            "ticker" : "TSLA" ,
            "last_updated" : date ,
            "price" : 247.16
        }
    ]
}

The field stock_prices is changed to an array and the stock tickers are changed from field name to field value. Look at $objectToArray for an easy migration.

Please also read code snippet so that your documents and code are easier to work with.

Arvid_Norstrom · 2024-09-25T06:06:51.117Z

Thanks for replying so quickly!
Actually, using the data value as field name is a deliberate choice, in order to quickly access the correct ticker when updating its price. Thus, instead of iterating over a list (which might be quite long), I am able to find the appropriate sub-field in constant time.
I do concede that it’s anti-pattern, but performance is paramount in this case.
I need to store all the tickers in the same document, or the equivalent to tickers in my use case. I expect around a hundred tickers per document and thousands of updates per second with as little delay as possible.

I don’t think there is a way to use arrays and get constant or even logarithmic time complexity for the query, since indexes are on a per collection level and not on a per document level. I would be happy to be proven wrong though.

Sorry about the formatting, I can’t find a way to edit my original post, but I’ll keep it in mind for next time.

steevej · 2024-09-25T14:41:36.402Z

If

Arvid_Norstrom:

performance is paramount

and you worry about

Arvid_Norstrom:

use arrays and get constant or even logarithmic time

Arvid_Norstrom:

quickly access the correct ticker when updating its price

I suspect that the array is quite large and that updating the price is a frequent use-case.

If it is the case, then for performance reasons you might want to completely go away from embedding stock_prices in your main documents. Somewhere in the storage engine documentation you will see than when a document is updated the whole document is written back to disk. So despite the fact that you may access/update easily the prices your performances might be hindered by writing to disk the whole array/sub-document.

I think it is worth investigating moving stock_prices in its own collection and to use $lookup to present the results in its current format. I think it would have a lot more impact in your update performance compared to the current schema.

Topic	Replies	Views	Activity
How do I view logs and metrics for a specific GraphQL query to MongoDB? Working with Data queries graphql	0	560	Apr 2024
Does MongoDB preserve array order in document Working with Data aggregation	1	815	May 2024
Can’t we use noCursorTimeout for timeseries collection? Working with Data aggregation queries java time-series spring-data-odm	0	472	Jun 2024
Filtering documents between older dates give increased results Working with Data queries	1	257	Jul 2024
Cant access data in terminal Working with Data queries node-js	6	338	Aug 2024

Listening to changes in nested objects

New & Unread Topics

Want to read more? Browse other topics in Working with Data or view latest topics.