Boow
(Poubelle Enorme)
1
Hi,
We have a script that reads a collection X, does some processing, and writes to a collection Y. Usually the script takes ~1 hour, but sometimes it is very slow. What can affect performance?
I don’t see a big load on my cluster… maybe a lock on my X collection is slowing down my script? Size of my X collection: Long(‘27776896250’).
Thanks for your help.
steevej
(Steeve Juneau)
2
Without more information about your exact cluster setup we cannot really help.
When you write that you read a collection X, with the size of X being 27_776_896_250, do you mean you read all of it and then write all of it back into a collection Y? Where is the script running relative to the server?
Boow
(Poubelle Enorme)
3
Oh yes sorry.
Where is the script running relative to the server?
- The script runs on another machine but communicates over the internal network (1 GB/s).
When you write that you read a collection X, with the size of X being 27_776_896_250, do you mean you read all of it and then write all of it back into a collection Y?
I read the data from collection X, do some processing (if needed), and then write it to the new collection Y. I work in batches of 300. I create an empty collection, write all the data, and then create the index (maybe it’s better to create the index before? I don’t think so, because MongoDB would have to keep it up to date during the load…).
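For context, a minimal sketch of what that batched copy might look like, assuming Python/pymongo; the connection string, database, collection, and field names are placeholders, and the processing step is left abstract:

```python
from pymongo import ASCENDING, InsertOne, MongoClient

# Placeholders: adjust the connection string, database and collection names.
client = MongoClient("mongodb://localhost:27017")
db = client["mydb"]
source = db["X"]
target = db["Y"]

BATCH_SIZE = 300
ops = []

# Stream collection X and write into the (initially empty) collection Y
# in batches of 300, as described above.
for doc in source.find({}, batch_size=BATCH_SIZE):
    processed = doc  # stand-in for the actual processing step
    ops.append(InsertOne(processed))
    if len(ops) == BATCH_SIZE:
        target.bulk_write(ops, ordered=False)
        ops = []
if ops:
    target.bulk_write(ops, ordered=False)

# Index created only after the load, as described above.
target.create_index([("some_field", ASCENDING)])
```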
Let me know if you have suggestions / tips / questions.
Thanks.
steevej
(Steeve Juneau)
4
Use the aggregation framework with an $out stage rather than reading the documents into your script and writing them back in batches; that way the copy happens entirely on the server.
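A minimal sketch of that approach, assuming pymongo and placeholder connection and collection names; the actual pipeline stages depend on the processing the script performs:

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["mydb"]

# The processing and the copy both happen server-side: the pipeline result
# is written straight into collection Y, so no data crosses the network.
db["X"].aggregate([
    # ... processing stages such as $match, $project, $set go here ...
    {"$out": "Y"},
])
```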
steevej
(Steeve Juneau)
5
One idea I got, since you have 4 nodes, which may or may not be applicable to the whole use case, is to:
- take 1 node out of the replica set
- make this node a dedicated, non-replicated node where collection Y is written
My idea is that since you read and write a lot of data on the same cluster, you constantly swap the working set in and out. Your cluster might be overloaded by disk I/O. With the new setup, the replica set is not busy replicating all the writes and can serve the reads better.
Anyway, 4 nodes is not a recommended configuration, so you lose nothing by taking a node out.
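As a rough sketch of how the script could target such a setup; the host names, ports, database name, and the secondaryPreferred read preference are assumptions for illustration only:

```python
from pymongo import MongoClient

# Reads come from the remaining replica set, preferring a secondary ...
read_client = MongoClient(
    "mongodb://rs-node1:27017,rs-node2:27017,rs-node3:27017/"
    "?replicaSet=rs0&readPreference=secondaryPreferred"
)

# ... while the bulk writes to collection Y go to the dedicated,
# non-replicated node, so the replica set does no replication work for them.
write_client = MongoClient("mongodb://analytics-node:27017")

source = read_client["mydb"]["X"]
target = write_client["mydb"]["Y"]
```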
Can you give more details about:
- the metrics you are using for the above
- the hardware configuration of the 4 nodes
- the read and write concerns you use (a sketch of how these can be set explicitly follows below)
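For reference, read and write concerns can be stated explicitly per collection; a hedged example with pymongo (the values shown are illustrative, not a recommendation):

```python
from pymongo import MongoClient
from pymongo.read_concern import ReadConcern
from pymongo.write_concern import WriteConcern

client = MongoClient("mongodb://localhost:27017")
db = client["mydb"]

# "local" read concern for the scan of X, relaxed write concern for the
# bulk load into Y; the right values depend on the job's durability needs.
source = db.get_collection("X", read_concern=ReadConcern("local"))
target = db.get_collection("Y", write_concern=WriteConcern(w=1, j=False))
```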
Since the script usually takes 1 hour, it looks like an analytics use case, and having dedicated nodes for analytics is sometimes a good choice.
Have you tried different batch sizes?
As for creating the index before the load: I don’t know, but it should be straightforward to test. I suspect it may be better, because building the index afterwards means the documents have to be fetched again, adding a lot of I/O.
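A rough way to test this, as a sketch; copy_collection is a hypothetical stand-in for the existing copy job, parameterized here by batch size:

```python
import time


def copy_collection(batch_size: int) -> None:
    """Hypothetical stand-in for the existing copy job, taking the batch size."""
    # ... the batched read/process/write loop from the earlier sketch ...


# Time the same job with a few different batch sizes and compare.
for batch_size in (300, 1_000, 5_000):
    start = time.monotonic()
    copy_collection(batch_size)
    elapsed = time.monotonic() - start
    print(f"batch_size={batch_size}: {elapsed:.1f}s")
```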
steevej
(Steeve Juneau)
6
@Boow, it has been a week since I provided what I think is valuable input.
I would appreciate a follow-up.
Thanks