@Jack_Woehr Thanks for your response. I'll try to give you an example of the situation and explain why I'm taking this approach.
Currently, we have information that needs to be filtered based on specific criteria. Each filter can match a number of users, and since these users are distributed across different databases, we store the matches incrementally in temporary storage (a collection we call "conclusions"). Later, using an AND operation, we find the users who meet the criteria of more than one filter and group them together.
The challenge here is that a single filter can match over 250,000 users. Additionally, these conclusions are temporary and only remain in the database for a limited time.
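To make the AND step concrete, here is roughly what it looks like on my side (a simplified sketch using the Node.js driver; the `mydb`/`conclusions` names and the document shape `{ userId, filterId, createdAt }` are placeholders, not our real schema):

```ts
import { MongoClient } from "mongodb";

// Hypothetical shape of a document in the temporary "conclusions" collection:
// { userId: string, filterId: string, createdAt: Date }

async function usersMatchingAllFilters(
  client: MongoClient,
  filterIds: string[]
): Promise<string[]> {
  const conclusions = client.db("mydb").collection("conclusions");

  // AND semantics: keep only users whose conclusion documents
  // cover every filter in filterIds.
  const docs = await conclusions
    .aggregate<{ _id: string; filters: string[] }>([
      { $match: { filterId: { $in: filterIds } } },
      // Collect the distinct filters each user matched.
      { $group: { _id: "$userId", filters: { $addToSet: "$filterId" } } },
      // Keep users who matched all of the requested filters.
      { $match: { filters: { $size: filterIds.length } } },
    ])
    .toArray();

  return docs.map((d) => d._id);
}
```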
Example 1:
Let’s say we have two filters:
- Filter 1: Players from the United States
- Filter 2: Players who play a certain game
If Filter 1 matches 300,000 users and Filter 2 matches 150,000, we store these matches temporarily in the conclusions collection, then use an AND operation to find the users who match both filters (e.g., players from the United States who also play that specific game).

The problem is the insertion step. We insert into this collection in batches of about 5,000 users per batch, and the process is very slow. The data is temporary and typically stays in the collection for about 2 days before being deleted by TTL. The process also runs in parallel: the same job can be executed multiple times in the service, depending on how many instances are available on the server. Retrieving the information is very fast, but inserting the data into the collection is far too slow.
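For reference, the insert side looks roughly like this (again a simplified sketch with the Node.js driver; the names and the 48-hour TTL are placeholders matching the ~2-day retention I described):

```ts
import { MongoClient } from "mongodb";

async function insertConclusions(
  client: MongoClient,
  filterId: string,
  userIds: string[]
): Promise<void> {
  const conclusions = client.db("mydb").collection("conclusions");

  // TTL index: MongoDB removes documents roughly 2 days after createdAt.
  // Creating an index that already exists is a no-op, so this is safe to repeat.
  await conclusions.createIndex(
    { createdAt: 1 },
    { expireAfterSeconds: 60 * 60 * 48 }
  );

  const now = new Date();
  const batchSize = 5000;
  for (let i = 0; i < userIds.length; i += batchSize) {
    const batch = userIds.slice(i, i + batchSize).map((userId) => ({
      userId,
      filterId,
      createdAt: now,
    }));
    // ordered: false lets the server apply the batch without stopping
    // at the first failure (e.g., a duplicate), which generally helps
    // throughput compared to ordered inserts.
    await conclusions.insertMany(batch, { ordered: false });
  }
}
```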