Jul 2024

Using the .NET library with Realm-Atlas.

In the main, the sync is working brilliantly. But there is one collection that is giving me grief and not syncing. I have likely mucked something up but cannot figure out what I did wrong. Once I figure that out, I hope to be able to get it fixed…

I added an optional array of strings to a collection. The existing documents do not have that set, and that’s okay. I think.

My .NET client is subscribed to the collection and the documents that are in Atlas are coming down to my local Realm file as part of the initial sync.

I add string values to the array. Note that by default the .NET library will create an empty IList<string> if the value in realm is null. That’s good.
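For readers following along, here is a minimal sketch of what such a model and write might look like. The class name `Item`, the property name `Tags`, and the helper `AddTag` are hypothetical and not taken from my actual schema; the sketch assumes the source-generated `IRealmObject` style of model.

```csharp
using System.Collections.Generic;
using MongoDB.Bson;
using Realms;

// Hypothetical model: "Item" and "Tags" are illustrative names only.
public partial class Item : IRealmObject
{
    [PrimaryKey]
    [MapTo("_id")]
    public ObjectId Id { get; set; }

    // Collection properties are getter-only; the SDK supplies the list,
    // so it reads as empty rather than null when nothing has been stored.
    public IList<string> Tags { get; } = null!;
}

public static class TagWriter
{
    // Adds one string to the list inside a single write transaction.
    public static void AddTag(Realm realm, Item item, string tag)
    {
        realm.Write(() => item.Tags.Add(tag));
    }
}
```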

Some of the documents get values added, others don’t.

Nothing syncs from Realm up to Atlas.

I thought it might be because none of the documents had any data in them. So I used Compass to set the value of all documents to an empty array. I deleted my local Realm file and started over. Exact same behavior. The writes are visible locally using Realm Studio, they are visible to my code… but modifications are NOT going back up.

Using Compass I look at my sync session database and there is nothing from that collection that is in the unsynced_documents store. I just learned about that today so I’m glad I found it… there are other errors that show up in there. That’s good to know.

There are NO exceptions being thrown when writing the data or syncing. There are NO error messages either locally or in any of the sync logs in Atlas.

This error is incredibly opaque and I’m very frustrated and need to get it solved.

Any pointers appreciated. Especially if you know of somewhere else that I can look for error messages.

Went back to Compass and set all entities to have the array with a single hard-coded string in the array. Deleted local Realm and synced. All entities came down with the hard-coded string.

Ran my code, updated the array, verified array updated using Realm Studio.

Changes will not sync back up to Atlas. Cannot find any errors anywhere.

Also, changes in Atlas will not sync down to local Realm file.

Weird and frustrating.

Are there any errors in the server logs? If not, then can you set the log level for the sync category to trace and run the app and make some changes to the Realm file? We’re in the process of updating the docs, but this should look something like:

RealmLogger.SetLogLevel(LogLevel.Trace, LogCategory.Realm.Sync);

RealmLogger.SetLogLevel was a compiler error for me, which made me realize that I was a couple of versions behind in my library version. Updated, retested, same symptoms.

But… the logs seemed to actually be useful.

One of the things that I did was to add a WaitForUploadAsync() after my business logic ran. It was taking an incredibly long time (15+ minutes before I killed the app) but there were log entries being generated with some sets of data being uploaded.

That told me that there was some type of blockage in the sync process. So that led me to make some changes in my code.

Change 1 - Finer search of entities needing to be updated. That reduced the volume of data being exchanged with Atlas. It cost a bit more in time spent in the CPU doing the search but the smaller payloads running across the network really seem to help.

Change 2 - WaitForUploadAsync() after EVERY write. That minimizes the total backlog being shipped up to Atlas.

Those two changes dropped the total sync time from 15+ minutes (at which point I was killing the app) to less than 1 minute.

So there is SOMETHING that was sticking in the queue. I’m not well enough versed in reading the logs to be able to say what or where. But something was sticking, and adding the WaitForUploadAsync() after every write cleared it up.

I have 48MB and 52MB log files (compressed to 34MB and 38MB respectively) that I can send to you if you have some way of me delivering them securely.

WaitForUploadAsync after every write seems quite excessive. If that’s helping, it’s likely because it’s throttling the writes - i.e. your app is generating fewer changes, thus appearing to sync faster, when in reality it’s just syncing less data. Realm starts uploading data as soon as a transaction is committed, so the fact that there’s a perceived backlog of stuff to be uploaded makes me think your application is generating a lot of changes in a short period of time - is that the case? Also, can you clarify what types of writes you’re making - are you trying to batch multiple changes in a single transaction or are you making every change in a separate transaction?
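To illustrate the alternative, here is a rough sketch (not a definitive recommendation) of committing several transactions and waiting for the upload only once at the end. The helper name `WriteThenWaitOnceAsync` and the `transactions` parameter are hypothetical; the sketch assumes `realm` was opened with a sync configuration, since `SyncSession` is only available on synced Realms.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Realms;

public static class UploadExample
{
    // Hypothetical helper: each Action in `transactions` is the body of one
    // write transaction supplied by the caller.
    public static async Task WriteThenWaitOnceAsync(
        Realm realm, IEnumerable<Action> transactions)
    {
        foreach (var body in transactions)
        {
            // Each commit becomes eligible for upload as soon as it completes;
            // there is no need to block on the upload between commits.
            realm.Write(body);
        }

        // A single wait at the end, covering the changes committed above.
        await realm.SyncSession.WaitForUploadAsync();
    }
}
```

Waiting once keeps the writes flowing at full speed while still giving a point at which the outstanding local changes have been sent.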

Yes… we are generating a LOT of changes in a short period of time.

Most transactions will be tiny - 3-5 documents updated with about 25% of the collections having updated documents. But there are times when the customers make a bunch of changes and there can be over 1,000 large documents updated at a time.

My current testing is simulating that larger load to take it into account. I prefer to test pessimistically.

If by “multiple changes in a single transaction” you mean “writing to multiple collections” then the answer is “No, I’m only ever writing to a single collection in any given transaction.”

I can have multiple transactions writing at the same time. If I have 2,500 different customers and I need to update 20 different entities for each customer, each customer process is running in its own thread and each entity runs in its own thread. So that’s a lot of potential threads.

I’m not actually doing that though. I do limit the number of each that are active at any time. But let’s say I have the processing updating 8 customers and 10 collections active at any time, so I could be writing to 80 different transactions all at once.

Based on your commentary I also updated my code so that I am only ever writing 100 documents to a collection in any given transaction. So if I have 350 documents to be updated, I have three transactions with 100 documents and then a fourth transaction with 50 documents.
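The batching looks roughly like the sketch below. The helper name `UpdateInBatches` and its parameters are made up for illustration and are not my production code; it assumes .NET 6 or later for `Enumerable.Chunk`, and that `items` is an already materialized list rather than a live query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Realms;

public static class BatchedWrites
{
    // Hypothetical helper: applies `update` to every item, committing one
    // write transaction per batch of at most `batchSize` items.
    public static void UpdateInBatches<T>(
        Realm realm, IReadOnlyList<T> items, Action<T> update, int batchSize = 100)
    {
        // Chunk splits 350 items into batches of 100, 100, 100 and 50.
        foreach (var batch in items.Chunk(batchSize))
        {
            realm.Write(() =>
            {
                foreach (var item in batch)
                {
                    update(item);
                }
            });
        }
    }
}
```

The items and the Realm instance have to belong to the same thread, since Realm objects are thread-confined.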

Now that I’m processing faster I will look at removing the WaitForUploadAsync to see what the impact might be.

Did I answer your questions?

Do you have any suggestions? Are there any best practices links I should take a look at?