Synchronize Your Mobile Application With MongoDB Atlas and Google Cloud MySQL
Timothy Marland6 min read • Published Feb 08, 2024 • Updated Feb 08, 2024
Rate this tutorial
Enterprises around the world are looking to modernize their existing applications. They need a streamlined way to synchronize data from devices at the Edge into their cloud data stores. Whether their goals are business growth or fending off the competition, application modernization is the primary vehicle that will help them get there.
Often the first step in this process is to move data from an existing relational database repository (like Oracle, SQL Server, DB2, or Postgres, for example) into a JSON-based flexible database in the cloud (like MongoDB, Aerospike, Couchbase, Cassandra, or DocumentDB). Sounds simple, right? I mean, really, if JSON (NoSQL) is so simple and flexible, why would data migration be hard? There must be a bunch of automated tools to facilitate this data migration, right?
Unfortunately, the answers are “Not really,” “Because data synchronization is rarely simple,” and “The available tools are often DIY-based and don’t provide nearly the level of automation required to facilitate an ongoing, large-scale, production-quality, conflict-resolved data synchronization.”
One of the first challenges is data modeling. To effectively leverage the benefits inherent in a JSON-based schema, you need to include data modeling as part of your migration strategy. Simply flattening or de-normalizing a relational schema into nested JSON structures, or worse yet, simply moving from relational to JSON without any data modeling consideration, results in a JSON data repository that is slow, inefficient, and difficult to query. You need an intelligent data modeling platform that automatically creates the most effective JSON structures based on your application needs and the target JSON repository without requiring specialized resources like data scientists and data engineers.
Once you’ve mapped the data, you need tools that allow you to build reliable, scalable data pipelines to move the data from the source to the target repository. Sadly, most of the tools available today are primarily DIY scripting tools that require both custom (often complex) coding to transform the data to the new schema properly and custom (often complex) monitoring to ensure that the new data pipelines are working reliably. You need a data pipeline automation and monitoring platform to move the data and ensure its quality.
This process of data synchronization, pipeline automation, and monitoring is where most application modernization projects get bogged down and/or ultimately fail. These failed projects often consume significant resources before they fail, as well as affect the overall business functionality and outcomes, and lead to missed objectives.
Synchronizing data between edge devices and various databases can be complex. Simplifying this is our goal, and we will demonstrate how to achieve bi-directional synchronization between mobile devices at MySQL in the cloud using MongoDB Atlas Device Sync and Dataworkz.
Let's dive in.
- Kafka
- Debezium
Set up a template app for this test by following the steps outlined in the docs. Once that step is complete, you will have a mobile application running locally, with automated synchronization back to MongoDB Atlas using the Atlas Device Sync SDK.
- Access your Google Cloud MySQL instance.
- In the left navigation pane, under Databases, select Backups.
- Access your Google Cloud MySQL instance.
- In Settings, check the status of Automated backups.
- If the status is Disabled, follow the steps below.
These steps are necessary to set up some of the background processes which enable ongoing two-way synchronization.
- In Backups, Settings, click + CREATE BACKUP.
- In the Edit backup settings dialog:
- Select the Automate backups and Enable point-in-time recovery check boxes.
- Click SAVE to activate the changes.
- Check the status of Automated backups again under Settings. It should appear as shown below. This step is important because the metadata created in the backup process is used by the tools to help enable bi-directional synchronization.
The connector for MySQL needs to be defined in Dataworkz.
- Log in to the Dataworkz application.
- Go to Configuration -> Databases -> MySQL.
- Click the + icon in the above image to add the configuration.
- It will open the MySQL configuration screen.
- Provide the following details in the configuration screen:
- MySQL connector name
- Host server and port
- User name
- Password
- Select Deployment Type
- Select TLS/SSL option as appropriate
- Click Save to add the MySQL configuration.
The connector for MongoDB Atlas needs to be defined in Dataworkz.
- Log in to the Dataworkz application.
- Go to Configuration -> Databases -> MongoDB.
- Click the + icon in the above image to add the configuration.
- It will open the MongoDB configuration screen.
- Provide the following details in the configuration screen (these can be accessed by clicking on the Connect button displayed below).
- MongoDB Connector Name
- Host server
- User name
- Password
- Click Save to add the MongoDB configuration.
The connector for Kafka needs to be defined in Dataworkz.
- Log in to the Dataworkz application.
- Go to Configuration -> Streaming Systems -> Kafka.
- Click the + icon in the above image to add the configuration.
- It will open the Kafka configuration screen.
- Provide the following details in the configuration screen:
- Kafka Connector Name
- Bootstrap server name and port (as per the format in the image below)
- Zookeeper (optional)
- Click Save to add the Kafka configuration.
Configure CDC for MySQL in Dataworkz by following these steps:
- Log in to the Dataworkz application.
- Go to Configuration -> Change Data Capture -> MySQL.
- Click the + icon in the above image to add the configuration.
- It will open the CDC configuration screen.
- Provide the details in the configuration detail screen:
- Choose the Workspace.
- Choose Kafka (which was configured in the previous step).
- Add the Debezium server name.
- Choose the MYSQL Storage that was previously configured.
- Select the appropriate database from the list.
- Choose the Database table for Sync.
You can see the configured MySQL CDC in the list of configurations.
This step enables the modeling of the results of RDBMS in JSON structure. To do this, select Data Prep -> JSON Modeller inside the Dataworkz settings.
- Click on Select root dataset for json modeling (above image).
- Select the Workspace in which MySQL CDC was configured.
- Select cdc_kafka as the Source Collection.
- Choose root/base table/dataset.
- Click on the Enable Bi-directional sync checkbox.
- Select a sync field from the list of the columns that will be used to determine the last updated time for a row.
- Select columns for the target dataset.
- Click Update and you will see the below screen.
- Choose the desired mapping with the base table by right-clicking.
- If the table has one-to-many mapping, choose Create array document.
- If the table has one-to-one mapping, choose Create field in child document
- Selecting create array document would require selecting a child table relationship (for creation of an array document).
- Choose Child Table in the screen that pops up.
- Choose join key in the Relationship mapping section.
- In the Unique Key section, choose the unique key for the child table (used for update or delete events).
- Select a sync field from the list of the columns that will be used to determine the last updated time for a row.
- Click update.
- A one-to-many relationship shows up.
- Click on Save and choose the target destination (MongoDB Collection, in this case).
- Choose Transfer Mode as Upsert and select the unique key for the base table in the pop-up box.
- Choose the desired job name and select the job frequency as Recurring. Set this job as Continuous from the Choose Frequency section.
The sync job that was scheduled in the previous step can be viewed under Jobs -> Continuous Job” in the Dataworkz portal. It provides the following:
- job status (running/deleted/paused)
- ability to pause a running job
- ability to resume a paused job
- ability to delete a continuous job
MongoDB Atlas Device Sync streamlines and automates the challenges of mobile data synchronization with Atlas. By integrating with Dataworkz, that same data can now be automatically and bi-directionally synchronized with MySQL, Oracle and other databases.
Set up your account at cloud.mongodb.com and dataworkz.com to create accounts and begin your automated bi-directional data synchronization journey.
Top Comments in Forums
There are no comments on this article yet.