Deploying MongoDB Across Multiple Kubernetes Clusters With MongoDBMulti

Arek Borucki • 11 min read • Published Jan 13, 2023 • Updated Sep 05, 2023

This article is part of a three-part series on deploying MongoDB across multiple Kubernetes clusters using the operators.

Deploying the MongoDB Enterprise Kubernetes Operator on Google Cloud
Mastering MongoDB Ops Manager
Deploying MongoDB Across Multiple Kubernetes Clusters With MongoDBMulti

With the latest version of the MongoDB Enterprise Kubernetes Operator, you can deploy MongoDB resources across multiple Kubernetes clusters! By running your MongoDB replica set across different clusters, you can ensure that your deployment remains available even in the event of a failure or outage in one of them. The MongoDB Enterprise Kubernetes Operator's Custom Resource Definition (CRD), MongoDBMulti, makes it easy to run MongoDB replica sets across different Kubernetes environments and provides a declarative approach to deploying MongoDB, allowing you to specify the desired state of your deployment and letting the operator handle the details of achieving that state.

⚠️ Support for multi-Kubernetes-cluster deployments of MongoDB is a preview feature and not yet ready for production use. This article is meant to give you a way to experiment with this upcoming feature, but it should not be used in production, as breaking changes may still occur. Support for this feature during preview is provided directly by the engineering team on a best-effort basis, so if you are trying it out, please let us know at kubernetes-product@mongodb.com. Also feel free to get in touch with any questions, or if this is something that may be of interest once fully released.

Overview of MongoDBMulti CRD

Developed by MongoDB, the MongoDBMulti Custom Resource lets you choose the level of resilience your enterprise application needs. Two deployment types are available:

Single region (Multi A-Z) consists of one or more Kubernetes clusters where each cluster has nodes deployed in different availability zones in the same region. This type of deployment protects MongoDB instances backing your enterprise applications against zone and Kubernetes cluster failures.
Multi Region consists of one or more Kubernetes clusters where you deploy each cluster in a different region, and within each region, deploy cluster nodes in different availability zones. This gives your database resilience against the loss of a Kubernetes cluster, a zone, or an entire cloud region.

By leveraging the native capabilities of Kubernetes, the MongoDB Enterprise Kubernetes Operator performs the following tasks to deploy and operate a multi-cluster MongoDB replica set:

Creates the necessary resources, such as ConfigMaps, Secrets, Service objects, and StatefulSet objects, in each member cluster. The number of these resources matches the number of replica set members in the MongoDB cluster, ensuring that the cluster is properly configured and able to function.
Identifies the clusters where the MongoDB replica set should be deployed using the corresponding MongoDBMulti Custom Resource spec. It then deploys the replica set on the identified clusters.
Watches for the creation of the MongoDBMulti Custom Resource spec in the central cluster.
Uses a mounted kubeconfig file to communicate with member clusters. This allows the operator to access the necessary information and resources on the member clusters in order to properly manage and configure the MongoDB cluster.
Watches for events related to the CentralCluster and MemberCluster in order to confirm that the multi-Kubernetes-cluster deployment is in the desired state.

You should start by constructing a central cluster. This central cluster will host the Kubernetes Operator and the MongoDBMulti Custom Resource spec, and act as the control plane for the multi-cluster deployment. If you deploy Ops Manager with the Kubernetes Operator, the central cluster may also host Ops Manager.

You will also need a service mesh. I will be using Istio, but any service mesh that provides fully qualified domain name (FQDN) resolution between pods across clusters should work.

Communication between replica set members happens via the service mesh, which means that your MongoDB replica set doesn't need the central cluster to function. Keep in mind that if the central cluster goes down, you won't be able to use the Kubernetes Operator to modify your deployment until you regain access to this cluster.

Using the MongoDBMulti CRD

Alright, let's get started using the operator and build something! For this tutorial, we will need the command-line tools used in the steps below: gcloud, kubectl, kubectx, Helm, Git, and Go.

We need to set up a master Kubernetes cluster to host the MongoDB Enterprise Multi-Cluster Kubernetes Operator and Ops Manager. Create a GKE Kubernetes cluster by following the instructions in Part 1 of this series. Then install the MongoDB Multi-Cluster Kubernetes Operator in the mongodb namespace, along with the necessary CRDs, as described in the relevant section of Part 1. This will allow us to use the operator to manage and operate our multi-cluster MongoDB replica set. Finally, install Ops Manager as outlined in Part 2 of this series.

Creating the clusters

After master cluster creation and configuration, we need three additional GKE clusters, distributed across three different regions: us-west2, us-central1, and us-east1. Those clusters will host MongoDB replica set members.

CLUSTER_NAMES=(mdb-cluster-1 mdb-cluster-2 mdb-cluster-3)
ZONES=(us-west2-a us-central1-a us-east1-b)

# K8S_VERSION is assumed to be set already (for example, earlier in this series).
# Create one single-node GKE cluster per zone.
for ((i=0; i<${#CLUSTER_NAMES[@]}; i++)); do
  gcloud container clusters create "${CLUSTER_NAMES[$i]}" \
    --zone "${ZONES[$i]}" \
    --machine-type n2-standard-2 --cluster-version="${K8S_VERSION}" \
    --disk-type=pd-standard --num-nodes 1
done

The clusters have been created, and we need to obtain the credentials for them.

# Fetch kubeconfig credentials for every cluster created above.
for ((i=0; i<${#CLUSTER_NAMES[@]}; i++)); do
  gcloud container clusters get-credentials "${CLUSTER_NAMES[$i]}" \
    --zone "${ZONES[$i]}"
done

With the master cluster and the three MongoDB replica set clusters created, and Ops Manager and all required software installed on the master cluster, we can list the clusters using [kubectx](https://github.com/ahmetb/kubectx).

kubectx

You should see all your Kubernetes clusters listed here. For the next script to work, make sure that only the clusters you just created are listed, and remove any other contexts using kubectx -d <cluster_name>.

gke_lustrous-spirit-371620_us-central1-a_mdb-cluster-2
gke_lustrous-spirit-371620_us-east1-b_mdb-cluster-3
gke_lustrous-spirit-371620_us-south1-a_master-operator
gke_lustrous-spirit-371620_us-west2-a_mdb-cluster-1
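The context names in this listing follow the pattern gcloud uses when it writes GKE credentials to your kubeconfig: gke_<project>_<location>_<cluster>. As a small sketch, the helper below (gke_context_name is a hypothetical name, not part of any tool) builds the expected context name so you can compare it with what kubectx prints.

```shell
# Build the kubeconfig context name that gcloud generates for a GKE cluster.
# Pattern: gke_<project>_<location>_<cluster>
# gke_context_name is a hypothetical helper used only for illustration.
gke_context_name() {
  local project="$1" location="$2" cluster="$3"
  printf 'gke_%s_%s_%s\n' "$project" "$location" "$cluster"
}

# Example using the project ID from the sample listing.
gke_context_name "lustrous-spirit-371620" "us-west2-a" "mdb-cluster-1"
```

If a context you expect is missing from the kubectx listing, re-running the get-credentials step for that cluster is usually the first thing to try.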

We need to create the required variables: MASTER for a master Kubernetes cluster, and MDB_1, MDB_2, and MDB_3 for clusters which will host MongoDB replica set members. Important note: These variables should contain the full Kubernetes cluster names.

KUBECTX_OUTPUT=($(kubectx))
CLUSTER_NUMBER=0
# The context containing "master" becomes MASTER; the rest become MDB_1, MDB_2, ...
for context in "${KUBECTX_OUTPUT[@]}"; do
  if [[ $context == *"master"* ]]; then
    MASTER="$context"
  else
    CLUSTER_NUMBER=$((CLUSTER_NUMBER+1))
    eval "MDB_$CLUSTER_NUMBER=$context"
  fi
done

Your clusters are now configured and ready to host the MongoDB Kubernetes Operator.
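Before relying on these variables, it can help to preview which context ends up in which one. The sketch below mirrors the loop's logic in a function that only prints the assignments; classify_contexts is a hypothetical helper, and the context names are the ones from the sample listing.

```shell
# Print "NAME=context" lines: MASTER for any context containing "master",
# and MDB_1, MDB_2, ... for every other context, in the order given.
# classify_contexts is a hypothetical helper used only for illustration.
classify_contexts() {
  local n=0 context
  for context in "$@"; do
    case "$context" in
      *master*) printf 'MASTER=%s\n' "$context" ;;
      *) n=$((n + 1)); printf 'MDB_%d=%s\n' "$n" "$context" ;;
    esac
  done
}

classify_contexts \
  "gke_lustrous-spirit-371620_us-central1-a_mdb-cluster-2" \
  "gke_lustrous-spirit-371620_us-east1-b_mdb-cluster-3" \
  "gke_lustrous-spirit-371620_us-south1-a_master-operator" \
  "gke_lustrous-spirit-371620_us-west2-a_mdb-cluster-1"
```

Note that the numbering follows the order in which the contexts are listed, not the cluster names: with the sample listing, MDB_1 refers to mdb-cluster-2. To preview your own contexts, you could run classify_contexts $(kubectx).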

Installing Istio

Install Istio (I'm using v1.16.1) in multi-primary mode on different networks, using the install_istio_separate_network script. To learn more about it, see the Multicluster Istio documentation. I have prepared a snippet that downloads the install_istio_separate_network.sh script and updates its variables, such as the full Kubernetes cluster names and the Istio version, to the values we need.

REPO_URL="https://github.com/mongodb/mongodb-enterprise-kubernetes.git"
SUBDIR_PATH="mongodb-enterprise-kubernetes/tools/multicluster"
SCRIPT_NAME="install_istio_separate_network.sh"
ISTIO_VERSION="1.16.1"
git clone "$REPO_URL"
# Point CTX_CLUSTER1..3 in the script at the full context names stored in MDB_1..3.
for ((i = 1; i <= ${#CLUSTER_NAMES[@]}; i++)); do
  eval mdb="\$MDB_${i}"
  k8s="CTX_CLUSTER${i}"
  sed -i'' -e "s/export ${k8s}=.*/export ${k8s}=${mdb}/"  "$SUBDIR_PATH/$SCRIPT_NAME"
done
# Pin the Istio version that the script installs.
sed -i'' -e "s/export VERSION=.*/export VERSION=${ISTIO_VERSION}/"  "$SUBDIR_PATH/$SCRIPT_NAME"
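The sed calls above assume the script contains lines of the form export CTX_CLUSTERn=... and export VERSION=.... If you want to see the effect of the substitution before touching the real script, here is a minimal sketch that applies the same kind of replacement to a scratch file. It avoids sed -i, whose handling differs between GNU and BSD sed (on macOS it can leave a backup copy behind). set_export is a hypothetical helper, and values containing / or & would need escaping.

```shell
# Replace the value on an "export NAME=..." line in a file without using sed -i.
# set_export is a hypothetical helper used only for illustration.
set_export() {
  local file="$1" name="$2" value="$3" tmp
  tmp="$(mktemp)"
  # Write the edited copy to a temp file, then copy it back so that the
  # original file keeps its permissions (for example, the executable bit).
  sed -e "s/^export ${name}=.*/export ${name}=${value}/" "$file" > "$tmp" \
    && cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Dry run on a scratch file that mimics the assumed line format.
scratch="$(mktemp)"
printf 'export CTX_CLUSTER1=placeholder\nexport VERSION=0.0.0\n' > "$scratch"
set_export "$scratch" CTX_CLUSTER1 "gke_lustrous-spirit-371620_us-central1-a_mdb-cluster-2"
set_export "$scratch" VERSION "1.16.1"
cat "$scratch"
rm -f "$scratch"
```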

Install Istio in multi-primary mode on the Kubernetes clusters with the following command.

yes | "$SUBDIR_PATH/$SCRIPT_NAME"

Execute the multi-cluster kubeconfig creator tool. By default, the Kubernetes Operator is scoped to the mongodb namespace, although it can be installed in a different namespace as well. Navigate to the directory where you cloned the Kubernetes Operator repository in an earlier step, and run the tool. Go to the Multi-Cluster CLI documentation to learn more about the multi-cluster CLI.

CLUSTERS="$MDB_1,$MDB_2,$MDB_3"
cd "$SUBDIR_PATH"
go run main.go setup \
  -central-cluster="${MASTER}" \
  -member-clusters="${CLUSTERS}" \
  -member-cluster-namespace="mongodb" \
  -central-cluster-namespace="mongodb"
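The comma-separated CLUSTERS value is reused later when configuring the operator, so an empty MDB_n variable would silently produce a malformed list such as a,b,. As a hedged sketch, the hypothetical helper join_clusters below builds the list and refuses to continue if any name is empty.

```shell
# Join cluster context names with commas, failing if any of them is empty.
# join_clusters is a hypothetical helper used only for illustration.
join_clusters() {
  local c
  if [ "$#" -eq 0 ]; then
    echo "error: no cluster context names given" >&2
    return 1
  fi
  for c in "$@"; do
    if [ -z "$c" ]; then
      echo "error: empty cluster context name" >&2
      return 1
    fi
  done
  local IFS=,
  # "$*" joins the arguments using the first character of IFS.
  printf '%s\n' "$*"
}

# Example with placeholder context names.
join_clusters "ctx-1" "ctx-2" "ctx-3"
```

In this tutorial, it could be used as CLUSTERS="$(join_clusters "$MDB_1" "$MDB_2" "$MDB_3")".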

Verifying cluster configurations

Let's check the configurations we have made so far. I will switch the context to cluster #2.

kubectx $MDB_2

You should see something like this in your terminal.

Switched to context "gke_lustrous-spirit-371620_us-central1-a_mdb-cluster-2"

We can see the istio-system and mongodb namespaces created by the scripts:

kubectl get ns

NAME              STATUS   AGE
default           Active   62m
istio-system      Active   7m45s
kube-node-lease   Active   62m
kube-public       Active   62m
kube-system       Active   62m
mongodb           Active   41s

The MongoDB Kubernetes Operator service account is also ready:

kubectl -n mongodb get sa

default                                     1         55s
mongodb-enterprise-operator-multi-cluster   1         52s

Next, run the following command, specifying the context of each member cluster in the deployment. The command adds the label istio-injection=enabled to the mongodb namespace on each member cluster. This label activates Istio's injection webhook, which allows a sidecar to be added to any pods created in this namespace.

CLUSTER_ARRAY=($MDB_1 $MDB_2 $MDB_3)
for CLUSTER in "${CLUSTER_ARRAY[@]}"; do
  kubectl label --context="$CLUSTER" namespace mongodb istio-injection=enabled
done

Installing the MongoDB multi cluster Kubernetes operator

Now the MongoDB Multi Cluster Kubernetes Operator must be installed on the master-operator cluster and made aware of all the Kubernetes clusters that are part of the multi-cluster deployment. This step registers each of our member clusters with the operator.

First, switch context to the master cluster.

kubectx $MASTER

The mongodb-operator-multi-cluster operator needs to be made aware of the newly created Kubernetes clusters by updating the operator config through Helm. This procedure was tested with mongodb-operator-multi-cluster version 1.16.3.

# HELM_CHART_VERSION is assumed to be set to the chart version you want to install.
helm upgrade --install mongodb-enterprise-operator-multi-cluster mongodb/enterprise-operator \
  --namespace mongodb \
  --set namespace=mongodb \
  --version="${HELM_CHART_VERSION}" \
  --set operator.name=mongodb-enterprise-operator-multi-cluster \
  --set "multiCluster.clusters={${CLUSTERS}}" \
  --set operator.createOperatorServiceAccount=false \
  --set multiCluster.performFailover=false

Check if the MongoDB Enterprise Operator multi cluster pod on the master cluster is running.

kubectl -n mongodb get pods

NAME                                                   READY   STATUS    RESTARTS   AGE
mongodb-enterprise-operator-multi-cluster-688d48dfc6   1/1     Running   0          8s

It's now time to link all those clusters together using the MongoDB Multi CRD. The Kubernetes API has already been extended with a MongoDB-specific object - mongodbmulti.

kubectl -n mongodb get crd | grep multi

mongodbmulti.mongodb.com

After the installation, you should also review the operator logs and ensure that there are no issues or errors.

POD=$(kubectl -n mongodb get po | grep operator | awk '{ print $1 }')
kubectl -n mongodb logs -f po/$POD

We are almost ready to create a multi cluster MongoDB Kubernetes replica set! We need to configure the required service accounts for each member cluster.

for CLUSTER in "${CLUSTER_ARRAY[@]}"; do
  helm template --show-only templates/database-roles.yaml mongodb/enterprise-operator --namespace "mongodb" | kubectl apply -f - --context="${CLUSTER}" --namespace mongodb
done

Also, let's generate Ops Manager API keys and add our IP addresses to the Ops Manager access list. First, get the URL of the Ops Manager instance created in Part 2. Make sure you switch the context to master.

kubectx $MASTER
# NAMESPACE is assumed to hold the namespace where Ops Manager runs; set it first if it is not defined.
URL=http://$(kubectl -n "${NAMESPACE}" get svc ops-manager-svc-ext -o jsonpath='{.status.loadBalancer.ingress[0].ip}:{.spec.ports[0].port}')
echo $URL
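If the load balancer has not been assigned an external IP yet, the jsonpath expression returns an empty address and URL ends up looking like http://:<port>. As a small sketch, the hypothetical helper is_ops_manager_url below checks that the value has the http://<IPv4>:<port> shape before you paste it anywhere. It assumes the service exposes an IP address rather than a hostname, and the port in the example is only illustrative.

```shell
# Succeed only if the argument looks like http://<IPv4 address>:<port>.
# is_ops_manager_url is a hypothetical helper used only for illustration.
is_ops_manager_url() {
  printf '%s\n' "$1" | grep -Eq '^http://([0-9]{1,3}\.){3}[0-9]{1,3}:[0-9]+$'
}

# Example with an illustrative address and port.
if is_ops_manager_url "http://203.0.113.10:8080"; then
  echo "URL looks complete"
else
  echo "URL is incomplete; the load balancer IP may not be assigned yet"
fi
```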

Log in to Ops Manager, and generate public and private API keys. When you create API keys, don't forget to add your current IP address to API Access List.

To do so, log in to Ops Manager and go to the ops-manager-db organization.

Click Access Manager on the left-hand side, choose Organization Access, and then choose Create API Key in the top right corner.

The key must have a name (I use mongodb-blog), and permissions must be set to Organization Owner.

When you click Next, you will see your Public Key and Private Key. Copy those values and save them, because you will not be able to see the private key again. Also, make sure you have added your current IP address to the API access list.

Get the public and private keys generated by the API key creator and paste them into the Kubernetes secret.

kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: multi-organization-secret
  namespace: mongodb
stringData:
  publicKey: <PUBLIC_KEY>
  privateKey: <PRIVATE_KEY>
EOF

You also need an Organization ID. You can see the organization ID by clicking on the gear icon in the top left corner.

Copy the Organization ID and paste it into the Kubernetes ConfigMap below.

kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: multi-project
  namespace: mongodb
data:
  baseUrl: "${URL}"
  orgId: <YOUR_ORG_ID>
EOF
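Both the Secret and the ConfigMap contain angle-bracket placeholders that must be replaced with your own values before applying. As a hedged sketch, if you keep a manifest in a file, the hypothetical helper has_placeholder below reports whether any <UPPER_CASE> placeholder is still present.

```shell
# Succeed if the text on stdin still contains an <UPPER_CASE> placeholder.
# has_placeholder is a hypothetical helper used only for illustration.
has_placeholder() {
  grep -Eq '<[A-Z][A-Z_ ]*>'
}

# Example: this manifest fragment has not been filled in yet.
if printf 'data:\n  orgId: <YOUR_ORG_ID>\n' | has_placeholder; then
  echo "replace the placeholders before applying"
fi
```

For a saved manifest, the check could be run as has_placeholder < manifest.yaml, where manifest.yaml is whatever file name you chose.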

The Ops Manager instance has been configured, and you have everything needed to add a MongoDBMulti resource to your cluster.

Creating the MongoDBMulti resource

Finally, we can create a MongoDB replica set that is distributed across three Kubernetes clusters in different regions. I have updated the Kubernetes manifest with the full names of the Kubernetes clusters. Let's apply it now!

MDB_VERSION=6.0.2-ent
kubectl apply -f - <<EOF
apiVersion: mongodb.com/v1
kind: MongoDBMulti
metadata:
   name: multi-replica-set
   namespace: mongodb
spec:
   version: "${MDB_VERSION}"
   type: ReplicaSet
   persistent: true
   duplicateServiceObjects: true
   credentials: multi-organization-secret
   opsManager:
     configMapRef:
       name: multi-project
   clusterSpecList:
     clusterSpecs:
     - clusterName: ${MDB_1}
       members: 1
     - clusterName: ${MDB_2}
       members: 1
     - clusterName: ${MDB_3}
       members: 1
EOF

We should check the operator pod logs, because we may need to add another IP address to the Ops Manager API access list. First, store the name of the operator pod in a variable.

POD=$(kubectl -n mongodb get po | grep operator | awk '{ print $1 }')

Check whether the operator pod is allowed to access the Ops Manager REST API.

kubectl -n mongodb logs -f po/$POD | grep IP_ADDRESS_NOT_ON_ACCESS_LIST

If we see error output similar to the following, we should add the IP address it reports to the Ops Manager API access list, as we did in the previous step.

Status: 403 (Forbidden), ErrorCode: IP_ADDRESS_NOT_ON_ACCESS_LIST, Detail: IP address 10.206.15.226 is not allowed to access this resource.","MultiReplicaSet":"mongodb/multi-cluster"
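To collect every address the operator complains about in one pass, the log can be filtered for this error and the IP extracted. This is a minimal sketch that assumes the message wording matches the sample line above; blocked_ips is a hypothetical helper.

```shell
# Read operator log lines on stdin and print each distinct IP address that
# was rejected with IP_ADDRESS_NOT_ON_ACCESS_LIST.
# blocked_ips is a hypothetical helper used only for illustration.
blocked_ips() {
  grep 'IP_ADDRESS_NOT_ON_ACCESS_LIST' \
    | sed -n 's/.*IP address \([0-9.]*\) is not allowed.*/\1/p' \
    | sort -u
}

# Example using the sample log line from this article.
printf '%s\n' 'Status: 403 (Forbidden), ErrorCode: IP_ADDRESS_NOT_ON_ACCESS_LIST, Detail: IP address 10.206.15.226 is not allowed to access this resource.' \
  | blocked_ips
```

Against a live operator, you could pipe kubectl -n mongodb logs po/$POD into blocked_ips, leaving out -f so that the command terminates.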

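After a few minutes, we should have our multi-cluster deployment ready for use! We can verify this by displaying the MongoDBMulti resource (short name mdbm). The sample output in the rest of this article comes from a resource named multi-cluster; with the manifest above, you will see the name multi-replica-set instead.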

kubectl -n mongodb get mdbm

NAME            PHASE     AGE
multi-cluster   Running   4m25s
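If you want to check the phase from a script rather than by eye, the table output can be parsed. The sketch below assumes the NAME, PHASE, AGE column layout shown in the sample output; phase_of is a hypothetical helper.

```shell
# Read `kubectl get mdbm` table output on stdin and print the PHASE column
# for the resource whose NAME matches the first argument.
# phase_of is a hypothetical helper used only for illustration.
phase_of() {
  awk -v name="$1" 'NR > 1 && $1 == name { print $2 }'
}

# Example using the sample output from this article.
printf 'NAME            PHASE     AGE\nmulti-cluster   Running   4m25s\n' \
  | phase_of multi-cluster
```

With a live cluster, the input would come from kubectl -n mongodb get mdbm, and the result can be compared with Running in a retry loop.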

We can check that the MongoDB replica set is running across different Kubernetes environments in different regions! The operator has performed all the configuration and changes needed to reach the desired state of the multi-cluster deployment. The operator will also monitor the deployment and respond in case of any issues.

for CLUSTER in "${CLUSTER_ARRAY[@]}"; do
  kubectl -n mongodb --context="${CLUSTER}" get pods
done

Output of the loop:

gke_lustrous-spirit-371620_us-central1-a_mdb-cluster-2
NAME                READY   STATUS    RESTARTS      AGE
multi-cluster-0-0   2/2     Running   1 (13m ago)   13m

gke_lustrous-spirit-371620_us-east1-b_mdb-cluster-3
NAME                READY   STATUS    RESTARTS      AGE
multi-cluster-1-0   2/2     Running   1 (12m ago)   12m

gke_lustrous-spirit-371620_us-west2-a_mdb-cluster-1
NAME                READY   STATUS    RESTARTS      AGE
multi-cluster-2-0   2/2     Running   1 (12m ago)   12m

We can also see that the MongoDB Kubernetes Operator has handled the storage part and created a StatefulSet and associated objects.

for CLUSTER in "${CLUSTER_ARRAY[@]}"; do
  kubectl -n mongodb --context="${CLUSTER}" get sts
done

Output of the loop:

gke_lustrous-spirit-371620_us-central1-a_mdb-cluster-2
NAME              READY   AGE
multi-cluster-0   1/1     16m

gke_lustrous-spirit-371620_us-east1-b_mdb-cluster-3
NAME              READY   AGE
multi-cluster-1   1/1     15m

gke_lustrous-spirit-371620_us-west2-a_mdb-cluster-1
NAME              READY   AGE
multi-cluster-2   1/1     15m

The multi-cluster replica set is also visible in Ops Manager, which will now be responsible for backup, alerting, monitoring, rolling upgrades, and automation.

Basic troubleshooting

If something goes wrong, you can start an investigation by checking the MongoDB Kubernetes Operator logs on the master cluster. The most common problem is an IP address missing from the access list of the API key. You can see the error details by finding the name of the operator pod and listing the latest logs.

POD=$(kubectl -n mongodb get po | grep operator-multi | awk '{ print $1 }')
kubectl -n mongodb logs -f po/$POD

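You can also use kubectl to view the logs of the database pods. The main container continually tails the Automation Agent and MongoDB logs, which can be viewed with the following command. Here, POD must hold the name of a database pod (for example, multi-cluster-0-0 in the sample output above), and the current context must be the member cluster that runs it: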

kubectl logs $POD -n mongodb

A common technique for troubleshooting issues is to use kubectl exec to open a shell in one of the containers running MongoDB. Once connected, you can use various Linux tools to view the processes, troubleshoot issues, and even check MongoDB shell connections (which can be helpful in diagnosing network issues).

kubectl exec -it $POD -n mongodb -- /bin/bash

Conclusion

The latest version of the MongoDB Enterprise Kubernetes Operator allows users to deploy MongoDB resources across multiple Kubernetes clusters, improving reliability and reducing downtime. Developed by MongoDB, the MongoDBMulti Custom Resource Definition (CRD) makes it easy to run MongoDB replica sets across multiple Kubernetes environments and provides a declarative approach to deploying MongoDB, allowing users to specify the desired state of their deployment and letting the operator handle the details.

In combination with Ops Manager, a multi-region cluster creates a highly available database system with enterprise-class tools for backup, monitoring, alerting, upgrades, and configuration.

As Kubernetes becomes increasingly popular, it's important to start leveraging its capabilities in your organization. Do you want to stay up to date on the latest developments in MongoDB on Kubernetes? Be sure to check out the MongoDB community forum for the latest discussions and resources.
