Troubleshoot Deployments with Multiple Kubernetes Clusters
To troubleshoot your multi-Kubernetes-cluster deployments, use the procedures in this section.
Recover from a Kubernetes Cluster Failure
This procedure uses the same cluster names as in the Prerequisites.
If the cluster MDB_CLUSTER_1 that holds MongoDB nodes goes down, and if you provision a new cluster named MDB_CLUSTER_4 instead of MDB_CLUSTER_1 to hold the new MongoDB nodes, run the MongoDB kubectl plugin with the updated list of member clusters, and then edit the MongoDBMultiCluster resource spec on the central cluster.
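The updated list of member clusters can be derived from the old one instead of being retyped. The following is a minimal sketch, not part of the plugin: `swap_member_cluster` is a hypothetical helper and the cluster names are placeholders. It drops the failed cluster from a comma-separated list and appends the replacement.

```shell
# Hypothetical helper (not part of the MongoDB kubectl plugin):
# remove the failed cluster from a comma-separated list and append
# the replacement cluster at the end.
swap_member_cluster() {
  local members="$1" failed="$2" replacement="$3"
  local result="" name
  local IFS=','
  # Unquoted expansion splits the list on commas because of IFS.
  for name in $members; do
    if [ "$name" != "$failed" ]; then
      result="${result:+$result,}$name"
    fi
  done
  printf '%s\n' "${result:+$result,}$replacement"
}

# Placeholder names; substitute the full cluster names from your environment.
MEMBERS="$(swap_member_cluster "cluster-1,cluster-2,cluster-3" "cluster-1" "cluster-4")"
echo "$MEMBERS"
```

The resulting value is the kind of string you would pass to the plugin's `--member-clusters` option.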
To reconfigure the multi-Kubernetes-cluster deployment after a cluster failure, replace the failed Kubernetes cluster with the newly provisioned cluster as follows:
1. Run the MongoDB kubectl plugin with the recover parameter and the new cluster MDB_CLUSTER_4 specified in the --member-clusters option. This enables the Kubernetes Operator to communicate with the new cluster to schedule MongoDB nodes on it. In the following example, --member-clusters contains ${MDB_CLUSTER_4_FULL_NAME}.

    kubectl mongodb multicluster recover \
      --central-cluster="${MDB_CENTRAL_CLUSTER_FULL_NAME}" \
      --member-clusters="${MDB_CLUSTER_2_FULL_NAME},${MDB_CLUSTER_3_FULL_NAME},${MDB_CLUSTER_4_FULL_NAME}" \
      --member-cluster-namespace="mongodb" \
      --central-cluster-namespace="mongodb" \
      --operator-name=mongodb-enterprise-operator-multi-cluster \
      --source-cluster="${MDB_CLUSTER_2_FULL_NAME}"

2. On the central cluster, locate and edit the MongoDBMultiCluster resource spec to add the new cluster name to the clusterSpecList and remove the failed Kubernetes cluster from this list. The resulting list of cluster names should be similar to the following example:

    clusterSpecList:
      - clusterName: ${MDB_CLUSTER_4_FULL_NAME}
        members: 3
      - clusterName: ${MDB_CLUSTER_2_FULL_NAME}
        members: 2
      - clusterName: ${MDB_CLUSTER_3_FULL_NAME}
        members: 3

3. Restart the Kubernetes Operator Pod. After the restart, the Kubernetes Operator should reconcile the MongoDB deployment on the MDB_CLUSTER_4 cluster that you created as a replacement for the failed MDB_CLUSTER_1 cluster. To learn more about resource reconciliation, see Deployment Architecture and Diagrams.
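The restart in the last step can be sketched as follows. This sketch makes several assumptions: that the Kubernetes Operator runs as a Deployment named after the `--operator-name` value, that it lives in the `mongodb` namespace of the central cluster, and that `MDB_CENTRAL_CLUSTER_FULL_NAME` holds the central cluster's kubeconfig context. The `run` and `restart_operator` helpers are hypothetical. By default the script only prints the commands; set `DRY_RUN=0` to execute them.

```shell
# Assumed values; adjust them to match your installation.
OPERATOR_NAME="${OPERATOR_NAME:-mongodb-enterprise-operator-multi-cluster}"
NAMESPACE="${NAMESPACE:-mongodb}"
CENTRAL_CONTEXT="${MDB_CENTRAL_CLUSTER_FULL_NAME:-central-cluster-context}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Restart the operator Deployment (assumed name) and wait for the rollout.
restart_operator() {
  run kubectl --context "$CENTRAL_CONTEXT" --namespace "$NAMESPACE" \
    rollout restart "deployment/$OPERATOR_NAME"
  run kubectl --context "$CENTRAL_CONTEXT" --namespace "$NAMESPACE" \
    rollout status "deployment/$OPERATOR_NAME"
}

restart_operator
```

Restarting through the Deployment avoids looking up the generated Pod name. If your installation manages the operator differently, delete the operator Pod directly instead.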
Also see ConfigMap Name mongodb-enterprise-operator-member-list is Hard-Coded.