MongoDB Orchestration With Spring & Atlas Kubernetes Operator
In this tutorial, we'll delve into containerization concepts, focusing on Docker, and explore deploying your Spring Boot application from a previous tutorial. By the tutorial's conclusion, you'll grasp Docker and Kubernetes concepts and gain hands-on experience deploying your application within a cloud infrastructure.
This tutorial is an extension of the previous tutorial where we explained how to write advanced aggregation queries in MongoDB using the Spring Boot framework. We will use the same GitHub repository to create this tutorial's deployment files.
We'll start by learning about containers, like digital packages that hold software. Then, we'll dive into Kubernetes, a system for managing those containers. Finally, we'll use Kubernetes to set up MongoDB and our Spring application, seeing how they work together.
- A Spring Boot application running on your local machine
- A MongoDB Atlas account
As a software developer, one often comes across an issue where the features of an application work perfectly on the local machine, yet seem broken on the client machine. This is where the concept of containers comes in.
In simple words, a container is just a simple, portable computing environment that contains everything an application needs to run. The process of creating containers for the application to run in any environment is known as containerization.
Containerization is a form of virtualization where an application, along with all its components, is packaged into a single container image. These containers operate in their isolated environment within the shared operating system, allowing for efficient and consistent deployment across different environments.
- Portability: The idea of “write once and run anywhere” encapsulates the essence of containers, enabling applications to seamlessly transition across diverse environments, thereby enhancing their portability and flexibility.
- Efficiency: When configured properly, containers utilize the available resources, and also, isolated containers can perform their operations without interfering with other containers, allowing a single host to perform many functions. This makes the containerized application work efficiently and effectively.
- Better security: Because containers are isolated from one another, you can be confident that your applications are running in their self-contained environment. That means that even if the security of one container is compromised, other containers on the same host remain secure.
| Aspect | Containers | Virtual Machines |
| --- | --- | --- |
| Abstraction level | OS-level virtualization | Hardware-level virtualization |
| Resource overhead | Minimal | Higher |
| Isolation | Process-level | Stronger |
| Portability | Highly portable | Less portable |
| Deployment speed | Fast | Slower |
| Footprint | Lightweight | Heavier |
| Startup time | Almost instant | Longer |
| Resource utilization | Efficient | Less efficient |
| Scalability | Easily scalable | Scalable, but with resource overhead |
Docker provides the platform to develop, ship, and run containers. This separates the application from the infrastructure and makes it portable. It packages the application into lightweight containers that can run anywhere without worrying about the underlying infrastructure.
Docker containers have minimal overhead compared to traditional virtual machines, as they share the host OS kernel and only include necessary dependencies. Docker facilitates DevOps practices by enabling developers to build, test, and deploy applications in a consistent and automated manner. You can read more about Docker containers and the steps to install them on your local machine from their official documentation.
Kubernetes, often called K8s, is an open-source orchestration platform that automates containerized applications' deployment, scaling, and management. It abstracts away the underlying infrastructure complexity, allowing developers to focus on building and running their applications efficiently.
It simplifies the deployment and management of containerized applications at scale. Its architecture, components, and core concepts form the foundation for building resilient, scalable, and efficient cloud-native systems. Kubernetes has proven helpful in typical use cases like microservices architectures, hybrid and multi-cloud deployments, and DevOps practices built around continuous deployment.
Let's understand a few components related to Kubernetes:
The K8s environment works in a controller-worker node architecture, and therefore two types of nodes manage the communication. The master node is responsible for controlling the cluster and making decisions for it, whereas the worker node(s) run the application, receive instructions from the master node, and report the status back.
The other components of the Kubernetes cluster are:
- Pods: The basic building block of Kubernetes, representing one or more containers deployed together on the same host
- ReplicaSets: Ensure that a specified number of pod replicas are running at any given time, allowing for scaling and self-healing
- Services: Provide networking and load balancing for pods, enabling communication between different parts of the application
- Volumes: Persist data in Kubernetes, allowing containers to share and store data independently of the container lifecycle
- Namespaces: Virtual clusters within a physical cluster, enabling multiple users, teams, or projects to share a Kubernetes cluster securely
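As a concrete illustration of a few of these pieces, here is a minimal pod manifest. The names and the nginx image are placeholder values for illustration, not part of this tutorial's deployment:

```yaml
# A minimal pod: one container, a label that a Service selector could match,
# and an explicit namespace (all values here are illustrative)
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
  namespace: default
  labels:
    app: demo
spec:
  containers:
    - name: demo-container
      image: nginx:1.25
      ports:
        - containerPort: 80
```

In practice, you rarely create bare pods; a Deployment (shown later in this tutorial) manages pods like this one through a ReplicaSet.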
The diagrams below give a detailed description of the Kubernetes architecture.
Consider a use case where a Spring application running locally is connected to a database deployed on the Atlas cluster. Later, your organization introduces you to the Kubernetes environment and plans to deploy all the applications in the cloud infrastructure.
The question of how you will connect your Kubernetes application to the Atlas cluster running on a different environment will arise. This is when the Atlas Kubernetes Operator will come into the picture.
This operator allows you to manage the Atlas resources in the Kubernetes infrastructure.
For this tutorial, we will deploy the operator on Amazon Elastic Kubernetes Service (EKS).
Step 1: Deploy an EKS cluster using eksctl. Follow the documentation, Getting Started with Amazon EKS - eksctl, to deploy the cluster. This step will take some time to deploy the cluster in AWS.
I created the cluster using the command:
```shell
eksctl create cluster \
  --name MongoDB-Atlas-Kubernetes-Operator \
  --version 1.29 \
  --region ap-south-1 \
  --nodegroup-name linux-nodes \
  --node-type t2.2xlarge \
  --nodes 2
```
Step 2: Once the EKS cluster is deployed, run the command:
```shell
kubectl get ns
```
And you should see an output similar to this.
```
NAME              STATUS   AGE
default           Active   18h
kube-node-lease   Active   18h
kube-public       Active   18h
kube-system       Active   18h
```
Step 3: As mentioned in the quick start tutorial, you need the API key for the project in your Atlas cluster. You can follow the documentation page if you don't already have an API key.
Step 4: All files discussed in the following sub-steps are available in the GitHub repository.
If you are following the above tutorials, the first step is to create the API keys. You need to make sure that while creating the API key for the project, you add the public IPs of the EC2 instances created using the command in Step 1 to the access list.
This is how the access list should look:
Figure showing the addition of the public IP addresses to the API key access list.
The first step mentioned in the Atlas Kubernetes Operator documentation is to apply all the YAML file configurations to all the namespaces created in the Kubernetes environment. Before applying the YAML files, make sure to export the below variables using:
```shell
export VERSION=v2.2.0
export ORG_ID=<your-organisations-id>
export PUBLIC_API_KEY=<your-public-key>
export PRIVATE_API_KEY=<your-private-key>
```
Then, run the command below:
```shell
kubectl apply -f https://raw.githubusercontent.com/mongodb/mongodb-atlas-kubernetes/$VERSION/deploy/all-in-one.yaml
```
To let the Kubernetes Operator create the project in Atlas, you must have certain permissions using the API key at the organizational level in the Atlas UI.
Once the API key is created, create the secret with the credentials using the below command:
```shell
kubectl create secret generic mongodb-atlas-operator-api-key \
  --from-literal="orgId=$ORG_ID" \
  --from-literal="publicApiKey=$PUBLIC_API_KEY" \
  --from-literal="privateApiKey=$PRIVATE_API_KEY" \
  -n mongodb-atlas-system
```
Label the secrets created using the below command:
```shell
kubectl label secret mongodb-atlas-operator-api-key atlas.mongodb.com/type=credentials -n mongodb-atlas-system
```
The next step is to create the project and the deployment, using the project and deployment YAML files respectively.
Please ensure the deployment files mention the zone, instance, and region correctly.
In the initial project.yaml file, the specified content initiates the creation of a project within your Atlas deployment, naming it as indicated. With the provided YAML configuration, a project named "atlas-kubernetes-operator" is established, permitting access from all IP addresses (0.0.0.0/0) within the Access List.
project.yaml:
```yaml
apiVersion: atlas.mongodb.com/v1
kind: AtlasProject
metadata:
  name: project-ako
spec:
  name: atlas-kubernetes-operator
  projectIpAccessList:
    - cidrBlock: "0.0.0.0/0"
      comment: "Allowing access to database from everywhere (only for Demo!)"
```
Please note that 0.0.0.0/0 is not recommended in a production environment. It is used here for test purposes only.
The next file, named deployment.yaml, creates a new deployment in the project created above, with the name specified as cluster0. The YAML also specifies the instance type as M10 in the AP_SOUTH_1 region. Please make sure you use the region closest to you.
deployment.yaml:
```yaml
apiVersion: atlas.mongodb.com/v1
kind: AtlasDeployment
metadata:
  name: my-atlas-cluster
spec:
  projectRef:
    name: project-ako
  deploymentSpec:
    clusterType: REPLICASET
    name: "cluster0"
    replicationSpecs:
      - zoneName: AP-Zone
        regionConfigs:
          - electableSpecs:
              instanceSize: M10
              nodeCount: 3
            providerName: AWS
            regionName: AP_SOUTH_1
            priority: 7
```
The user.yaml file will create the user for your project. Before creating the user YAML file, create the secret with the password of your choice for the project.
```shell
kubectl create secret generic the-user-password --from-literal="password=<password for your user>"
kubectl label secret the-user-password atlas.mongodb.com/type=credentials
```
user.yaml
```yaml
apiVersion: atlas.mongodb.com/v1
kind: AtlasDatabaseUser
metadata:
  name: my-database-user
spec:
  roles:
    - roleName: "readWriteAnyDatabase"
      databaseName: "admin"
  projectRef:
    name: project-ako
  username: theuser
  passwordSecretRef:
    name: the-user-password
```
Once all the YAML files are created, apply them to the default namespace.
```shell
kubectl apply -f project.yaml
kubectl apply -f deployment.yaml
kubectl apply -f user.yaml
```
After this step, you should be able to see the deployment and user created for the project in your Atlas cluster.
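You can also confirm this from the Kubernetes side by querying the custom resources the operator manages. The commands below are a quick sketch, assuming the default namespace and the resource names from the metadata above:

```shell
# List the Atlas custom resources created by the manifests above
kubectl get atlasprojects,atlasdeployments,atlasdatabaseusers

# Inspect the deployment's conditions (look for a Ready status)
kubectl describe atlasdeployment my-atlas-cluster
```

If a resource never becomes ready, its `status.conditions` section usually explains why (for example, an API key without the required organization permissions).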
In this tutorial, we'll be building upon our existing guide found on the Developer Center, MongoDB Advanced Aggregations With Spring Boot and Amazon Corretto.
We'll utilize the same GitHub repository to create a Dockerfile. If you're new to this, we highly recommend following that tutorial first before diving into containerizing the application.
There are certain steps to be followed to containerize the application.
Step 1: Create a JAR file for the application. This executable JAR will be needed to create the Docker image.
To create the JAR, run:

```shell
mvn clean package
```

The JAR will be stored in the target/ folder.
Step 2: Create the Dockerfile for the application. A Dockerfile is a text file that contains the instructions to create the Docker image of the application.
Create a file named Dockerfile. This file describes what will run in this container.
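The repository contains the exact Dockerfile; as a minimal sketch, assuming an Amazon Corretto base image (the JDK used in the aggregation tutorial) and the JAR produced by the Maven build above, it looks like this. Adjust the base image tag and JAR name to match your project:

```dockerfile
# Base image: Amazon Corretto JDK (assumed version; match your project's Java version)
FROM amazoncorretto:17

# Copy the executable JAR built with `mvn clean package`
# (assumes a single JAR in target/; use the explicit file name if there are several)
COPY target/*.jar app.jar

# The Spring Boot application listens on port 8080 by default
EXPOSE 8080

# Run the application
ENTRYPOINT ["java", "-jar", "/app.jar"]
```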
Step 3: Build the Docker image. The `docker build` command will read the specifications from the Dockerfile created above.

```shell
docker build -t mongodb_spring_tutorial:docker_image . --load
```
Step 4: Once the image is built, you will need to push it to a registry. In this example, we are using Docker Hub. You can create your account by following the documentation.
```shell
docker tag mongodb_spring_tutorial:docker_image <your_docker_username>/mongodb_spring_tutorial
docker push <your_docker_username>/mongodb_spring_tutorial
```
Once the Docker image has been pushed into the repo, the last step is to connect your application with the database running on the Atlas Kubernetes Operator.
To make the connection, we need Deployment and Service files. While Deployments manage the lifecycle of pods, ensuring a desired state, Services provide a way for other components to access and communicate with those pods. Together, they form the backbone for managing and deploying applications in Kubernetes.
A Deployment in Kubernetes is a resource object that defines the desired state for your application. It allows you to declaratively manage a set of identical pods. Essentially, it ensures that a specified number of pod replicas are running at any given time.
A deployment file holds the following information. In the app-deployment.yaml file below, these details are mentioned:
- apiVersion: Specifies the Kubernetes API version
- kind: Specifies that it is a type of Kubernetes resource, Deployment
- metadata: Contains metadata about the Deployment, including its name
In the spec section:
The replicas field specifies the number of instances of the application. The image refers to the application image pushed in the step above, and name is the name of the container that runs the image.
In the last section, we specify the SPRING_DATA_MONGODB_URI environment variable, which picks its value from the connectionStringStandardSrv key of the connection secret created by the Atlas Kubernetes Operator.
Create the app-deployment.yaml file:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spring-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: springboot-application
  template:
    metadata:
      labels:
        app: springboot-application
    spec:
      containers:
        - name: spring-app
          image: <your_docker_username>/mongodb_spring_tutorial
          ports:
            - containerPort: 8080
          env:
            - name: SPRING_DATA_MONGODB_URI
              valueFrom:
                secretKeyRef:
                  name: atlas-kubernetes-operator-cluster0-theuser
                  key: connectionStringStandardSrv
            - name: SPRING_DATA_MONGODB_DATABASE
              value: sample_supplies
            - name: LOGGING_LEVEL_ORG_SPRINGFRAMEWORK
              value: INFO
            - name: LOGGING_LEVEL_ORG_SPRINGFRAMEWORK_WEB
              value: DEBUG
```
A Service in Kubernetes is an abstraction that defines a logical set of pods and a policy by which to access them. It enables other components within or outside the Kubernetes cluster to communicate with your application running on pods.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: spring-app-service
spec:
  selector:
    # Must match the pod labels in the Deployment's template
    app: springboot-application
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 8080
  type: LoadBalancer
```
You can then apply those two files to your cluster, and Kubernetes will create all the pods and start the application.
```shell
kubectl apply -f ./*.yaml
```
Now, when you do…
```shell
kubectl get svc
```
…it will give you the output as below with an external IP link created. This link will be used with the default port to access the RESTful calls.
In an ideal scenario, the service file is applied with type: ClusterIP, but since we need to test the application with API calls, we specify the type as LoadBalancer.
You can use the external IP allocated with port 8080 and test the APIs.
Or use the following command to store the external address in the EXTERNAL_IP variable.

```shell
EXTERNAL_IP=$(kubectl get svc | grep spring-app-service | awk '{print $4}')

echo $EXTERNAL_IP
```
It should give you a response like:

```
a4874d92d36fe4d2cab1ccc679b5fca7-1654035108.ap-south-1.elb.amazonaws.com
```
By this time, you should be able to deploy Atlas in the Kubernetes environment and connect with the front-end and back-end applications deployed in the same environment.
In the next section, let us test a few REST APIs using the external IP we created.
Now that your application is deployed, running in Kubernetes, and exposed to the outside world, you can test it with the following curl commands.
- Finding sales in London
- Finding total sales:
- Finding the total quantity of each item
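The exact endpoint paths come from the controller mappings in the aggregation tutorial's repository; as a sketch, the calls take the shape below. The paths shown are illustrative placeholders (not the tutorial's exact mappings), so adjust them to match your @GetMapping annotations:

```shell
# Base address: the ELB hostname captured in EXTERNAL_IP, on port 8080
# NOTE: the paths below are placeholder examples; match them to your controller
curl "http://$EXTERNAL_IP:8080/sales/location/London"
curl "http://$EXTERNAL_IP:8080/sales/total"
curl "http://$EXTERNAL_IP:8080/sales/quantity"
```

Each call should return the JSON result of the corresponding aggregation pipeline, confirming that the pod can reach the Atlas cluster through the operator-managed connection secret.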
As we conclude our exploration of containerizing Spring applications, let us move into the next section on Kubernetes and Docker troubleshooting, where we uncover common challenges and effective solutions for a smoother deployment experience.
In a containerized environment, the path to a successful deployment can sometimes involve multiple factors. To navigate any hiccups along the way, it's wise to turn to certain commands for insights:
- Examine pod status:
```shell
kubectl describe pods <pod-name> -n <namespace>

kubectl get pods -n <namespace>
```
- Check node status:
```shell
kubectl get nodes
```
- Dive into pod logs:
```shell
kubectl logs -f <pod-name> -n <namespace>
```
- Explore service details:
```shell
kubectl describe svc <service-name> -n <namespace>
```
During troubleshooting, encountering errors is not uncommon. Here are a few examples where you might seek additional information:
- Image Not Found: This error occurs when attempting to execute a container with an image that cannot be located. It typically happens if the image hasn't been pulled successfully or isn't available in the specified Docker registry. It's crucial to ensure that the correct image name and tag are used, and if necessary, try pulling the image from the registry locally before running the container to ensure it’s there.
- Permission Denied: Docker containers often operate with restricted privileges, especially for security purposes. If your application requires access to specific resources or directories within the container, it's essential to set appropriate file permissions and configure user/group settings accordingly. Failure to do so can result in permission-denied errors when trying to access these resources.
- Port Conflicts: Multiple containers on the same host machine, each attempting to use the same host port, can lead to port conflicts. This issue arises when the ports specified in the `docker run` command overlap with ports already in use by other containers or services on the host. To avoid conflicts, ensure that the ports assigned to each container are unique and not already occupied by other processes.
- Out of Disk Space: Docker relies on disk space to store images, containers, and log files. Over time, these files can accumulate and consume a significant amount of disk space, potentially leading to disk space exhaustion. To prevent this, it's advisable to periodically clean up unused images and containers using the `docker system prune` command, which removes dangling images, unused containers, and other disk-space-consuming artifacts.
- Container Crashes: Containers may crash for various reasons, including misconfigurations, application errors, or resource constraints. When a container crashes, it's essential to examine its logs using the `kubectl logs -f <pod-name> -n <namespace>` command. These logs often contain valuable error messages and diagnostic information that can help identify the underlying cause of the crash and facilitate troubleshooting and resolution.
- Docker Build Failures: Building Docker images can fail for various reasons, such as syntax errors in the Dockerfile, missing files or dependencies, or network issues during package downloads. It's essential to carefully review the Dockerfile for any syntax errors, ensure that all required files and dependencies are present, and troubleshoot any network connectivity issues that may arise during the build process.
- Networking Problems: Docker containers may rely on network connectivity to communicate with other containers or external services. Networking issues, such as incorrect network configuration, firewall rules blocking required ports, or DNS misconfigurations, can cause connectivity problems. It's crucial to verify that the container is attached to the correct network, review firewall settings to ensure they allow the necessary traffic, and confirm that DNS settings are correctly configured.
- Resource Constraints: Docker containers may require specific CPU and memory resources to function correctly. Failure to allocate adequate resources can result in performance issues or application failures. When running containers, it's essential to specify resource limits using the `--cpus` and `--memory` flags to ensure that containers have sufficient resources to operate efficiently without overloading the host system.
In a Kubernetes pod spec, you can declare the equivalent limits in the resources section of the container definition:

```yaml
containers:
  - name: my_container
    resources:
      limits:
        cpu: "2"
        memory: 4Gi
```
Throughout this tutorial, we've covered essential aspects of modern application deployment, focusing on containerization, Kubernetes orchestration, and MongoDB management with Atlas Kubernetes Operator. Beginning with the fundamentals of containerization and Docker, we proceeded to understand Kubernetes' role in automating application deployment and management. By deploying Atlas Operator on AWS's EKS, we seamlessly integrated MongoDB into our Kubernetes infrastructure. Additionally, we containerized a Spring Boot application, connecting it to Atlas for database management. Lastly, we addressed common Kubernetes troubleshooting scenarios, equipping you with the skills needed to navigate challenges in cloud-native environments. With this knowledge, you're well-prepared to architect and manage sophisticated cloud-native applications effectively.
To learn more, please visit the resource, What is Container Orchestration? and reach out with any specific questions.
As you delve deeper into your exploration and implementation of these concepts within your projects, we encourage you to actively engage with our vibrant MongoDB community forums. Be sure to leverage the wealth of resources available on the MongoDB Developer Center and documentation to enhance your proficiency and finesse your abilities in harnessing the power of MongoDB and its features.