Bring Sharding to Your Spring Boot App with Spring Data MongoDB

Tim Kelly • 4 min read • Published Jul 08, 2024 • Updated Jul 08, 2024
You're reaching your limits. Your application is taking on massive amounts of data each day. More features, more interactions, everything you wanted your application to achieve, but you've hit a bottleneck. For a high-throughput application, sharding might be the solution you're looking for.
Sharding is a method for distributing a dataset across multiple machines. Databases with large enough datasets or read/write throughput can challenge the capacity of any one server. There are two methods to address this challenge: vertical scaling and horizontal scaling.
Vertical scaling is straightforward. Query rates are exhausting the CPU? Upgrade the CPU! Your working set has outgrown the system's RAM? More RAM! While upgrading your server is a perfectly valid and oftentimes correct solution, there are limits. Hardware only gets so good, and the cost of each upgrade tends to rise much faster than the performance it buys, so you hit a point of diminishing returns.
Horizontal scaling means dividing your dataset and workload over multiple servers (such as through sharding). Each machine may only be as powerful as the last but only handles a subset of the overall workload, potentially providing better efficiency than one super-powered server. "Many hands make light work!"
The trade-off is increased infrastructure complexity, but expanding capacity only means adding new servers as needed, and MongoDB Atlas removes a lot of the complexity of managing sharded clusters.

Prerequisites

Set up MongoDB Atlas sharding

Create a sharded cluster, M30 or above. When creating your cluster, open the additional settings, select Sharding, and toggle it on. Then set the number of shards to deploy with the cluster. This can be between 1 and 100 shards, inclusive. For production applications, use more than one shard, or you will not be able to reap the full benefits of sharding, which you can read more about in the sharding documentation. Wait for your cluster to be created. This will take a few minutes, so get a cup of coffee or scroll your TikToks. Once it is set up, load the sample dataset, which this tutorial uses.
Spring Data MongoDB does not automatically set up sharding for collections. You need to perform these operations manually. We can do this using mongosh.
Connect to your MongoDB cluster as an admin with your connection string via mongosh:
mongosh "mongodb+srv://<username>:<password>@<cluster-url>/admin"
Shard the users collection by email:
sh.shardCollection("sample_mflix.users", { email: 1 })
You can verify that the sharding is enabled and the collection is sharded by running sh.status().

Define your entity with sharding in Spring Data

In your Spring Boot project, define an entity class and use the @Sharded annotation to specify the shard key fields.
Spring Data MongoDB uses the @Sharded annotation to identify entities stored in sharded collections.
import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.core.mapping.Document;
import org.springframework.data.mongodb.core.mapping.Field;
import org.springframework.data.mongodb.core.mapping.Sharded;

@Document("users")
@Sharded(shardKey = { "email" })
public class User {

    @Id
    private String id;
    private String name;
    @Field("email")
    private String email;
    private String password;

    // Getters and Setters
}
Adding the @Sharded annotation to an entity helps Spring Data MongoDB optimize operations in a sharded environment. It ensures that the shard key is included in the filter of replaceOne operations during upserts. If the shard key value is not already known, this may require an extra round trip to the server to look up the current value.
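To make that concrete, here is a minimal, plain-Java sketch of the idea. It is not Spring Data MongoDB's actual code, and the class and method names (ShardedReplaceFilterSketch, replaceFilter) are hypothetical. It only illustrates how a filter that starts out as a match on _id grows to include the shard key fields declared on the entity.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Conceptual illustration only: NOT Spring Data MongoDB's implementation.
final class ShardedReplaceFilterSketch {

    // Builds the filter for a replace-style upsert: always match on _id,
    // then add every declared shard key field taken from the document.
    static Map<String, Object> replaceFilter(Map<String, Object> document,
                                             List<String> shardKeyFields) {
        Map<String, Object> filter = new LinkedHashMap<>();
        filter.put("_id", document.get("_id"));
        for (String field : shardKeyFields) {
            filter.put(field, document.get(field));
        }
        return filter;
    }

    public static void main(String[] args) {
        Map<String, Object> user = new LinkedHashMap<>();
        user.put("_id", "u-1");
        user.put("name", "Ada");
        user.put("email", "ada@example.com");

        // Entity without @Sharded: the filter carries only the _id.
        System.out.println(replaceFilter(user, List.of()));
        // prints {_id=u-1}

        // Entity with @Sharded(shardKey = { "email" }): the shard key is added.
        System.out.println(replaceFilter(user, List.of("email")));
        // prints {_id=u-1, email=ada@example.com}
    }
}
```

With the shard key in the filter, the cluster can send the write to the one shard that owns that key value.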

Create a MongoDB repository

Define a repository interface for your User entity by extending MongoRepository.
import org.springframework.data.mongodb.repository.MongoRepository;

import com.mongodb.sharded.model.User;

public interface UserRepository extends MongoRepository<User, String> {
}

Create a service layer

Create a service class to handle business logic and interact with the repository.
import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

import com.mongodb.sharded.model.User;
import com.mongodb.sharded.repository.UserRepository;

@Service
public class UserService {

    @Autowired
    private UserRepository userRepository;

    public List<User> getAllUsers() {
        return userRepository.findAll();
    }

    public User saveUser(User user) {
        return userRepository.save(user);
    }
}

Create a controller layer

Create a controller class to expose REST endpoints for the User entity.
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.*;

import java.util.List;

import com.mongodb.sharded.model.User;
// Assumed package for UserService; adjust it to wherever your service class lives.
import com.mongodb.sharded.service.UserService;

@RestController
@RequestMapping("/users")
public class UserController {

    @Autowired
    private UserService userService;

    // GET /users returns every user in the collection.
    @GetMapping
    public List<User> getAllUsers() {
        return userService.getAllUsers();
    }

    // POST /users saves the user sent in the request body.
    @PostMapping
    public User createUser(@RequestBody User user) {
        return userService.saveUser(user);
    }
}

Configure Spring Boot application

Ensure your Spring Boot application is configured to connect to MongoDB Atlas. In your application.properties or application.yml, provide the MongoDB connection URI.
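spring.data.mongodb.uri=mongodb+srv://<username>:<password>@<cluster-url>/sample_mflix?retryWrites=true&w=majority
Replace <username>, <password>, and <cluster-url> with your MongoDB Atlas credentials and connection string. The database name in the URI is sample_mflix so that the User entity maps to the users collection you sharded earlier.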

Best practices

Choose a shard key that ensures an even distribution of data. The choice of shard key is critical to the performance and scalability of your sharded cluster. An ideal shard key should:
  • Have high cardinality: This means the key should have many unique values to ensure data is evenly distributed across shards. It is not necessary that a sharding key be entirely unique.
  • Provide even distribution: The key should distribute documents evenly across all shards to avoid hotspots where one shard handles more data or requests than others.
  • Support common queries: Choose a key that aligns with your most common query patterns to minimize query scatter and optimize performance.
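The first two points can be explored with a small, self-contained simulation. This is only a sketch under loud assumptions: it places documents by hashing the key value modulo the shard count, which is a toy rule rather than how MongoDB assigns ranges of key values to shards, and the plan field with its two values is hypothetical. It compares a high-cardinality key (email) with a low-cardinality one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ShardKeyDistributionSketch {

    // Toy placement rule (assumption): hash the key value, modulo the shard count.
    // MongoDB itself assigns ranges of shard key values to shards instead.
    static int[] countPerShard(List<String> keyValues, int shardCount) {
        int[] counts = new int[shardCount];
        for (String value : keyValues) {
            counts[Math.floorMod(value.hashCode(), shardCount)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int shardCount = 4;
        String[] planNames = {"free", "pro"}; // hypothetical low-cardinality field

        List<String> emails = new ArrayList<>();
        List<String> plans = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            emails.add("user" + i + "@example.com"); // one distinct value per document
            plans.add(planNames[i % planNames.length]); // only two distinct values
        }

        // Documents per shard for each candidate key.
        System.out.println("email key: " + Arrays.toString(countPerShard(emails, shardCount)));
        System.out.println("plan key:  " + Arrays.toString(countPerShard(plans, shardCount)));
    }
}
```

However many shards you add, the plan key can never occupy more than two of them, because it only has two distinct values. The email key has enough distinct values to reach every shard.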
For the users collection in the sample_mflix database, using the email field as the shard key can be a good choice if:
  • Emails are unique and well-distributed.
  • Queries frequently filter or sort by email.
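The second condition matters because a query that includes the shard key can be sent to a single shard, while one that omits it has to be broadcast to all of them. The toy router below sketches that difference. The class name, shard names, and split points ("h" and "p") are hypothetical, and real routing in a MongoDB cluster is considerably more involved.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class QueryRoutingSketch {

    // Lower bound of each email range -> shard that owns it (hypothetical split points).
    private final TreeMap<String, String> chunkOwners = new TreeMap<>();

    QueryRoutingSketch() {
        chunkOwners.put("", "shard0");  // emails that sort before "h"
        chunkOwners.put("h", "shard1"); // from "h" up to, but not including, "p"
        chunkOwners.put("p", "shard2"); // "p" and above
    }

    // filter maps a field name to the exact value being matched.
    // Returns the shards the query has to visit.
    List<String> shardsFor(Map<String, String> filter) {
        String email = filter.get("email");
        if (email == null) {
            // No shard key in the filter: every shard must be asked (scatter-gather).
            return new ArrayList<>(new LinkedHashSet<>(chunkOwners.values()));
        }
        // Shard key present: only the shard owning that range is asked (targeted).
        return List.of(chunkOwners.floorEntry(email).getValue());
    }

    public static void main(String[] args) {
        QueryRoutingSketch router = new QueryRoutingSketch();

        System.out.println(router.shardsFor(Map.of("email", "ned@example.com")));
        // prints [shard1]

        System.out.println(router.shardsFor(Map.of("name", "Ned Stark")));
        // prints [shard0, shard1, shard2]
    }
}
```

If most of your queries look like the second one, email is probably the wrong shard key for your workload, however evenly it distributes the data.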

Conclusion

Sharding with MongoDB Atlas and Spring Boot allows your application to scale efficiently and handle high loads with improved performance and availability. By following this guide, you can set up a sharded cluster, build a Spring Boot service on top of it, and ensure your application is ready to grow and succeed in a demanding environment.
If you want to learn more about MongoDB and Java, read more on Developer Center, where you can learn how to deploy your sharded Spring application as a microservice with Azure and Kubernetes. Or if you have any questions, head over to our MongoDB Community forums.