Bring Sharding to Your Spring Boot App with Spring Data MongoDB
You're reaching your limits. Your application is taking on massive amounts of data each day. More features, more interactions, everything you wanted your application to achieve, but you've hit a bottleneck. For an application with high throughput, sharding might be the solution you're looking for.
Sharding is a method for distributing a dataset across multiple machines. Databases with large enough datasets or read/write throughput can challenge the capacity of any one server. There are two methods to address this challenge: vertical scaling and horizontal scaling.
Vertical scaling is straightforward. Query rates are exhausting the CPU? Upgrade the CPU! My working sets have exceeded the size of the system's RAM? More RAM! While upgrading your server is a perfectly valid and oftentimes correct solution, there are limits. Hardware only gets so good, thanks to limits on the technology. And oftentimes, the increasing cost is exponential compared to the return on your investment, so you hit a point of diminishing returns.
Horizontal scaling means dividing your dataset and workload over multiple servers (such as through sharding). No single machine needs to be more powerful than the last, because each one handles only a subset of the overall workload, potentially providing better efficiency than one super-powered server. "Many hands make light work!"
The trade-off is increased infrastructure complexity, but expanding capacity only means adding new servers as needed, and MongoDB Atlas removes much of the complexity of managing sharded clusters.
To follow along, you will need:
- A Spring Boot project (you can set one up using Spring Initializr) with:
  - Spring Data MongoDB
  - Spring Web
Create a sharded cluster in MongoDB Atlas on tier M30 or above. When creating your cluster, open the additional settings, select Sharding, and toggle it on. Then set the number of shards to deploy, which can be between 1 and 100, inclusive. For production applications, use more than one shard, or you will not reap the full benefits of a sharded cluster; the sharding documentation covers this in more detail.

Wait for your cluster to be created. This will take a few minutes, so get a cup of coffee or scroll your TikToks. Once the cluster is set up, choose to load the sample data, which this tutorial uses.
Spring Data MongoDB does not automatically set up sharding for collections. You need to perform these operations manually. We can do this using mongosh.

Connect to your MongoDB cluster as an admin with your connection string via mongosh:

    mongosh "mongodb+srv://<username>:<password>@<cluster-url>/admin"
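Shard the users collection by email:

    sh.shardCollection("sample_mflix.users", { email: 1 })

Because the collection already contains the sample data, it needs an index that starts with the shard key before it can be sharded. On MongoDB versions earlier than 6.0, you must also run sh.enableSharding("sample_mflix") first.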
You can verify that sharding is enabled and the collection is sharded by running sh.status().

In your Spring Boot project, define an entity class and use the @Sharded annotation to specify the shard key fields. Spring Data MongoDB uses the @Sharded annotation to identify entities stored in sharded collections.

    package com.mongodb.sharded.model;

    import org.springframework.data.annotation.Id;
    import org.springframework.data.mongodb.core.mapping.Document;
    import org.springframework.data.mongodb.core.mapping.Field;
    import org.springframework.data.mongodb.core.mapping.Sharded;

    // Maps this class to the users collection
    @Document(collection = "users")
    // Declares email as the shard key, matching the sh.shardCollection call
    @Sharded(shardKey = { "email" })
    public class User {

        @Id
        private String id;
        private String name;
        @Field("email")
        private String email;
        private String password;

        // Getters and Setters
    }
Adding the @Sharded annotation to an entity lets Spring Data MongoDB apply optimizations for a sharded environment. Mainly, it ensures that the shard key is included in the filter of replaceOne operations during upserts. This can require an additional round trip to the server to look up the current shard key value.

Define a repository interface for your User entity by extending MongoRepository.

    package com.mongodb.sharded.repository;

    import org.springframework.data.mongodb.repository.MongoRepository;

    import com.mongodb.sharded.model.User;

    // Provides CRUD operations for User documents identified by a String id
    public interface UserRepository extends MongoRepository<User, String> {
    }
Create a service class to handle business logic and interact with the repository.
    import java.util.List;

    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Service;

    import com.mongodb.sharded.model.User;
    import com.mongodb.sharded.repository.UserRepository;

    // Registers this class as a Spring-managed service bean
    @Service
    public class UserService {

        // Injected by Spring at startup
        @Autowired
        private UserRepository userRepository;

        public List<User> getAllUsers() {
            return userRepository.findAll();
        }

        public User saveUser(User user) {
            return userRepository.save(user);
        }
    }
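When save is called for an entity that already has an id, Spring Data MongoDB typically issues a replace with upsert, and for @Sharded entities the filter carries the shard key alongside _id. The plain-Java sketch below only imitates the shape of that filter, assuming the shard key value has not changed. It is not Spring Data's implementation, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: mimics the shape of a replaceOne filter for a sharded collection.
class ShardedUpsertFilterSketch {

    // Builds a filter from the document's _id plus every shard key field.
    static Map<String, Object> replaceFilter(Map<String, Object> document, List<String> shardKeyFields) {
        Map<String, Object> filter = new LinkedHashMap<>();
        filter.put("_id", document.get("_id"));
        for (String field : shardKeyFields) {
            // Including the shard key lets the router target a single shard.
            filter.put(field, document.get(field));
        }
        return filter;
    }

    public static void main(String[] args) {
        Map<String, Object> user = new LinkedHashMap<>();
        user.put("_id", "42");
        user.put("name", "Ada");
        user.put("email", "ada@example.com");

        // Only _id and the shard key end up in the filter; name is left out.
        System.out.println(replaceFilter(user, List.of("email")));
    }
}
```

Without the email in the filter, the router could not tell from the filter alone which shard owns the document.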
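Create a controller class to expose REST endpoints for the User entity.

    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.web.bind.annotation.*;

    import java.util.List;

    import com.mongodb.sharded.model.User;
    import com.mongodb.sharded.service.UserService; // assumed package; adjust to your project layout

    @RestController
    @RequestMapping("/users") // the path is an assumption; use whatever fits your API
    public class UserController {

        @Autowired
        private UserService userService;

        // GET returns every user in the collection
        @GetMapping
        public List<User> getAllUsers() {
            return userService.getAllUsers();
        }

        // POST deserializes the JSON request body into a User and saves it
        @PostMapping
        public User createUser(@RequestBody User user) {
            return userService.saveUser(user);
        }
    }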
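Once the application is running, the endpoints can be exercised from any HTTP client. The sketch below uses the JDK's java.net.http API (Java 11 or later). The base URL http://localhost:8080, the /users path, and the sample JSON are assumptions, since the mapping path depends on how the controller is annotated in your project.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Illustrative client for the user endpoints; the URL and path are assumptions.
class UserApiClientSketch {

    // Builds a GET request that lists all users.
    static HttpRequest listUsersRequest(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/users"))
                .GET()
                .build();
    }

    // Builds a POST request that creates a user from a JSON body.
    static HttpRequest createUserRequest(String baseUrl, String userJson) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/users"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(userJson))
                .build();
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        String baseUrl = "http://localhost:8080"; // Spring Boot's default port
        String json = "{\"name\":\"Ada\",\"email\":\"ada@example.com\",\"password\":\"change-me\"}";

        HttpResponse<String> created = client.send(
                createUserRequest(baseUrl, json), HttpResponse.BodyHandlers.ofString());
        System.out.println(created.statusCode() + " " + created.body());

        HttpResponse<String> all = client.send(
                listUsersRequest(baseUrl), HttpResponse.BodyHandlers.ofString());
        System.out.println(all.statusCode() + " " + all.body());
    }
}
```

The main method needs the Spring Boot application to be listening on the assumed address; the two request-building methods can be reused on their own.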
Ensure your Spring Boot application is configured to connect to MongoDB Atlas. In your application.properties or application.yml, provide the MongoDB connection URI. The database named in the URI should be the one that holds the sharded collection, which is sample_mflix in this tutorial.

    spring.data.mongodb.uri=mongodb+srv://<username>:<password>@<cluster-url>/sample_mflix?retryWrites=true&w=majority
Replace <username>, <password>, and <cluster-url> with your MongoDB Atlas credentials and connection string.

Choose a sharding key that ensures an even distribution of data. The choice of a sharding key is critical to the performance and scalability of your sharded cluster. An ideal sharding key should:
- Have high cardinality: This means the key should have many unique values to ensure data is evenly distributed across shards. It is not necessary that a sharding key be entirely unique.
- Provide even distribution: The key should distribute documents evenly across all shards to avoid hotspots where one shard handles more data or requests than others.
- Support common queries: Choose a key that aligns with your most common query patterns to minimize query scatter and optimize performance.
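To see why cardinality matters, here is a rough sketch in plain Java that spreads shard key values over a fixed number of shards and counts how many shards actually receive data. It is a simplification: MongoDB splits a ranged shard key into chunks rather than hashing values with String.hashCode, and the class name, the country field, and the generated email addresses are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a toy model of key distribution, not MongoDB's chunk balancer.
class ShardKeyDistributionSketch {

    // Assigns each key value to a shard by hashing it (a stand-in for real chunk placement).
    static int[] distribute(List<String> keyValues, int shardCount) {
        int[] documentsPerShard = new int[shardCount];
        for (String value : keyValues) {
            int shard = Math.floorMod(value.hashCode(), shardCount);
            documentsPerShard[shard]++;
        }
        return documentsPerShard;
    }

    // Counts the shards that received at least one document.
    static int shardsInUse(int[] documentsPerShard) {
        int inUse = 0;
        for (int count : documentsPerShard) {
            if (count > 0) {
                inUse++;
            }
        }
        return inUse;
    }

    public static void main(String[] args) {
        List<String> emails = new ArrayList<>();    // high cardinality: one value per user
        List<String> countries = new ArrayList<>(); // low cardinality: two values in total
        for (int i = 0; i < 1000; i++) {
            emails.add("user" + i + "@example.com");
            countries.add(i % 2 == 0 ? "US" : "DE");
        }
        int shardCount = 4;
        System.out.println("email shards in use:   " + shardsInUse(distribute(emails, shardCount)));
        System.out.println("country shards in use: " + shardsInUse(distribute(countries, shardCount)));
    }
}
```

With only two distinct country values, at most two of the four shards can ever hold data, no matter how many documents exist, while the unique email values have room to spread across all of them.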
For the users collection in the sample_mflix database, using the email field as the sharding key can be a good choice if:
- Emails are unique and well-distributed.
- Queries frequently filter or sort by email.
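The link between the shard key and query routing can also be sketched in plain Java. The example below models a ranged shard key as a sorted map from each chunk's lower bound to a shard name, then routes a query either to one shard (when it filters on an exact email) or to all shards (when it does not). The chunk boundaries and shard names are invented, and real routing is done by mongos using cluster metadata, so treat this as a mental model only.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.TreeMap;

// Illustrative only: a toy model of how a router picks shards for a ranged shard key.
class QueryRoutingSketch {

    // Maps the inclusive lower bound of each chunk to the shard that owns it.
    private final TreeMap<String, String> chunkLowerBoundToShard = new TreeMap<>();

    QueryRoutingSketch() {
        // Hypothetical chunk layout for { email: 1 }; "" stands in for the minimum key.
        chunkLowerBoundToShard.put("", "shard0");
        chunkLowerBoundToShard.put("h", "shard1");
        chunkLowerBoundToShard.put("p", "shard2");
    }

    // Targeted query: an equality match on the shard key maps to exactly one chunk.
    Set<String> shardsForEmail(String email) {
        String shard = chunkLowerBoundToShard.floorEntry(email).getValue();
        return Set.of(shard);
    }

    // Scatter-gather query: without the shard key, every shard has to be asked.
    Set<String> shardsWithoutShardKey() {
        return new LinkedHashSet<>(chunkLowerBoundToShard.values());
    }

    public static void main(String[] args) {
        QueryRoutingSketch router = new QueryRoutingSketch();
        System.out.println("filter on email:  " + router.shardsForEmail("ned_stark@example.com"));
        System.out.println("filter on name:   " + router.shardsWithoutShardKey());
    }
}
```

This is why a key that appears in your most common filters pays off: those queries touch one shard instead of all of them.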
Sharding with MongoDB Atlas and Spring Boot allows your application to scale efficiently and handle high loads with improved performance and availability. By following this guide, you can set up a sharded collection, map it with Spring Data MongoDB, and expose it through a REST API, so your application is ready to grow and succeed in a demanding environment.
If you want to learn more about MongoDB and Java, read more on Developer Center, where you can learn how to deploy your sharded Spring application as a microservice with Azure and Kubernetes. Or, if you have any questions, head over to our MongoDB Community forums.