Building AI with MongoDB: How Metaphor Data Uses Atlas Vector Search to Change the World Through Data
Since announcing MongoDB Atlas Vector Search in preview back in June, we’ve already seen rapid adoption from developers building a wide range of AI-enabled apps. Today we're highlighting another customer who has increased efficiency while reducing architectural complexity by adopting Atlas Vector Search.
Metaphor is a search and discovery tool built for data scientists, data engineers, and AI practitioners. The company’s mission is to empower individuals and companies of all types to change the world through data. Metaphor is the next evolution of the Data Catalog with fully automated support for Data Governance, Data Literacy, and Data Enablement using an intuitive user interface.
We recently caught up with Mars Lan, Co-founder and CTO, to learn more about the company’s journey with MongoDB and their adoption of Atlas Vector Search.
Check out our AI resource page to learn more about building AI-powered apps with MongoDB.
Tell us a little bit about your company and what you and the team are building
We’re an early-stage startup with a mission to empower individuals and organizations to change the world through data. We refer to ourselves as the social platform for data and have a range of products that support both data teams and data consumers. Our main product is a SaaS Data Catalog that enables data governance and enablement across the organization. We’re a small team of around 15 with a keen focus on product and engineering. The company was founded about 2.5 years ago.
What role does search play at the company and where did your search story begin?
Well, I will start by saying that we almost ended up having a very different story to tell you than what actually transpired! We started our journey using DocumentDB and Elasticsearch on AWS for our database and search needs. After some time we ran into scalability issues that caused us to evaluate (and eventually move to) MongoDB Atlas for our database needs. When we saw that MongoDB offered Atlas Search, which is based on the same underlying Lucene technology, we got very excited and began migrating our search workloads over to Atlas, which eventually laid the groundwork for adopting Atlas Vector Search later on.
So starting with those initial search needs, what got you excited about Atlas Search with MongoDB? What were your use cases?
We had started to face a significant maintenance burden keeping our database and Elasticsearch in sync. We previously had to build data pipelines so that if something changed in the database, the change would also be reflected in search. Once we migrated everything to MongoDB Atlas Search, we no longer had to manage those pipelines. This resulted in lower latency and a lower likelihood of bugs, which excited our team.
The other component was the scalability disconnect of having two different systems. We realized that if we ever needed more storage or compute, we could just spin up a larger MongoDB cluster and get that extra scalability right away with the Atlas platform. Of course, having one less thing to worry about is also a huge benefit: Elasticsearch is not the easiest thing to manage, so having it all in MongoDB was another big plus for us.
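To make that concrete, here is a minimal sketch of what a keyword query can look like once catalog metadata lives in an Atlas collection with an Atlas Search index on it, using PyMongo. The connection string, database, collection, index, and field names are illustrative placeholders, not Metaphor’s actual schema.

```python
from pymongo import MongoClient

# Placeholder connection string and names; not Metaphor's actual setup.
client = MongoClient("mongodb+srv://<user>:<password>@cluster0.example.mongodb.net")
assets = client["catalog"]["assets"]

# Keyword search runs as an aggregation stage against the same collection
# that stores the documents, so there is no separate search cluster to sync.
results = assets.aggregate([
    {
        "$search": {
            "index": "default",
            "text": {
                "query": "quarterly revenue",
                "path": ["name", "description"],
            },
        }
    },
    {"$limit": 10},
    {"$project": {"name": 1, "description": 1, "_id": 0}},
])

for doc in results:
    print(doc)
```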
How did you initially learn about Atlas Vector Search and what piqued your interest?
We started experimenting with Pinecone a while back when the AI wave really started to explode, just to try out the tool, after one of our interns began playing around with it. It turned out not to be cost-effective to spin up a Pinecone instance for each customer, and it was quite difficult to scale due to API throttling.
After some time, we started looking at other vendors for vector search. However, once we learned that MongoDB had Vector Search, we got excited at the prospect of using our existing tech stack for this additional functionality. It quickly became a no-brainer: since we knew we were going to move everything to Atlas, it made sense to consolidate everything there, so we ended up migrating to Atlas Vector Search for all of our semantic search needs. This means one query API, one set of dependencies, and built-in sync, all in a single platform.
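That “one query API” point means a semantic lookup is just another aggregation stage against the same collection. A minimal sketch, again with placeholder names and assuming 1536-dimensional embeddings, might look like this:

```python
from pymongo import MongoClient

# Placeholder connection string, collection, index name, and field names.
assets = MongoClient(
    "mongodb+srv://<user>:<password>@cluster0.example.mongodb.net"
)["catalog"]["assets"]

# The embedding of the user's query; 1536 dimensions is an assumption.
query_embedding = [0.01] * 1536

semantic_results = assets.aggregate([
    {
        "$vectorSearch": {
            "index": "vector_index",        # hypothetical Atlas Vector Search index
            "path": "embedding",            # field holding the stored vectors
            "queryVector": query_embedding,
            "numCandidates": 100,
            "limit": 5,
        }
    },
    {
        "$project": {
            "name": 1,
            "description": 1,
            "score": {"$meta": "vectorSearchScore"},
            "_id": 0,
        }
    },
])

for doc in semantic_results:
    print(doc)
```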
What were the key factors that made you pull the trigger and adopt Atlas Vector Search? What were the problems you were trying to solve?
So one key unlock for us was the semantic search side of things, where someone can ask a natural language question and get a natural language answer. For us, this is a far preferable user experience compared to Google-style keyword searches.
From day one we always wanted to best serve our core customer, the engineer, but another huge constituency for us is the business, or non-technical, audience. These folks prefer a tool that is more intuitive to use.
To best serve them we have first-class integrations with Slack and Microsoft Teams, so they can ask a question without having to go to another place or switch tools to get an answer. We didn’t always have the capability to handle a natural language question and response, but with Atlas Vector Search this becomes possible. Using Vector Search, we can now ask the Slack bot questions like “where can I find this type of data” or “where is this one table on revenue from last quarter, and who is using it” and get a natural language response back.
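For illustration, that Slack/Teams flow can be sketched as: embed the question, retrieve the closest catalog entries with Atlas Vector Search, and have an LLM phrase the answer. The embedding and chat models below, like the schema and index names, are assumptions made for the sketch; the interview doesn’t specify which providers Metaphor actually uses.

```python
from openai import OpenAI
from pymongo import MongoClient

# Hypothetical providers, models, and schema; for illustration only.
openai_client = OpenAI()
assets = MongoClient(
    "mongodb+srv://<user>:<password>@cluster0.example.mongodb.net"
)["catalog"]["assets"]


def answer_question(question: str) -> str:
    # 1. Embed the natural-language question coming from the Slack/Teams bot.
    embedding = openai_client.embeddings.create(
        model="text-embedding-3-small", input=question
    ).data[0].embedding

    # 2. Retrieve the most relevant catalog entries with Atlas Vector Search.
    docs = list(assets.aggregate([
        {"$vectorSearch": {
            "index": "vector_index",
            "path": "embedding",
            "queryVector": embedding,
            "numCandidates": 200,
            "limit": 5,
        }},
        {"$project": {"name": 1, "description": 1, "_id": 0}},
    ]))

    # 3. Let an LLM compose a natural-language answer grounded in those entries.
    context = "\n".join(f"{d['name']}: {d.get('description', '')}" for d in docs)
    reply = openai_client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[
            {"role": "system", "content": "Answer using only the data catalog context provided."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return reply.choices[0].message.content
```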
One of the key considerations for us when looking at vendors was cost, but not just cost in terms of what shows up on an invoice. I would rather scale one system and get benefits on both sides (search and vector search). We saw that having to scale two systems independently was just not going to be efficient in the long run.
Can you talk about some of the initial benefits you’ve seen so far both on the Atlas Search side as well as with Vector Search specifically? How do you think about and quantify these benefits?
Well, one obvious thing that stands out on the search side is increased speed and being able to move quickly. MongoDB in general has a great developer experience. Our data model tends to be highly complex documents, and all of the metadata tends to be highly structured and complex, so the MongoDB document model fits us very well.
In terms of productivity, it’s never an exact science. I will say that with the adoption of Atlas we were able to keep our engineering team size relatively constant while serving many more customers and scaling our development efforts faster, so we probably saw a 2-3x increase in productivity.
One last item of note: because we deal with so much customer data, we follow rigorous security practices to ensure the highest level of security possible. We chose to have dedicated MongoDB clusters per customer, so every customer’s data is totally isolated. When we were on Pinecone, this meant spinning up a new Pinecone pod for each customer, which would have been both really hard to do and not financially viable. Because we are centralizing this all under MongoDB, it becomes so much easier: you can dynamically scale your cluster sizes up and down depending on the needs of small vs. large customers. There’s none of the waste you’d get with multiple discrete systems.
Getting started
A big thank you to Mars and the entire Metaphor Data team for sharing more about their story and use of Atlas Vector Search.
Want to learn more? Head over to our quick-start guide to get started with Atlas Vector Search today. And if you’re a startup building with AI, please check out our MongoDB AI Innovators program for Atlas credits, one-on-one technical advice, access to our partner network, and more!