How to implement filters in MongoDB Atlas Vector Search (array)

Hi,

I have these documents in my collection:

{
    _id: ObjectId(),
    text: "Bye bye",
    embedding: Array,
    id: 1,
    tags: ["employee", "manager"]
}

I have this index:

{
    "name": "embeddingIndexTags",
    "definition": {
        "mappings": {
            "dynamic": True,
            "fields": {
                "embedding": {
                    "dimensions": 768,
                    "similarity": "cosine",
                    "type": "knnVector"
                },
                "tags": [{"type": "token"},
                         {"type": "string"}]
            }
        }
    }
}

I want to filter the documents by their tags, using LangChain:

retriever = mongo_vectorstore.as_retriever(
    search_kwargs={
        "k": 4,
        "pre_filter": {"tags": {"$in": ["employee"]}},
    }
)

For some reason, the filter is not applied correctly.
I still get results even if I filter on a tag that doesn't exist in any document, such as ["admin"].

Hi @Simon_Prudhomme,

I presume you’re expecting 0 results to be returned when using the tag ["admin"] (assuming it doesn’t exist within any document) but correct me if I’m wrong here.

There is a comment on a similar post here: preFilter configuration and indexing for Langchain retriever - #4 by Prakul_Agarwal

The syntax there appears a bit different from what you've posted. Could you provide the versions of MongoDB and LangChain you are using?

You can also check via mongosh whether the filter equivalent of your query works as expected. This might help narrow down whether the issue lies in the LangChain syntax or in the index and query themselves.
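As a rough illustration of such a check, here is a minimal Python sketch that builds the aggregation pipeline a filtered vector query could correspond to. It assumes the `$vectorSearch` stage and an index in which `tags` is indexed for filtering; with an older `knnVector` index the query may go through `$search` instead, so treat this as an assumption rather than what LangChain actually sends. The helper name `build_vector_search_pipeline` and the `numCandidates` multiplier are hypothetical.

```python
def build_vector_search_pipeline(query_vector, tags, k=4,
                                 index_name="embeddingIndexTags"):
    """Build a $vectorSearch pipeline that pre-filters on the tags field.

    Hypothetical helper: the index name and field paths come from the
    post above, everything else is an illustrative assumption.
    """
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": "embedding",
                "queryVector": list(query_vector),
                # Assumed heuristic: consider 10x more candidates than results.
                "numCandidates": k * 10,
                "limit": k,
                # Pre-filter: keep only documents whose tags contain any value.
                "filter": {"tags": {"$in": list(tags)}},
            }
        },
        {
            "$project": {
                "_id": 0,
                "text": 1,
                "tags": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


# Placeholder query vector; a real one would come from the embedding model.
pipeline = build_vector_search_pipeline([0.0] * 768, ["admin"])
```

The resulting list can be passed to `collection.aggregate(pipeline)` in PyMongo, or pasted into mongosh after converting it to JavaScript syntax. If the filter is honoured and no document carries the tag "admin", the query should return no documents.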

Regards,
Jason

                    "tags": [{"type": "token",},
                             {"type": "string",}]

@Simon_Prudhomme The index definition for tags specifies both token and string types, which could be the problem. The correct definition should be something along these lines:

{
    "name": "embeddingIndexTags",
    "definition": {
        "mappings": {
            "dynamic": True,
            "fields": {
                "embedding": {
                    "dimensions": 768,
                    "similarity": "cosine",
                    "type": "knnVector"
                },
                "tags": {"type": "token"}
            }
        }
    }
}

Also note that new index definitions were rolled out recently: https://mongodb.prakticum-team.ru/docs/atlas/atlas-vector-search/vector-search-type/#std-label-avs-types-vector-search
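For comparison, a definition in the newer style might look roughly like the Python dict below. This is a sketch based on my reading of the linked page, not a verified drop-in replacement: it assumes an index of type `vectorSearch` with one `vector` field and one `filter` field, and it reuses the index name, dimensions, and similarity from the post above.

```python
import json

# Sketch of a vectorSearch-type index definition (assumed shape).
vector_index = {
    "name": "embeddingIndexTags",
    "type": "vectorSearch",
    "definition": {
        "fields": [
            {
                # The embedding field used for similarity search.
                "type": "vector",
                "path": "embedding",
                "numDimensions": 768,
                "similarity": "cosine",
            },
            {
                # Fields used in pre-filters are declared with type "filter".
                "type": "filter",
                "path": "tags",
            },
        ]
    },
}

# Print as JSON so it can be pasted into the Atlas UI JSON editor.
print(json.dumps(vector_index["definition"], indent=2))
```

With a definition like this, `tags` is declared once as a filter field instead of being mapped as `token` or `string`.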
