Hey @Dustin_Dier, apologies for the challenges and delay here!

The issue that was resolved on Monday affected M0-M5 clusters. If you’re having trouble with an M30 on 7.0.2, this must be a separate issue.

Please reach out to me directly at benjamin.flast@mongodb.com or schedule some time with me on Calendly (Benjamin Flast) so we can look into what’s going on. I’m sure we can get this sorted out quickly!

(If anyone else is having separate or different challenges, please feel free to put some time on my calendar!)

Best,
Ben

I am trying to query multiple vector embeddings with $unionWith.

Code:

const documents = await collection
          .aggregate([
            {
              $unionWith: {
                coll: "SearchLeads",
                pipeline: [
                  {
                    $vectorSearch: {
                      queryVector: embedding,
                      path: "industry_embedding",
                      numCandidates: 10000,
                      limit: 30,
                      index: "default",
                    },
                  },
                  {
                    $vectorSearch: {
                      queryVector: embedding,
                      path: "city_embedding",
                      numCandidates: 10000,
                      limit: 30,
                      index: "default",
                    },
                  },
                ],
              },
            },
          ])
          .toArray();

        return documents;

When I run the search, I get this error: $vectorSearch is not allowed within a $unionWith’s sub-pipeline reference.

Cluster version: 6.0.13
Tier: M10 (General)

Hi Pratham,

I’m sorry you ran into this issue. We don’t currently support running $vectorSearch within a subpipeline, as noted in the limitations section of our documentation. I believe the previous comment of mine that you linked to was mistaken, and I have posted a clarification on that thread.

Since this is being actively worked on, I will provide an update directly on this thread once this is supported. In the meantime, you should be able to achieve similar behavior by using $search with the knnBeta operator in a subpipeline instead (note that its syntax differs from the $vectorSearch stage, but the functionality is very similar under the hood).
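
To illustrate the syntax difference, here is a rough sketch of how the fields might be rearranged. The helper name knnBetaStages is hypothetical (it is not part of the driver or of Atlas), and the split between k and a separate $limit stage is an assumption about how you would want to cap results rather than an exact equivalent of numCandidates and limit:

```javascript
// Hypothetical helper, not a MongoDB API: builds the two stages that stand in
// for a single $vectorSearch stage when using $search with knnBeta.
// Assumption: `k` is the number of nearest neighbors requested from knnBeta,
// and `limit` caps the documents passed on via a regular $limit stage.
function knnBetaStages({ index, path, vector, k, limit }) {
  return [
    {
      $search: {
        index,
        knnBeta: {
          vector, // takes the place of queryVector in $vectorSearch
          path,   // same meaning as path in $vectorSearch
          k,
        },
      },
    },
    { $limit: limit },
  ];
}
```

For example, knnBetaStages({ index: "default", path: "city_embedding", vector: embedding, k: 150, limit: 50 }) returns an array that can be spread into a pipeline; the k and limit values here are placeholders to tune for your data.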

Sorry again for the confusion and inconvenience here, and please let me know if I can help with any other questions you have.

Hi Henry,

Thanks for your help. I am now trying to use $search with the knnBeta operator in a sub-pipeline instead, with the following syntax:

const documents = await collection
          .aggregate([
            {
              $unionWith: {
                coll: "SearchLeads",
                pipeline: [
                  {
                    $search: {
                      index: "default",
                      knnBeta: {
                        vector: embedding,
                        path: "city_embedding",
                        k: 150,
                      },
                    },
                  },
                  {
                    $limit: 50,
                  },
                  {
                    $search: {
                      index: "default",
                      knnBeta: {
                        vector: embedding,
                        path: "industry_embedding",
                        k: 150,
                      },
                    },
                  },
                  {
                    $limit: 50,
                  },
                ],
              },
            },
          ])
          .toArray();

But I am getting the following error:

$_internalSearchMongotRemote is only valid as the first stage in a pipeline

What I want to do:
I have created two embeddings in my SearchLeads collection: industry_embedding and city_embedding. I want to return the results that match both industry and city for the query. How can I achieve this? I appreciate your help.

Hi Pratham,

Thank you for sharing. $search must be the first stage of any pipeline it appears in, including a subpipeline, so the second $search inside your $unionWith subpipeline triggers this error. The fix I would suggest is to run the first $search at the top level, then use $unionWith to run the second $search as the first stage of its own subpipeline:

const documents = await collection
          .aggregate([
            {
              $search: {
                index: "default",
                knnBeta: {
                  vector: embedding,
                  path: "city_embedding",
                  k: 150,
                },
              },
            },
            {
              $limit: 50,
            },
            {
              $unionWith: {
                coll: "SearchLeads",
                pipeline: [
                  {
                    $search: {
                      index: "default",
                      knnBeta: {
                        vector: embedding,
                        path: "industry_embedding",
                        k: 150,
                      },
                    },
                  },
                  {
                    $limit: 50,
                  },
                ],
              },
            },
          ]).toArray();
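
Note that $unionWith returns the combined results of both searches, so a document that matched on both city and industry shows up twice. If you only want the documents that appear in both result sets, one possible approach is to post-process the array on the client. This is only a sketch: keepDocsInBothSearches is a hypothetical helper name, and it assumes each search returns a given document at most once and that _id converts to a stable string:

```javascript
// Hypothetical client-side helper, not a MongoDB API.
// Keeps one copy of each document whose _id occurs at least twice in the
// $unionWith output, i.e. documents returned by both searches.
function keepDocsInBothSearches(unionResults) {
  // Count how many times each _id occurs across the combined results.
  const counts = new Map();
  for (const doc of unionResults) {
    const key = String(doc._id);
    counts.set(key, (counts.get(key) || 0) + 1);
  }

  // Keep the first occurrence of every _id that was seen more than once.
  const emitted = new Set();
  return unionResults.filter((doc) => {
    const key = String(doc._id);
    if (counts.get(key) < 2 || emitted.has(key)) {
      return false;
    }
    emitted.add(key);
    return true;
  });
}
```

You would call it as keepDocsInBothSearches(documents) on the array returned by toArray(). The kept documents come from the first search, so any score or ordering they carry reflects the city_embedding search only.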

Hope this helps and sorry for the delay in response!