Mongo Atlas search by text not working in indexed collection

Roland_I · 2025-04-15T06:03:59.454Z

I want to support two types of text searches:

Regex search (word.*)
Partial match search (prv would find prvln, etc.)

I have this index schema:

{
  "analyzer": "lucene.standard",
  "mappings": {
    "dynamic": false,
    "fields": {
      "source": {
        "analyzer": "standardLowerer",
        "multi": {
          "keywordAnalyzer": {
            "analyzer": "keywordLowerer",
            "type": "string"
          }
        },
        "type": "string"
      }
    }
  },
  "analyzers": [
    {
      "charFilters": [
        {
          "ignoredTags": [],
          "type": "htmlStrip"
        },
        {
          "mappings": {
            "\"": " ",
            "'": " ",
            "`": " ",
            "‘": " ",
            "’": " ",
            "“": " ",
            "”": " ",
            "„": " "
          },
          "type": "mapping"
        }
      ],
      "name": "standardLowerer",
      "tokenFilters": [
        {
          "type": "lowercase"
        }
      ],
      "tokenizer": {
        "type": "standard"
      }
    },
    {
      "charFilters": [
        {
          "ignoredTags": [],
          "type": "htmlStrip"
        },
        {
          "mappings": {
            "\"": " ",
            "'": " ",
            "`": " ",
            "‘": " ",
            "’": " ",
            "“": " ",
            "”": " ",
            "„": " "
          },
          "type": "mapping"
        }
      ],
      "name": "keywordLowerer",
      "tokenFilters": [
        {
          "type": "lowercase"
        }
      ],
      "tokenizer": {
        "type": "keyword"
      }
    }
  ],
  "synonyms": []
}

The problem is that the search works, but it only matches full words. Any suggestions on what I should update in my index?

Erik_Hatcher · 2025-04-15T14:32:40.868Z

With that configuration, you can do prefix/starts-with wildcard queries like this playground shows.
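For readers without access to the playground, a prefix query against this index might look roughly like the sketch below. It only builds the aggregation pipeline as a Python dict. The index name `default` and the helper name `prefix_search_pipeline` are assumptions, not something stated in this thread. The query is lowercased as a precaution, because `wildcard` is a term-level operator and the indexed terms here are lowercased.

```python
def prefix_search_pipeline(prefix, whole_field=False, index_name="default"):
    """Build an aggregation pipeline with a $search wildcard stage on `source`.

    whole_field=False targets the standard-tokenized terms (any word starting
    with the prefix); whole_field=True targets the keyword multi field (the
    whole value starting with the prefix).
    """
    if whole_field:
        # "keywordAnalyzer" is the multi name from the index definition above
        path = {"value": "source", "multi": "keywordAnalyzer"}
    else:
        path = "source"
    return [
        {
            "$search": {
                "index": index_name,  # assumption: index name is not given in the thread
                "wildcard": {
                    "query": prefix.lower() + "*",  # "*" matches any trailing characters
                    "path": path,
                    "allowAnalyzedField": True,  # the field uses a custom analyzer
                },
            }
        }
    ]


pipeline = prefix_search_pipeline("Prv")
# With pymongo this could be run as: collection.aggregate(pipeline)
```

The `regex` operator takes the same `query`, `path`, and `allowAnalyzedField` fields, so a pattern such as `word.*` could be sent in the same shape.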

If you wanted to index the starts-with partials, consider using the edgeGram tokenizer as an index analyzer, but when doing so and querying by prefixes, use a searchAnalyzer that is keyword-based.
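A minimal sketch of that idea, written as Python dicts, could look like this. The analyzer name `edgeGramLowerer`, the gram sizes 2 to 15, and the index name `default` are assumptions chosen for illustration. The `keywordLowerer` definition is shortened here (the `charFilters` from the original index are left out). The `edge_grams` helper is not part of Atlas Search; it only illustrates which prefixes such a tokenizer would conceptually index.

```python
def edge_grams(text, min_gram=2, max_gram=15):
    """Illustrate the lowercased prefixes an edgeGram tokenizer would index."""
    text = text.lower()
    return [text[:n] for n in range(min_gram, min(len(text), max_gram) + 1)]


# Hypothetical index definition: edge grams at index time, keyword at query time.
edge_gram_index = {
    "mappings": {
        "dynamic": False,
        "fields": {
            "source": {
                "type": "string",
                "analyzer": "edgeGramLowerer",       # used when indexing
                "searchAnalyzer": "keywordLowerer",  # used when querying
            }
        },
    },
    "analyzers": [
        {
            "name": "edgeGramLowerer",
            "tokenizer": {"type": "edgeGram", "minGram": 2, "maxGram": 15},
            "tokenFilters": [{"type": "lowercase"}],
        },
        {
            "name": "keywordLowerer",  # shortened copy of the original analyzer
            "tokenizer": {"type": "keyword"},
            "tokenFilters": [{"type": "lowercase"}],
        },
    ],
}


def edge_gram_query(prefix, index_name="default"):
    """Build a plain text query; the prefix is matched against indexed grams."""
    return [
        {"$search": {"index": index_name, "text": {"query": prefix, "path": "source"}}}
    ]
```

Because the keyword-based searchAnalyzer keeps the query as a single token, a query for `prv` is compared directly against the stored prefixes instead of being split into grams itself. Note that the edgeGram tokenizer works from the start of the whole field value, so this gives starts-with matching on the value rather than on each word.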

amyjian · 2025-04-15T14:35:01.212Z

Hi @Roland_I, can you share the query you are running? If you haven’t already seen this tutorial, there are a few ways you can run partial matches in Atlas Search. I created a sample of what this could look like by using the autocomplete index type and operator in the Search Playground.
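Since the playground link is not reproduced here, the following is only a rough sketch of what an autocomplete setup could look like, not the actual sample. It assumes `source` is mapped both as `string` and as `autocomplete`; the gram sizes and the index name `default` are illustrative assumptions.

```python
# Hypothetical "fields" section: the same field indexed as string and autocomplete.
autocomplete_fields = {
    "source": [
        {"type": "string", "analyzer": "standardLowerer"},
        {
            "type": "autocomplete",
            "tokenization": "edgeGram",  # index prefixes of each term
            "minGrams": 2,
            "maxGrams": 15,
        },
    ]
}


def autocomplete_pipeline(partial, index_name="default"):
    """Build a $search stage that uses the autocomplete operator on `source`."""
    return [
        {
            "$search": {
                "index": index_name,  # assumption: index name is not given in the thread
                "autocomplete": {"query": partial, "path": "source"},
            }
        }
    ]


pipeline = autocomplete_pipeline("prv")
# With pymongo this could be run as: collection.aggregate(pipeline)
```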

Let me know if this helps!

Roland_I · 2025-04-28T13:28:25.726Z

Hello, thanks for the reply. Do I understand correctly that edgeGram indexing would be faster than wildcard queries on big datasets (10M+ docs)?