Roland_I
(Roland I )
1
I want to support two types of text searches:
- Regex search (word.*)
- Partial match search (prv would find prvln, etc.)
I have this index schema:
{
  "analyzer": "lucene.standard",
  "mappings": {
    "dynamic": false,
    "fields": {
      "source": {
        "analyzer": "standardLowerer",
        "multi": {
          "keywordAnalyzer": {
            "analyzer": "keywordLowerer",
            "type": "string"
          }
        },
        "type": "string"
      }
    }
  },
  "analyzers": [
    {
      "charFilters": [
        {
          "ignoredTags": [],
          "type": "htmlStrip"
        },
        {
          "mappings": {
            "\"": " ",
            "'": " ",
            "`": " ",
            "‘": " ",
            "’": " ",
            "“": " ",
            "”": " ",
            "„": " "
          },
          "type": "mapping"
        }
      ],
      "name": "standardLowerer",
      "tokenFilters": [
        {
          "type": "lowercase"
        }
      ],
      "tokenizer": {
        "type": "standard"
      }
    },
    {
      "charFilters": [
        {
          "ignoredTags": [],
          "type": "htmlStrip"
        },
        {
          "mappings": {
            "\"": " ",
            "'": " ",
            "`": " ",
            "‘": " ",
            "’": " ",
            "“": " ",
            "”": " ",
            "„": " "
          },
          "type": "mapping"
        }
      ],
      "name": "keywordLowerer",
      "tokenFilters": [
        {
          "type": "lowercase"
        }
      ],
      "tokenizer": {
        "type": "keyword"
      }
    }
  ],
  "synonyms": []
}
The problem is that this works, but it only matches full words. Any suggestions on what I should update in my index?
With that configuration, you can do prefix/starts-with wildcard queries like this playground shows.
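For readers without access to the playground, here is a minimal sketch of what such a prefix query could look like, written as a Python helper that builds the aggregation pipeline. The helper name build_prefix_pipeline, the index name "default", and the limit are assumptions for illustration. The query is lowercased in the helper on the assumption that wildcard is a term-level operator that does not run the query through the analyzer, while the index lowercases its tokens.

```python
import json


def build_prefix_pipeline(prefix, index_name="default", limit=10):
    """Build a $search pipeline that matches terms starting with `prefix`.

    Hypothetical helper: the index name and limit are placeholders.
    """
    # Lowercase to match the lowercased tokens in the index, and escape
    # wildcard metacharacters so user input is treated literally.
    escaped = "".join(
        "\\" + ch if ch in "*?\\" else ch for ch in prefix.lower()
    )
    return [
        {
            "$search": {
                "index": index_name,
                "wildcard": {
                    "query": escaped + "*",  # trailing * = starts-with
                    "path": "source",
                    # Needed when the field is analyzed (custom analyzer here).
                    "allowAnalyzedField": True,
                },
            }
        },
        {"$limit": limit},
    ]


if __name__ == "__main__":
    print(json.dumps(build_prefix_pipeline("prv"), indent=2))
```

The resulting list can be passed to an aggregate call on the collection. Against the standard-tokenized source path, prv* should match any word that starts with prv; pointing the path at the keywordAnalyzer multi instead would match only values whose whole text starts with prv.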
If you want to index the starts-with partials instead, consider using the edgeGram tokenizer in the index analyzer. When you do that and query by prefix, use a keyword-based searchAnalyzer so that the query text is matched as a single term rather than being split into grams itself.
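To make the edgeGram suggestion more concrete, here is a rough sketch, not a tested production definition. It shows an index definition (as a Python dict) that pairs an edgeGram index analyzer with the keyword-based search analyzer, plus a small function that emulates which grams would be stored for a single value. The analyzer name edgeGramLowerer and the minGram/maxGram values of 2 and 15 are assumptions; the charFilters from the original index are omitted for brevity.

```python
import json

# Sketch of an index definition: grams are produced at index time only,
# while the query text is kept whole by the keyword-based searchAnalyzer.
index_definition = {
    "mappings": {
        "dynamic": False,
        "fields": {
            "source": {
                "type": "string",
                "analyzer": "edgeGramLowerer",       # hypothetical name
                "searchAnalyzer": "keywordLowerer",  # from the original index
            }
        },
    },
    "analyzers": [
        {
            "name": "edgeGramLowerer",
            "tokenizer": {"type": "edgeGram", "minGram": 2, "maxGram": 15},
            "tokenFilters": [{"type": "lowercase"}],
        },
        {
            "name": "keywordLowerer",
            "tokenizer": {"type": "keyword"},
            "tokenFilters": [{"type": "lowercase"}],
        },
    ],
}


def edge_grams(text, min_gram=2, max_gram=15):
    """Emulate the left-edge grams emitted for one input string."""
    upper = min(len(text), max_gram)
    return [text[:n] for n in range(min_gram, upper + 1)]


if __name__ == "__main__":
    print(json.dumps(index_definition, indent=2))
    print(edge_grams("prvln"))
```

With grams like these in the index, a plain text query for prv becomes an exact term lookup against the stored gram prv, which is why the search side should not be tokenized into grams as well.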
amyjian
(Amy)
3
Hi @Roland_I, can you share the query you are running? If you haven’t already seen this tutorial, there are a few ways you can run partial matches in Atlas Search. I created a sample of what this could look like, using the autocomplete index type and operator, in the Search Playground.
Let me know if this helps!
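In case the playground link is not available, the autocomplete approach could look roughly like the sketch below. The minGrams/maxGrams values, the index name "default", and the helper name build_autocomplete_pipeline are assumptions for illustration, not the exact contents of the playground sample.

```python
import json

# Sketch of an index definition that maps `source` as an autocomplete field
# using edgeGram tokenization (gram sizes here are illustrative).
autocomplete_index = {
    "mappings": {
        "dynamic": False,
        "fields": {
            "source": {
                "type": "autocomplete",
                "tokenization": "edgeGram",
                "minGrams": 2,
                "maxGrams": 15,
            }
        },
    }
}


def build_autocomplete_pipeline(query, index_name="default", limit=10):
    """Build a $search pipeline using the autocomplete operator.

    Hypothetical helper: the index name and limit are placeholders.
    """
    return [
        {
            "$search": {
                "index": index_name,
                "autocomplete": {"query": query, "path": "source"},
            }
        },
        {"$limit": limit},
    ]


if __name__ == "__main__":
    print(json.dumps(autocomplete_index, indent=2))
    print(json.dumps(build_autocomplete_pipeline("prv"), indent=2))
```

The autocomplete operator only works on fields indexed with the autocomplete type, so the field mapping and the query need to be changed together.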
Roland_I
(Roland I )
4
Hello, thanks for the reply. Do I understand correctly that edgeGram indexing would be faster than wildcard queries on big datasets (10M+ docs)?