How to Index String Fields

On this page

  • Review the Limitations for the string Type
  • Define the Index for the string Type
  • Configure string Field Properties
  • Try an Example for the string Type

You can use the Atlas Search string type to index string fields. You can use the Atlas Search phrase, queryString, span, text, wildcard, regex, and moreLikeThis operators to query fields indexed as the string type.

If you enable dynamic mappings, Atlas Search automatically indexes fields of type string. You can use the Visual Editor or the JSON Editor in the Atlas UI to index fields as the string type.

You can't use the Atlas Search string type to index fields for facet or autocomplete queries. You can't use the string type to index fields for sorting Atlas Search results. Instead, you must use static mappings to index the string fields as the following types:

  • stringFacet type to run a facet query on string fields. Note that Atlas Search doesn't dynamically index string fields for faceting.

  • autocomplete type to run autocomplete operator queries on string fields. Note that Atlas Search doesn't dynamically index string fields for autocompletion.

  • token type to sort the Atlas Search results by the string field. Atlas Search doesn't dynamically index string fields for sorting the results.

  • token type to find an exact match for queries using equals, in, and range operators. Atlas Search doesn't dynamically index string fields as token type for querying using these operators.
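As a rough sketch of the static mappings listed above, the following Python snippet builds an index definition as a plain dictionary and prints it as JSON. The field names (title and genres, borrowed from the sample_mflix.movies collection used later on this page) and the helper name are illustrative assumptions. The snippet also assumes that a field can be indexed as several types by listing its definitions in an array.

```python
import json


def build_static_mappings() -> dict:
    """Return an index definition with dynamic mappings disabled.

    Hypothetical helper for illustration; the field names are assumptions.
    """
    return {
        "mappings": {
            "dynamic": False,
            "fields": {
                # One field indexed three ways: full-text search (string),
                # exact match and sorting (token), and autocomplete queries.
                "title": [
                    {"type": "string"},
                    {"type": "token"},
                    {"type": "autocomplete"},
                ],
                # A separate field indexed for facet queries.
                "genres": {"type": "stringFacet"},
            },
        }
    }


definition = build_static_mappings()
print(json.dumps(definition, indent=2))
```

The printed JSON could be pasted into the JSON Editor in place of the default index definition.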

Important

Atlas Search doesn't index string fields where an analyzer token exceeds 32766 bytes in size. If you use the keyword analyzer, Atlas Search doesn't index string fields that exceed 32766 bytes.
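Because the keyword analyzer treats the whole field value as a single token, you might pre-check values on the client before relying on them being searchable. The following is a minimal sketch; the helper name is hypothetical, and it assumes the limit is measured on the UTF-8 encoding of the value.

```python
MAX_TOKEN_BYTES = 32766  # limit stated in the note above


def fits_keyword_limit(value: str) -> bool:
    """Return True if the whole string, taken as one token, is within the limit.

    Assumption: the size is measured in bytes of the UTF-8 encoding.
    """
    return len(value.encode("utf-8")) <= MAX_TOKEN_BYTES


print(fits_keyword_limit("a" * 32766))       # True: exactly at the limit
print(fits_keyword_limit("a" * 32767))       # False: one byte over
print(fits_keyword_limit("\u00e9" * 16384))  # False: 2 bytes per character
```

Note that the character count and the byte count differ for non-ASCII text, which is why the check encodes the value first.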

To define the index for the string type, choose your preferred configuration method in the Atlas UI and then select the database and collection.

  1. Click Refine Your Index to configure your index.

  2. In the Field Mappings section, click Add Field to open the Add Field Mapping window.

  3. Click Customized Configuration.

  4. Select the field to index from the Field Name dropdown.

    Note

    You can't index fields that contain the dollar ($) sign at the start of the field name.

  5. Click the Data Type dropdown and select String.

  6. (Optional) Expand and configure the String Properties for the field. To learn more, see Configure string Field Properties.

  7. (Optional) Click Add Multi Field to configure the following alternate analyzer settings for that field:

    1. Enter a name for the alternate analyzer in the Multi Field Name field.

    2. Configure the string field properties for the alternate analyzer under Multi Field Properties. To learn more, see Configure string Field Properties.

    3. (Optional) Click Add Another Multi Field and repeat steps 1 and 2 to configure more analyzers for the field.

  8. Click Add.

The following is the JSON syntax for the string type. Replace the default index definition with the following. To learn more about the fields, see Field Properties.

{
  "mappings": {
    "dynamic": true|false,
    "fields": {
      "<field-name>": {
        "type": "string",
        "analyzer": "<atlas-search-analyzer>",
        "searchAnalyzer": "<atlas-search-analyzer>",
        "indexOptions": "docs|freqs|positions|offsets",
        "store": true|false,
        "ignoreAbove": <integer>,
        "multi": {<string-field-definition>},
        "norms": "include|omit"
      }
    }
  }
}
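To make the syntax above concrete, the following sketch fills in the placeholders for a single field and prints the resulting JSON. The field name plot and the chosen values are assumptions for illustration only, not recommended settings.

```python
import json

# Hypothetical field definition; every value is an illustrative choice.
field_definition = {
    "type": "string",
    "analyzer": "lucene.english",
    "searchAnalyzer": "lucene.english",
    # "positions" drops term offsets, so this field could not be used
    # for highlight, which needs "offsets" and "store": true.
    "indexOptions": "positions",
    "store": False,
    "ignoreAbove": 255,
    "norms": "omit",
}

index_definition = {
    "mappings": {
        "dynamic": False,
        "fields": {"plot": field_definition},
    }
}

index_json = json.dumps(index_definition, indent=2)
print(index_json)
```

Python's False is serialized as the JSON literal false, so the output matches the syntax the JSON Editor expects.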

The Atlas Search string type takes the following parameters:

Option: type
Type: string
Necessity: Required
Description: Human-readable label that identifies this field type. Value must be string.

Option: analyzer
Type: string
Necessity: Optional
Description: Name of a built-in or custom analyzer to use for indexing the field. If you don't specify a value, the field inherits an analyzer in the following order:

  1. The analyzer option for the index if specified.

  2. The lucene.standard analyzer.

Option: searchAnalyzer
Type: string
Necessity: Optional
Description: Analyzer to use when querying the field. If you don't specify a value, the field inherits an analyzer in the following order:

  1. The analyzer option for this field if specified.

  2. The searchAnalyzer option for the index if specified.

  3. The analyzer option for the index if specified.

  4. The lucene.standard analyzer.

Option: indexOptions
Type: string
Necessity: Optional
Description: Amount of information to store for the indexed field. Value can be one of the following:

  • docs - Only indexes documents. The frequency and position of the indexed term are ignored. Only a single occurrence of the term is reflected in the score.

  • freqs - Only indexes documents and term frequency. The position of the indexed term is ignored.

  • positions - Indexes documents, term frequency, and term positions.

  • offsets - (Default) Indexes documents, term frequency, term positions, and term offsets. This option is required for highlight.

Default: offsets

Option: store
Type: boolean
Necessity: Optional
Description: Flag that indicates whether to store the exact document text as well as the analyzed values in the index. Value can be true or false. The value for this option must be true for highlight.

To reduce the index size and performance footprint, we recommend setting store to false. To learn more, see Atlas Search Index Performance.

Default: true

Option: ignoreAbove
Type: int
Necessity: Optional
Description: Maximum number of characters in the value of the field to index. Atlas Search doesn't index the field value if it is longer than the specified number of characters.

Option: multi
Type: String Field Definition
Necessity: Optional
Description: String field to index with the name of the alternate analyzer specified in the multi object. To learn more about specifying the multi object, see Multi Analyzer and the example below.

Option: norms
Type: string
Necessity: Optional
Description: String that specifies whether to include or omit the field length in the result when scoring. The length of the field is determined by the number of tokens produced by the analyzer for the field. Value can be one of the following:

  • include - to include the field length when scoring.

  • omit - to omit the field length when scoring.

If value is include, Atlas Search uses the length of the field to determine the higher score when scoring. For example, if two documents match an Atlas Search query, the document with the shorter field length scores higher than the document with the longer field length.

If value is omit, Atlas Search ignores the field length when scoring.

Default: include
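The inheritance order described for the analyzer and searchAnalyzer options can be sketched as two small lookup functions. This is only an illustration of the documented order, not how Atlas Search implements it. The function names are hypothetical, and the snippet assumes the index-level analyzer and searchAnalyzer options are top-level keys of the index definition.

```python
DEFAULT_ANALYZER = "lucene.standard"


def resolve_index_analyzer(field_def: dict, index_def: dict) -> str:
    """Analyzer used at index time: field, then index, then the default."""
    return (
        field_def.get("analyzer")
        or index_def.get("analyzer")
        or DEFAULT_ANALYZER
    )


def resolve_search_analyzer(field_def: dict, index_def: dict) -> str:
    """Analyzer used at query time, following the documented fallback order."""
    return (
        field_def.get("searchAnalyzer")
        or field_def.get("analyzer")         # 1. analyzer option for this field
        or index_def.get("searchAnalyzer")   # 2. searchAnalyzer for the index
        or index_def.get("analyzer")         # 3. analyzer for the index
        or DEFAULT_ANALYZER                  # 4. lucene.standard
    )


# Hypothetical index-level settings used only to exercise the functions.
index_def = {"analyzer": "lucene.simple", "searchAnalyzer": "lucene.whitespace"}

print(resolve_index_analyzer({"type": "string"}, index_def))
print(resolve_search_analyzer({"type": "string"}, index_def))
print(resolve_search_analyzer({"type": "string", "analyzer": "lucene.english"}, index_def))
```

In the last call the field-level analyzer takes precedence over the index-level searchAnalyzer, which is the case that is easiest to overlook.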

The following index definition example uses the sample_mflix.movies collection. If you have the sample data already loaded on your cluster, you can use the Visual Editor or JSON Editor in the Atlas UI to configure the index. After you select your preferred configuration method, select the database and collection, and refine your index to add field mappings.

The following index definition indexes string values in the title field as Atlas Search string type:

  1. In the Add Field Mapping window, select title from the Field Name dropdown.

  2. Click the Data Type dropdown and select String.

  3. Review the default settings for the String Properties.

  4. Click Add.

Replace the default index definition with the following index definition.

{
  "mappings": {
    "dynamic": false,
    "fields": {
      "title": {
        "type": "string"
      }
    }
  }
}
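Once the title field is indexed as the string type, it can be queried with one of the operators listed at the top of this page. The following sketch builds an aggregation pipeline that uses the text operator. The index name default, the query term, and the helper name are assumptions, and the snippet only constructs the pipeline without connecting to a cluster.

```python
def title_search_pipeline(query: str, index_name: str = "default") -> list:
    """Build a $search pipeline that runs a text query on the title field.

    Hypothetical helper; the index name is an assumption.
    """
    return [
        {
            "$search": {
                "index": index_name,
                "text": {"query": query, "path": "title"},
            }
        },
        {"$limit": 5},
        {"$project": {"_id": 0, "title": 1}},
    ]


pipeline = title_search_pipeline("godfather")
print(pipeline)
```

With a driver such as PyMongo, a pipeline like this would typically be passed to the collection's aggregate method.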

The following index definition indexes string values in the fullplot field with the lucene.english and lucene.french analyzers in addition to the default lucene.standard analyzer:

  1. In the Add Field Mapping window, select fullplot from the Field Name dropdown.

  2. Click the Data Type dropdown and select String.

  3. Review the default settings for the String Properties.

  4. Click Add Multi Field, enter english in the Multi Field Name field, and configure the following Multi Field Properties:

    Index Analyzer: Select lucene.english under lucene.language.
    Search Analyzer: Select lucene.english under lucene.language.
  5. Click Add Another Multi Field, enter french in the Multi Field Name field, and configure the following Multi Field Properties:

    Index Analyzer: Select lucene.french under lucene.language.
    Search Analyzer: Select lucene.french under lucene.language.
  6. Click Add.

Replace the default index definition with the following index definition.

{
  "mappings": {
    "dynamic": false,
    "fields": {
      "fullplot": {
        "type": "string",
        "multi": {
          "english": {
            "type": "string",
            "analyzer": "lucene.english"
          },
          "french": {
            "type": "string",
            "analyzer": "lucene.french"
          }
        }
      }
    }
  }
}
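To query one of the alternate analyzers defined in the multi object, the query path names both the field and the multi entry. The following sketch builds such a pipeline. It assumes the path form with value and multi keys, an index named default, and a hypothetical helper name; it does not connect to a cluster.

```python
from typing import Optional


def fullplot_pipeline(query: str, multi: Optional[str] = None,
                      index_name: str = "default") -> list:
    """Build a $search pipeline on fullplot, optionally via a multi analyzer.

    Hypothetical helper; with multi=None the query uses the field's
    default analyzer (lucene.standard in the definition above).
    """
    if multi is None:
        path = "fullplot"
    else:
        path = {"value": "fullplot", "multi": multi}
    return [
        {"$search": {"index": index_name, "text": {"query": query, "path": path}}},
        {"$limit": 5},
        {"$project": {"_id": 0, "title": 1, "fullplot": 1}},
    ]


french_pipeline = fullplot_pipeline("liberté", multi="french")
standard_pipeline = fullplot_pipeline("freedom")
print(french_pipeline[0])
print(standard_pipeline[0])
```

The multi value must match a name defined under multi in the index definition, here english or french.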
