Exact Text Query in Elasticsearch

Aug 29, 2019

Performing an exact text search in Elasticsearch is a bit tricky. One of the recommended ways to search a field for text is to use a full-text query such as match_phrase, as shown below (searching for “Africa”).

{
  "query": {
    "bool": {
      "must": [
        {
          "match_phrase": {
            "Area": "africa"
          }
        }
      ]
    }
  },
  "from": 0,
  "size": 25
}

This search finds matches for “Africa” as expected. However, it also finds these matches.

...
{
  "Element Code": 7231,
  "Area": "Eastern Africa",
  "Value": 29633.4361
},
{
  "Element Code": 7231,
  "Area": "Eastern Africa",
  "Value": 32236.7717
},
...
...
{
  "Element Code": 7231,
  "Area": "Western Africa",
  "Value": 90869.5537
},
{
  "Element Code": 7231,
  "Area": "Western Africa",
  "Value": 97677.0082
},
...
...
{
  "Element Code": 7231,
  "Area": "Southern Africa",
  "Value": 239990.2601
},
{
  "Element Code": 7231,
  "Area": "Southern Africa",
  "Value": 234594.409
},
...

Not a good situation when you are looking exactly for “Africa” only, is it?

This happens because full-text queries such as match and match_phrase analyze the query text before performing the search, just as the field's text was analyzed when the document was indexed. Analysis converts the text into tokens (or terms) and lowercases them; some analyzers, such as the english analyzer, also remove frequent stop words and reduce the tokens to their word stems (e.g. foxes -> fox, jumped -> jump). “Eastern Africa” is therefore indexed as the tokens eastern and africa, so a search for africa matches it. Depending on how the document was indexed, a search may return more documents than you intended, or may not find your document at all.
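To make the effect of analysis concrete, here is a minimal Python sketch. It is not Elasticsearch's implementation: the analyze function is a simplified stand-in that only splits text into word tokens and lowercases them, and phrase_matches is a hypothetical helper that checks whether the query tokens appear consecutively in the field's tokens. The Area values come from the results above, with "Asia" added as an illustrative non-match.

```python
import re

def analyze(text):
    """Simplified stand-in for an analyzer: split into word tokens, then lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]

def phrase_matches(field_value, query_text):
    """True if the analyzed query tokens occur consecutively in the analyzed field."""
    field_tokens = analyze(field_value)
    query_tokens = analyze(query_text)
    n = len(query_tokens)
    if n == 0:
        return False
    return any(
        field_tokens[i:i + n] == query_tokens
        for i in range(len(field_tokens) - n + 1)
    )

# "Asia" is an assumed extra value, added only to show a non-match.
areas = ["Africa", "Eastern Africa", "Western Africa", "Southern Africa", "Asia"]

matched = [area for area in areas if phrase_matches(area, "africa")]
print(matched)
# ['Africa', 'Eastern Africa', 'Western Africa', 'Southern Africa']
```

Because every one of these values contains the token africa after analysis, a phrase search for that single token cannot tell them apart.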

Using a Term Query

One way to perform an exact text search is to use a term query.

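Note, however, the warning on the term query documentation page: avoid using the term query on text fields, because their values are analyzed at index time while the term query does not analyze its search term. Instead, we run the term query against the keyword version of the field, which is addressed by appending the suffix “.keyword” to the field name.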

{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "Area.keyword": "Africa"
          }
        }
      ]
    }
  },
  "from": 0,
  "size": 25
}
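The following Python sketch illustrates why the query above returns only exact matches. It assumes the keyword sub-field has no normalizer, so the whole value is indexed as a single unanalyzed term and matching amounts to plain string equality. The helper names term_query_body and keyword_matches are hypothetical, and "Asia" is an assumed extra value.

```python
import json

def term_query_body(field, value, size=25):
    """Build a request body shaped like the term query above."""
    return {
        "query": {"bool": {"must": [{"term": {field: value}}]}},
        "from": 0,
        "size": size,
    }

def keyword_matches(field_value, query_value):
    """Model of a keyword field without a normalizer: one term, compared as-is."""
    return field_value == query_value

areas = ["Africa", "Eastern Africa", "Western Africa", "Southern Africa", "Asia"]

body = term_query_body("Area.keyword", "Africa")
print(json.dumps(body, indent=2))

exact = [area for area in areas if keyword_matches(area, "Africa")]
print(exact)       # ['Africa']

# The comparison is case-sensitive, so the lowercase term finds nothing.
lowercase = [area for area in areas if keyword_matches(area, "africa")]
print(lowercase)   # []
```

Note the difference from the full-text query: there the lowercase search text "africa" worked because it was analyzed, whereas here the search term must match the stored value character for character.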

Checking the Mapping

Of course, this depends on how the document has been indexed. With the default dynamic mapping, string data is indexed both as full text (type text) and as a keyword sub-field, which allows both search types. You can check the mapping type of the field:

curl -X GET "localhost:9200/emissions/_mapping?pretty"

This shows the mapping for the field of interest, Area. The field is indexed both as text and as keyword; because of ignore_above, values longer than 256 characters are not indexed in the keyword sub-field.

{
  ...
  "Area" : {
    "type" : "text",
    "fields" : {
      "keyword" : {
        "type" : "keyword",
        "ignore_above" : 256
      }
    }
  },
  ...
}
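If you want to make this check in code, the sketch below inspects a mapping's properties and returns the field name to use in a term query. It is only an illustration: exact_match_field and fits_ignore_above are hypothetical helpers, the properties dictionary copies the Area fragment above, and the Value entry with type float is an assumption added to show the negative case. The length check approximates ignore_above with Python's len.

```python
def exact_match_field(properties, field, sub_field="keyword"):
    """Return the name to use in a term query, or None if no keyword variant exists."""
    spec = properties.get(field, {})
    if spec.get("type") == "keyword":
        return field
    sub = spec.get("fields", {}).get(sub_field, {})
    if sub.get("type") == "keyword":
        return f"{field}.{sub_field}"
    return None

def fits_ignore_above(value, ignore_above=256):
    """Approximation: values longer than ignore_above are skipped by the keyword field."""
    return len(value) <= ignore_above

properties = {
    "Area": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    },
    # Assumed mapping for a numeric field, for illustration only.
    "Value": {"type": "float"},
}

print(exact_match_field(properties, "Area"))    # Area.keyword
print(exact_match_field(properties, "Value"))   # None
print(fits_ignore_above("Africa"))              # True
print(fits_ignore_above("x" * 300))             # False
```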

And when we perform the term query on the keyword field, we get the exact matches we are looking for: only Africa and nothing else.

...
{
  "Element Code": 7264,
  "Area": "Africa",
  "Value": 2.5281
},
{
  "Element Code": 7264,
  "Area": "Africa",
  "Value": 2.7445
},
...

Conclusion

Use a term query on a keyword field when you need an exact text match, like an equality comparison in SQL.

Try It Yourself with Argon

  1. Embedded Elasticsearch: Argon includes a recent version of Elasticsearch, so when you download and install it, an Elasticsearch database is ready.

  2. Easy CSV Import: Easily import CSV with automatic field type recognition. No problem importing large files.

  3. One Click Execution: Argon provides a way to run both match query and term query on a field.

  4. Made for non-techies: Intuitive UI with drag and drop. No coding required.