Elasticsearch Numeric Range Query

Sep 15, 2019

Introduction

A range query, as the name indicates, is used to search an integer or floating-point field for a range of values. These values are expressed as one or more conditions such as less than, less than or equal to, greater than or greater than or equal to a given value. Elasticsearch then locates the documents that match the conditions.

Greater Than Range Query

Here is an example query which locates documents where the diabetes prevalence rate of a country is greates than 3.5%. Here the field named 2017 holds the value of diabetes prevalence during 2017.

{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "2017": {
              "gt": "3.5"
            }
          }
        }
      ]
    }
  }
}

A part of the response is shown below;

{
  "hits": {
    "hits": [
      {
        "_score": 1.0,
        "_type": "doc",
        "_id": "1",
        "_source": {
          "1960": "",
          "Country Code": "ABW",
          "Indicator Code": "SH.STA.DIAB.ZS",
          "Country Name": "Aruba",
          "Indicator Name": "Diabetes prevalence (% of population ages 20 to 79)",
          "2017": 11.62
        },
        "_index": "diabetes-prevalence.csv"
      },
      {
        "_score": 1.0,
        "_type": "doc",
        "_id": "2",
        "_source": {
          "1960": "",
          "Country Code": "AFG",
          "Indicator Code": "SH.STA.DIAB.ZS",
          "Country Name": "Afghanistan",
          "Indicator Name": "Diabetes prevalence (% of population ages 20 to 79)",
          "2017": 9.59
        },
        "_index": "diabetes-prevalence.csv"
      },
...

Greater Than and Less Than Range Query

To find documents for a range bounded by a greater than and a less than, you can use a query of the form:

{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "2017": {
        "gt": "3.5",
        "lt": "4.0"
            }
          }
        }
      ]
    }
  }
}

Now the results are appropriately limited to the range specified.

  "hits": {
    "total": 8,
    "max_score": 1.0,
    "hits": [
      {
        "_index": "diabetes-prevalence.csv",
        "_type": "doc",
        "_id": "3",
        "_score": 1.0,
        "_source": {
          "Country Name": "Angola",
          "Country Code": "AGO",
          "Indicator Name": "Diabetes prevalence (% of population ages 20 to 79)",
          "Indicator Code": "SH.STA.DIAB.ZS",
          "1960": "",
          "2017": 3.94
        }
      },
      {
        "_index": "diabetes-prevalence.csv",
        "_type": "doc",
        "_id": "140",
        "_score": 1.0,
        "_source": {
          "Country Name": "Lesotho",
          "Country Code": "LSO",
          "Indicator Name": "Diabetes prevalence (% of population ages 20 to 79)",
          "Indicator Code": "SH.STA.DIAB.ZS",
          "1960": "",
          "2017": 3.94
        }
      },
...

Range Aggregation

Related to the range query is the Range Aggregation which can used to determine the distribution of values in different ranges. The whole range of the value in question is divided into a small number of contiguous ranges, and the count aggregation is requested over these ranges.

The following example shows an example aggregation request of the distribution of diabetes prevalence rates:

{
  "from": 0,
  "aggs": {
    "value_count": {
      "value_count": {
        "field": "Country Name.keyword"
      }
    },
    "2017": {
      "aggs": {
        "value_count": {
          "value_count": {
            "field": "Country Name.keyword"
          }
        }
      },
      "range": {
        "field": "2017",
        "ranges": [
          {
            "key": "0 to 5",
            "from": 0.0,
            "to": 5.0
          },
          {
            "key": "5 to 10",
            "from": 5.0,
            "to": 10.0
          },
          {
            "key": "10 to 15",
            "from": 10.0,
            "to": 15.0
          },
          {
            "key": "15 to 20",
            "from": 15.0,
            "to": 20.0
          },
          {
            "key": "20 to 25",
            "from": 20.0,
            "to": 25.0
          },
          {
            "key": "25 to 30",
            "from": 25.0,
            "to": 30.0
          },
          {
            "key": "From 30",
            "from": 30.0
          }
        ]
      }
    },
    "2017/Country Name": {
      "aggs": {
        "Country Name": {
          "aggs": {
            "value_count": {
              "value_count": {
                "field": "Country Name.keyword"
              }
            }
          },
          "terms": {
            "field": "Country Name.keyword",
            "shard_size": 2147483647
          }
        }
      },
      "range": {
        "field": "2017",
        "ranges": [
          {
            "key": "0 to 5",
            "from": 0.0,
            "to": 5.0
          },
          {
            "key": "5 to 10",
            "from": 5.0,
            "to": 10.0
          },
          {
            "key": "10 to 15",
            "from": 10.0,
            "to": 15.0
          },
          {
            "key": "15 to 20",
            "from": 15.0,
            "to": 20.0
          },
          {
            "key": "20 to 25",
            "from": 20.0,
            "to": 25.0
          },
          {
            "key": "25 to 30",
            "from": 25.0,
            "to": 30.0
          },
          {
            "key": "From 30",
            "from": 30.0
          }
        ]
      }
    },
    "Country Name": {
      "aggs": {
        "value_count": {
          "value_count": {
            "field": "Country Name.keyword"
          }
        }
      },
      "terms": {
        "field": "Country Name.keyword",
        "shard_size": 2147483647
      }
    }
  },
  "size": 0
}

The result aggregation is shown partially:

  "aggregations": {
    "2017": {
      "buckets": [
        {
          "key": "0 to 5",
          "from": 0.0,
          "to": 5.0,
          "doc_count": 51,
          "value_count": {
            "value": 51
          }
        },
        {
          "key": "5 to 10",
          "from": 5.0,
          "to": 10.0,
          "doc_count": 130,
          "value_count": {
            "value": 130
          }
        },
...

Distribution Chart

These values can be plotted into a distribution chart. The x-axis shows than ranges, and the y-axis shows the frequency of occurence.

(click for larger image)

See Also

Date Range Query