Bulk Deleting Records in Elasticsearch

Oct 9, 2019

Introduction

Having previously covered deleting records that match a query, let us now look at bulk deletion by id. An individual record can be deleted by specifying its id in the request URL. Here we delete the record with the id 1835.

curl -XDELETE "localhost:9200/globallandtemperatures_globaltemperatures.csv/doc/1835"

This method does not, however, scale for deleting multiple records. It is too cumbersome and inefficient to repeat this request for many records.
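To make the cost concrete, here is a minimal Python sketch of the one-request-per-record approach. It only builds the requests and does not send them; the helper name build_delete_request is hypothetical, and the host, index, and type are assumed to match the curl command above.

```python
import urllib.request

# Assumed values, copied from the curl example above.
BASE_URL = "http://localhost:9200"
INDEX = "globallandtemperatures_globaltemperatures.csv"
DOC_TYPE = "doc"


def build_delete_request(doc_id):
    """Build (but do not send) a DELETE request for a single document id."""
    url = f"{BASE_URL}/{INDEX}/{DOC_TYPE}/{doc_id}"
    return urllib.request.Request(url, method="DELETE")


# One HTTP round trip per record: three ids mean three separate requests.
requests = [build_delete_request(doc_id) for doc_id in (1837, 1842, 1847)]
for req in requests:
    print(req.get_method(), req.full_url)
    # urllib.request.urlopen(req) would send it; omitted so the sketch runs offline.
```

Every additional id adds another round trip to the server, which is the overhead the bulk request avoids.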

Bulk Delete

The other way to delete multiple records by id is to use the bulk request. Here, you specify the ids of the records to be deleted in a format known as NDJSON (newline-delimited JSON). It looks like this:

{"delete":{"_type":"doc","_id":1837}}
{"delete":{"_type":"doc","_id":1842}}
{"delete":{"_type":"doc","_id":1847}}

This request specifies multiple deletes, each with its own id. Each delete action must be separated from the next by a newline (or carriage-return-newline pair), and the whole request must be terminated by a newline. The bulk API places no fixed limit on the number of records per request (only the total size of the HTTP request is capped, at 100 MB by default), so this is a far more efficient way of deleting records in bulk.
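If the ids come from a program and not from a hand-written file, the body can be generated. The following is a minimal Python sketch; the function name build_bulk_delete_body is hypothetical, and the "doc" type is assumed from the example above.

```python
import json


def build_bulk_delete_body(ids, doc_type="doc"):
    """Return an NDJSON bulk body with one delete action per id."""
    lines = [
        # Compact separators reproduce the single-line form shown above.
        json.dumps({"delete": {"_type": doc_type, "_id": doc_id}}, separators=(",", ":"))
        for doc_id in ids
    ]
    # Every action, including the last one, must end with a newline.
    return "".join(line + "\n" for line in lines)


body = build_bulk_delete_body([1837, 1842, 1847])
print(body, end="")
```

Appending the newline to every line, and not joining the lines with newlines, is what guarantees the terminating newline the bulk request requires.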

The delete is executed as follows. First, specify the content type as application/x-ndjson. Next, pass the input file using the --data-binary option of cURL. Unlike the plain -d option, which strips newlines when reading from a file, --data-binary sends the file exactly as it is, so the newlines between actions and the terminating newline are left intact.

curl -H "Content-Type: application/x-ndjson" --data-binary @deleteById.json -XPOST "localhost:9200/globallandtemperatures_globaltemperatures.csv/_bulk"
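For readers who prefer to issue the request from code, here is a rough Python equivalent of the curl command using only the standard library. It is a sketch, not a definitive client: the helper name build_bulk_request is hypothetical, the URL is assumed to match the curl command above, and the request is built but not sent.

```python
import urllib.request

# Assumed endpoint, copied from the curl command above.
BULK_URL = "http://localhost:9200/globallandtemperatures_globaltemperatures.csv/_bulk"


def build_bulk_request(ndjson_body):
    """Build (but do not send) the POST request for an NDJSON bulk body."""
    if not ndjson_body.endswith("\n"):
        raise ValueError("bulk body must end with a newline")
    return urllib.request.Request(
        BULK_URL,
        data=ndjson_body.encode("utf-8"),  # bytes are sent as-is, newlines included
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )


bulk_request = build_bulk_request('{"delete":{"_type":"doc","_id":1837}}\n')
print(bulk_request.get_method(), bulk_request.full_url)
# urllib.request.urlopen(bulk_request) would send it to a running cluster.
```

The explicit check for the trailing newline plays the same role as --data-binary: it catches a body that lost its terminator before the request is sent.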

The response is something like this.

{
  "errors": false,
  "took": 208,
  "items": [
    {
      "delete": {
        "status": 404,
        "_type": "doc",
        "_seq_no": 3215,
        "_shards": {
          "successful": 1,
          "failed": 0,
          "total": 1
        },
        "_index": "globallandtemperatures_globaltemperatures.csv",
        "_primary_term": 1,
        "_version": 1,
        "result": "not_found",
        "_id": "1837"
      }
    },
    {
      "delete": {
        "status": 200,
        "_type": "doc",
        "_seq_no": 3216,
        "_shards": {
          "successful": 1,
          "failed": 0,
          "total": 1
        },
        "_index": "globallandtemperatures_globaltemperatures.csv",
        "_primary_term": 1,
        "_version": 2,
        "result": "deleted",
        "_id": "1838"
      }
    },
...

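Note that errors is set to false even though not all records were deleted. A delete that targets a missing id is reported as not_found, not as an error, so this flag only indicates that the request as a whole completed without errors.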

You can find the status of each delete request by looking at the items in the rest of the response. For example, here is an item showing a successful delete. The status is set to 200, and the result to deleted.

...
    {
      "delete": {
        "status": 200,
        "_type": "doc",
        "_seq_no": 3216,
        "_shards": {
          "successful": 1,
          "failed": 0,
          "total": 1
        },
        "_index": "globallandtemperatures_globaltemperatures.csv",
        "_primary_term": 1,
        "_version": 2,
        "result": "deleted",
        "_id": "1838"
      }
    },
...

And here is a delete request that deleted nothing because the record with the specified id was not present. Here the status is 404, and the result is not_found.

...
    {
      "delete": {
        "status": 404,
        "_type": "doc",
        "_seq_no": 3215,
        "_shards": {
          "successful": 1,
          "failed": 0,
          "total": 1
        },
        "_index": "globallandtemperatures_globaltemperatures.csv",
        "_primary_term": 1,
        "_version": 1,
        "result": "not_found",
        "_id": "1837"
      }
    },
...
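Since the errors flag alone does not say which records were actually removed, it can help to summarize the per-item results. The following Python sketch groups ids by result; the function name summarize_bulk_deletes is hypothetical, and the sample data is a trimmed copy of the two response items shown above.

```python
import json

# Trimmed copy of the two response items shown above.
response_text = """
{
  "errors": false,
  "took": 208,
  "items": [
    {"delete": {"status": 404, "result": "not_found", "_id": "1837"}},
    {"delete": {"status": 200, "result": "deleted", "_id": "1838"}}
  ]
}
"""


def summarize_bulk_deletes(response):
    """Group document ids by the per-item result reported in a bulk response."""
    summary = {}
    for item in response["items"]:
        outcome = item["delete"]
        summary.setdefault(outcome["result"], []).append(outcome["_id"])
    return summary


response = json.loads(response_text)
summary = summarize_bulk_deletes(response)
print(summary)  # {'not_found': ['1837'], 'deleted': ['1838']}
```

Any id listed under not_found was never in the index, so a caller can decide whether that is worth reporting.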

Deleting records using Argon

Argon uses the bulk-delete request to delete records from Elasticsearch when doing a Cut or a Delete operation. Here is how it is performed.

1. Load the data in the Explorer View, and select records to be deleted.

Hold down the Control key to select non-contiguous records.

2. Right-click to open the context menu and select Cut or Delete.

3. Confirm the deletion when prompted, and the records will be deleted.

4. If you choose Cut (or Copy), you can paste the records as CSV into a text editor.
