Elasticsearch Aggregation by Day of Week and Hour of Day


Re-post from my answer here: https://stackoverflow.com/a/31851896/6247

Does this help:

"aggregations": {
    "timeslice": {
        "histogram": {
            "script": "doc['timestamp'].value.getHourOfDay()",
            "interval": 1,
            "min_doc_count": 0,
            "extended_bounds": {
                "min": 0,
                "max": 23
            },
            "order": {
                "_key": "desc"
            }
        }
    }
}

This is nice because it also includes any hours with zero results (min_doc_count is 0), and it extends the buckets to cover the entire 24-hour period (thanks to extended_bounds).
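To make the effect of min_doc_count: 0 plus extended_bounds concrete, here is a rough client-side sketch in Python of what the two settings do to the bucket list. Elasticsearch does this for you on the server; the function name fill_hours and the sample buckets are made up purely for illustration.

```python
def fill_hours(buckets, lo=0, hi=23):
    """Return one (hour, doc_count) pair per hour in [lo, hi], highest hour first."""
    # Histogram keys come back as numbers (often floats), so normalise to int.
    counts = {int(b["key"]): b["doc_count"] for b in buckets}
    # Descending order mirrors the "_key": "desc" ordering used in the query.
    return [(h, counts.get(h, 0)) for h in range(hi, lo - 1, -1)]


# Hypothetical sparse result: only two hours actually contain documents.
sparse = [{"key": 9.0, "doc_count": 3}, {"key": 17.0, "doc_count": 5}]
filled = fill_hours(sparse)
print(len(filled))   # 24 buckets, one per hour
print(filled[6])     # (17, 5)
```

Without the two settings you would only get back the two non-empty buckets and would have to do this padding yourself.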

You can use 'getDayOfWeek', 'getHourOfDay', ... (see 'Joda time' for more).

This is great for hours, but for days/months it'll give you a number rather than the day or month name. To work around that, you can get the timeslot as a string. However, this won't work with the extended bounds approach, so days with no documents will simply be missing from the results (e.g. you might only get [Mon, Tues, Fri, Sun]).

In case you want that, here it is:

"aggregations": {
    "dayOfWeek": {
        "terms": {
            "script": "doc['timestamp'].value.getDayOfWeek().getAsText()",
            "order": {
                "_term": "asc"
            }
        }
    }
}

Even if this doesn't help you, hopefully someone else will find it and benefit from it.
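If you need both day names and zero-filled days, one option is to keep the numeric histogram (script getDayOfWeek(), extended_bounds from 1 to 7) and translate the keys on the client instead. A minimal Python sketch, assuming the Joda-Time/ISO numbering where 1 is Monday and 7 is Sunday; the function name and the sample buckets are invented for illustration:

```python
# Joda-Time numbers days per ISO 8601: 1 = Monday ... 7 = Sunday.
DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def label_days(buckets):
    """Map numeric day-of-week histogram buckets to (name, doc_count) pairs."""
    return [(DAY_NAMES[int(b["key"]) - 1], b["doc_count"]) for b in buckets]


# Hypothetical buckets from a histogram on getDayOfWeek() with
# extended_bounds min 1 / max 7 and ascending key order.
sample = [{"key": float(d), "doc_count": n}
          for d, n in zip(range(1, 8), [4, 0, 7, 0, 2, 0, 1])]
labeled = label_days(sample)
print(labeled)
```

This keeps the empty days in the output, at the cost of doing the name lookup outside Elasticsearch.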


The same kind of problem has been solved in this thread.

Adapting the solution to your problem, we need a script that converts the date into the day of week and hour of day:

Date date = new Date(doc['created_time'].value);
java.text.SimpleDateFormat format = new java.text.SimpleDateFormat('EEE, HH');
format.format(date)

And use it in a query:

{
    "aggs": {
        "perWeekDay": {
            "terms": {
                "script": "Date date = new Date(doc['created_time'].value); java.text.SimpleDateFormat format = new java.text.SimpleDateFormat('EEE, HH'); format.format(date)"
            }
        }
    }
}
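With the 'EEE, HH' pattern each bucket key should look something like "Wed, 14". Here is a small Python sketch of turning such buckets into a day-by-hour grid on the client. It assumes English day abbreviations (SimpleDateFormat follows the JVM's locale), and the function name and sample buckets are hypothetical. Also keep in mind that a terms aggregation only returns the top 10 buckets by default, so you would probably want to set "size" to at least 168 (7 x 24) to see every combination.

```python
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def to_grid(buckets):
    """Build a {day: [24 hourly counts]} grid from buckets keyed like 'Wed, 14'."""
    grid = {day: [0] * 24 for day in DAYS}
    for b in buckets:
        day, hour = b["key"].split(", ")
        grid[day][int(hour)] += b["doc_count"]
    return grid


# Hypothetical buckets as the perWeekDay aggregation might return them.
grid = to_grid([
    {"key": "Wed, 14", "doc_count": 3},
    {"key": "Mon, 09", "doc_count": 1},
])
print(grid["Wed"][14], grid["Mon"][9])
```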


The simplest way would be to define a dedicated day-of-week field that holds only the day of the week for each document, then do a terms aggregation on that field.
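As a sketch of that approach, the field can be computed in whatever code sends documents to Elasticsearch. The Python below assumes dates arrive as strings like "Wed, 11 Mar 2015" with English day and month names; the field name day_of_week and the helper name are my own choices, not anything Elasticsearch requires.

```python
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def with_day_of_week(doc, date_field="date", fmt="%a, %d %b %Y"):
    """Return a copy of doc with an extra 'day_of_week' field derived from date_field."""
    parsed = datetime.strptime(doc[date_field], fmt)
    # weekday() is 0 for Monday, so it indexes DAY_NAMES directly.
    return {**doc, "day_of_week": DAY_NAMES[parsed.weekday()]}


enriched = with_day_of_week({"msg": "hello", "date": "Wed, 11 Mar 2015"})
print(enriched["day_of_week"])
```

Mapping the new field as a non-analyzed string then lets a plain terms aggregation on it return one bucket per day.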

If for whatever reason you don't want to do that (or can't), here is a hack that might help you get what you want. The basic idea is to define a "date.raw" sub-field that is a string, analyzed with the standard analyzer, so that each date is split into lowercase terms, one of which is the day of the week ("wed", "tue", ...). Then you can aggregate on those terms to get your counts, using "include" to keep only the day-of-week terms and drop the rest (day number, month, year).

Here is the mapping I used for testing:

PUT /test_index
{
   "settings": {
      "number_of_shards": 1
   },
   "mappings": {
      "doc": {
         "properties": {
            "msg": {
               "type": "string"
            },
            "date": {
               "type": "date",
               "format": "E, dd MMM yyyy",
               "fields": {
                  "raw": {
                     "type": "string"
                  }
               }
            }
         }
      }
   }
}

and a few sample docs:

POST /test_index/_bulk
{"index":{"_index":"test_index","_type":"doc","_id":1}}
{"msg": "hello","date": "Wed, 11 Mar 2015"}
{"index":{"_index":"test_index","_type":"doc","_id":2}}
{"msg": "hello","date": "Tue, 10 Mar 2015"}
{"index":{"_index":"test_index","_type":"doc","_id":3}}
{"msg": "hello","date": "Mon, 09 Mar 2015"}
{"index":{"_index":"test_index","_type":"doc","_id":4}}
{"msg": "hello","date": "Wed, 04 Mar 2015"}

and the aggregation and results:

POST /test_index/_search?search_type=count
{
    "aggs": {
        "docs_by_day": {
            "terms": {
                "field": "date.raw",
                "include": "mon|tue|wed|thu|fri|sat|sun"
            }
        }
    }
}
...
{
   "took": 2,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 4,
      "max_score": 0,
      "hits": []
   },
   "aggregations": {
      "docs_by_day": {
         "buckets": [
            {
               "key": "wed",
               "doc_count": 2
            },
            {
               "key": "mon",
               "doc_count": 1
            },
            {
               "key": "tue",
               "doc_count": 1
            }
         ]
      }
   }
}

Here is the code all together:

http://sense.qbox.io/gist/0292ddf8a97b2d96bd234b787c7863a4bffb14c5
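The lowercase keys in the results ("wed" rather than "Wed") come from the standard analyzer, which splits the date string into tokens and lowercases them. The Python below is only a rough approximation of that behaviour, not the real analyzer, but it reproduces the bucket counts for the four sample dates and shows why the include filter is needed. The function name is made up for illustration.

```python
import re
from collections import Counter

DAY_TOKENS = {"mon", "tue", "wed", "thu", "fri", "sat", "sun"}


def day_counts(dates):
    """Count documents per day-of-week token, mimicking the include filter."""
    counts = Counter()
    for text in dates:
        # Rough stand-in for the standard analyzer: lowercase, split on
        # anything that is not a letter or digit.
        tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
        # Tokens such as "11", "mar" and "2015" are dropped here, which is
        # the job "include" does in the aggregation.
        counts.update(tokens & DAY_TOKENS)
    return counts


dates = ["Wed, 11 Mar 2015", "Tue, 10 Mar 2015",
         "Mon, 09 Mar 2015", "Wed, 04 Mar 2015"]
counts = day_counts(dates)
print(counts.most_common())
```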