What is Elasticsearch?


Elasticsearch is a database, but not a relational one like you may be used to. It is a NoSQL database.

You insert JSON documents into an index. You query that index to find documents that match a particular criterion.
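As a rough sketch of those two steps, the snippet below builds the REST requests for storing one document and then searching for it. It only constructs the requests and sends nothing over the network; the index name `posts`, the field names and the two helper functions are made up for illustration.

```python
import json


def index_request(index, doc_id, document):
    """Build the request that stores one JSON document in an index."""
    return "PUT", f"/{index}/_doc/{doc_id}", json.dumps(document)


def search_request(index, field, text):
    """Build a full-text search for documents whose field matches text."""
    body = {"query": {"match": {field: text}}}
    return "POST", f"/{index}/_search", json.dumps(body)


# Hypothetical document, for illustration only.
doc = {"title": "What is Elasticsearch", "author": "alice", "votes": 12}

for method, path, body in (
    index_request("posts", 1, doc),
    search_request("posts", "title", "elasticsearch"),
):
    print(method, path)
    print(body)
```

Sending these with any HTTP client (or using an official client library instead) is left out here to keep the example self-contained.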

It is also sharded and distributed across nodes, which gives it resilience and scalability, and also - if you set it up right - performance.
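To give a feel for what sharding means, here is a simplified sketch of how a document can be assigned to a shard: hash a routing key (by default the document ID) and take it modulo the number of primary shards. The function name and the document IDs are hypothetical, and `zlib.crc32` is only a stand-in for the Murmur3 hash that Elasticsearch itself uses.

```python
import zlib


def pick_shard(routing_key: str, num_primary_shards: int) -> int:
    """Map a routing key (by default the document ID) to a shard number."""
    # zlib.crc32 is a stand-in hash for this sketch, not what Elasticsearch uses.
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


# Hypothetical document IDs spread over three primary shards.
for doc_id in ("log-1", "log-2", "log-3", "log-4"):
    print(doc_id, "-> shard", pick_shard(doc_id, 3))
```

Because each shard holds only part of the index, a search can run on all shards in parallel and the partial results are then merged.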

This means it's really good at 'search engine' style database queries, but because it's not relational, it cannot do the equivalent of a SQL JOIN operation very easily.

One example use case is the ELK stack (Elasticsearch, Logstash and Kibana), where system event logs (syslog, httpd logs, that kind of thing) are processed by Logstash to extract metadata - like log source, referrer, URL, session ID, etc. - and then inserted into Elasticsearch.
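The parsing step can be sketched in a few lines of Python. This is not how Logstash is configured (it has its own pattern syntax); it just shows the idea of turning one raw access-log line into a JSON document with named fields. The regular expression, the field names, the `source` value and the sample line are all assumptions made for this example.

```python
import json
import re

# Rough pattern for an Apache "combined" access-log line.
LOG_PATTERN = re.compile(
    r'(?P<client_ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_log_line(line, source):
    """Turn one raw log line into a dict ready to be indexed, or None."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    event = match.groupdict()
    event["status"] = int(event["status"])
    event["bytes"] = 0 if event["bytes"] == "-" else int(event["bytes"])
    event["source"] = source  # which server the line came from
    return event


# Made-up sample line, for illustration only.
sample = ('203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] '
          '"GET /forum/index.html HTTP/1.1" 200 2326 '
          '"http://www.example.com/start.html" "Mozilla/5.0"')

print(json.dumps(parse_log_line(sample, source="web-01"), indent=2))
```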

Each event is a self-contained piece of information, which is exactly the kind of data Elasticsearch handles particularly well.

You can then use Kibana as a visualisation engine to display your logs, and also to perform analysis - most-hit pages, geographic distribution of requests, incoming referrers, time-based distribution of requests, etc.

But it also collates these logs, so if you run a really large, geographically distributed website with multiple webserver nodes - or maybe you just have a lot of servers in your computer room and want to summarise the system logs - you can feed the whole lot into Elasticsearch.

Its design makes it good at handling near-real-time data insertion and analysis. It also works quite well for 'forum style' data models, as essentially all you're doing is querying a list of posts with a particular forum name, and finding replies to a particular parent node - but they're standalone 'documents'.

So yes, you probably could use it to search an existing database, but you'll have to think about your data model - you can't just translate a conventional relational model, you would have to flatten it. Denormalisation is something of a sin in RDBMS terms, but it's actually quite good for search engines, because you can execute queries in parallel more efficiently.
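To make the flattening concrete, here is a minimal sketch using a made-up forum schema. The tables, column names and values are hypothetical; the point is only that the joined columns get copied into each post, so every document stands alone.

```python
# Hypothetical relational rows: lookup tables plus a posts table with foreign keys.
forums = {1: {"name": "databases"}}
users = {7: {"name": "alice"}, 8: {"name": "bob"}}
posts = [
    {"id": 100, "forum_id": 1, "user_id": 7, "parent_id": None,
     "body": "What is Elasticsearch?"},
    {"id": 101, "forum_id": 1, "user_id": 8, "parent_id": 100,
     "body": "A search-oriented document store."},
]


def flatten_post(post):
    """Copy the joined columns into the post so it needs no JOIN at query time."""
    return {
        "id": post["id"],
        "parent_id": post["parent_id"],
        "body": post["body"],
        "forum_name": forums[post["forum_id"]]["name"],
        "author_name": users[post["user_id"]]["name"],
    }


documents = [flatten_post(p) for p in posts]

# With flat documents, "replies to post 100 in forum 'databases'" is a plain filter.
replies = [d for d in documents
           if d["forum_name"] == "databases" and d["parent_id"] == 100]
print(replies)
```

The price of this layout is duplication: renaming a forum would mean updating every post document that carries the old name.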


Databases cannot be optimized for all use cases, but luckily there are many databases available so we can choose the best one for each task.

Elasticsearch is optimized for:

  • Filtering of documents (exact match)
  • Search ranking of documents (relevance of search terms)
  • Aggregation of results (sums, distinct counts, percentiles, ...)
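The three strengths above map fairly directly onto three kinds of search request body. The sketch below writes them as Python dictionaries in the shape the `_search` endpoint accepts; the field names (`status`, `body`, `bytes`, `client_ip`, `response_ms`) are invented for the example.

```python
import json

# Exact-match filtering: a yes/no test per document, no relevance scoring.
filter_query = {
    "query": {"bool": {"filter": [{"term": {"status": 404}}]}}
}

# Relevance-ranked full-text search on a text field.
ranked_query = {
    "query": {"match": {"body": "denormalised data model"}}
}

# Aggregations only; "size": 0 leaves the matching documents out of the response.
aggregation_query = {
    "size": 0,
    "aggs": {
        "total_bytes": {"sum": {"field": "bytes"}},
        "distinct_clients": {"cardinality": {"field": "client_ip"}},
        "latency_percentiles": {"percentiles": {"field": "response_ms"}},
    },
}

for name, body in (
    ("filter", filter_query),
    ("ranked", ranked_query),
    ("aggregation", aggregation_query),
):
    print(name, json.dumps(body))
```

Note that the `cardinality` aggregation returns an approximate distinct count rather than an exact one.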

Neo4j is optimized for:

  • Graph traversal (naturally)
  • High performance when operating on a "local" graph neighborhood (context)

Both databases actually rely on the same underlying library, Lucene, to "index" data to be searched later: Elasticsearch is built on top of it, and Neo4j uses it for its full-text indexes.