What is Elasticsearch and how does it work?


Elasticsearch is an open-source, distributed, multi-tenant-capable search engine. The term multi-tenant refers to an architecture in which a single instance of the software runs on a server and serves more than one user at the same time.

SQL database management systems are not really designed for full-text search. A full-text query there can take on the order of 10 seconds to list its results, whereas Elasticsearch can perform a full-text search and return the results within a few milliseconds.

Elasticsearch is based on Lucene, which is an open-source full-text search engine library written in Java.

Indexing and Retrieving data :

Indexing :

The act of storing data in Elasticsearch is called indexing. In Elasticsearch, each piece of data is stored as a simple JSON document. Data can be stored using an HTTP PUT request.

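An Elasticsearch cluster can contain multiple indices, which are similar to databases in SQL. Each index may contain multiple types. These types hold multiple documents, and each document has multiple fields. (Mapping types belong to older versions: they were deprecated in Elasticsearch 7.x and removed in 8.0, where _doc takes the place of the type name in the request path.)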

PUT /index/type/Id

Example :

Let's imagine a school's records of all its students.

Here the indices may be class-11 and class-12 (Elasticsearch index names must be lowercase).

The type will be student. Each student's data is stored under the ID of that particular student. The documents stored under this type may contain fields such as name, register number, father's name, etc.

eg: PUT /class-11/student/1
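As a rough illustration of the PUT request above, the following Python sketch builds (but does not send) an indexing request using only the standard library. The node address http://localhost:9200, the helper name build_index_request, and the student's field values are assumptions made for this example, not part of any official client. The typed path follows this article; on Elasticsearch 7 and later the type segment would be _doc.

```python
import json
import urllib.request

# Assumption: a single local node on the default port; adjust for your cluster.
BASE_URL = "http://localhost:9200"


def build_index_request(index, doc_type, doc_id, document):
    """Build (but do not send) a PUT /index/type/id request carrying a JSON document."""
    url = f"{BASE_URL}/{index}/{doc_type}/{doc_id}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Illustrative student record; the field names and values are made up.
student = {"name": "Asha", "register_number": "1101", "father_name": "Ravi"}

# Index names must be lowercase, hence "class-11".
request = build_index_request("class-11", "student", 1, student)
print(request.get_method(), request.full_url)

# To actually send it to a running node, something like this should work:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Keeping the request construction separate from the network call makes it easy to inspect what would be sent before pointing it at a real cluster.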

Retrieving :

Retrieving a document is easy in Elasticsearch: we simply execute an HTTP GET request and specify the address of the document.

GET /index/type/Id

eg: GET /class-11/student/1
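The retrieval step can be sketched in Python in a similar way. The node address, the helper names build_get_request and extract_source, and the sample response values are assumptions for illustration. The response shape (a found flag plus the original JSON under _source) is what the Get API returns.

```python
import urllib.request

# Assumption: a single local node on the default port; adjust for your cluster.
BASE_URL = "http://localhost:9200"


def build_get_request(index, doc_type, doc_id):
    """Build (but do not send) a GET /index/type/id request."""
    url = f"{BASE_URL}/{index}/{doc_type}/{doc_id}"
    return urllib.request.Request(url, method="GET")


def extract_source(response_body):
    """Return the stored JSON document, or None if the document was not found."""
    if not response_body.get("found", False):
        return None
    return response_body["_source"]


request = build_get_request("class-11", "student", 1)
print(request.get_method(), request.full_url)

# Hand-written sample in the shape of a Get API response; the values are made up.
sample_response = {
    "_index": "class-11",
    "_type": "student",
    "_id": "1",
    "found": True,
    "_source": {"name": "Asha", "register_number": "1101"},
}
print(extract_source(sample_response))
```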

Every feature of Elasticsearch is exposed as a REST API:

Index API: Used to index a document.

Get API: Used to retrieve the document.

Search API: Used to submit your query and get a result.

Put Mapping API: Used to override the field types Elasticsearch would choose by default and define the mapping yourself.
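To make the list above more concrete, here is a small Python sketch that maps each API to an HTTP method and a path template. The dictionary API_ENDPOINTS and the helper request_line are hypothetical names for this example. The paths use the typed /index/type/id style of this article, which applies to older (pre-7.x) versions; newer versions drop the type segment.

```python
# Method and path template for each API named above, in the typed style
# used by this article (older Elasticsearch versions).
API_ENDPOINTS = {
    "index": ("PUT", "/{index}/{type}/{id}"),
    "get": ("GET", "/{index}/{type}/{id}"),
    "search": ("GET", "/{index}/_search"),
    "put_mapping": ("PUT", "/{index}/_mapping/{type}"),
}


def request_line(api, **params):
    """Return 'METHOD /path' for one API; parameters a template does not use are ignored."""
    method, template = API_ENDPOINTS[api]
    return f"{method} {template.format(**params)}"


for api in API_ENDPOINTS:
    print(request_line(api, index="class-11", type="student", id=1))
```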

Filtering Queries and Aggregations :

Elasticsearch has a single set of components called queries, which can be mixed and matched in endless combinations.

This single set of components can be used in two contexts:

  1. Filtering Context
  2. Query Context

When used in a filtering context, the query is said to be a “non-scoring” or “filtering” query.

When used in a query context, the query becomes a “scoring” query.

A non-scoring query simply filters the data: it checks whether each document matches the given query, yes or no.

A scoring query calculates how relevant each document is to the query, and assigns it a relevance _score, which is later used to sort matching documents by relevance.
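The two contexts can appear side by side in one search request. The Python sketch below builds such a request body with a bool query: the clause under must runs in query context and contributes to _score, while the clause under filter only includes or excludes documents. The helper name build_search_body and the section field are assumptions made for this example; section is not one of the fields mentioned earlier.

```python
import json


def build_search_body(name_text, section):
    """Build a search body that mixes query context and filter context."""
    return {
        "query": {
            "bool": {
                # Query context: how well does the name match? Affects _score.
                "must": [{"match": {"name": name_text}}],
                # Filter context: a yes/no check that does not affect _score.
                # "section" is a hypothetical field used only for illustration.
                "filter": [{"term": {"section": section}}],
            }
        }
    }


body = build_search_body("asha", "A")
print(json.dumps(body, indent=2))
```

A body like this would typically be sent to the Search API, for example to /class-11/_search.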

Aggregations :

An aggregation summarizes data. The query selects a certain subset of documents, and the aggregation operates on those documents, for example to count or group them.
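As a sketch of how a query and an aggregation combine, the Python snippet below builds a search body in which the query selects documents and a terms aggregation then counts them per group. The helper name build_aggregation_body, the aggregation name students_per_section, and the section field are assumptions for illustration only.

```python
import json


def build_aggregation_body(name_text):
    """Build a search body whose aggregation runs over the documents the query matches."""
    return {
        "size": 0,  # return only the aggregation results, not the matching documents
        "query": {"match": {"name": name_text}},
        "aggs": {
            # Hypothetical aggregation name and field: count matches per section.
            "students_per_section": {"terms": {"field": "section"}}
        },
    }


body = build_aggregation_body("asha")
print(json.dumps(body, indent=2))
```

In the search response, the results for this sketch would appear under aggregations.students_per_section.buckets, one bucket per distinct section value with its document count.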