Deploy a highly available Elasticsearch cluster

Introduction

Deploying a high-availability Elasticsearch cluster is a crucial step for organizations that rely on Elasticsearch for their search and analytics needs.

A highly available cluster ensures data reliability and system uptime even in the face of hardware failures or network issues.

In this tutorial, we'll guide you through the process of setting up a high-availability Elasticsearch cluster using three nodes for redundancy.
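Three nodes is the usual minimum because master election relies on a majority of the master-eligible nodes being reachable. The short Python sketch below works through that arithmetic. The function names are illustrative only and are not part of any Elasticsearch API.

```python
def quorum(master_eligible_nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def tolerated_failures(master_eligible_nodes: int) -> int:
    """How many master-eligible nodes can be lost while a majority remains."""
    return master_eligible_nodes - quorum(master_eligible_nodes)


for n in (1, 2, 3, 5):
    print(f"{n} node(s): quorum={quorum(n)}, "
          f"tolerated failures={tolerated_failures(n)}")
```

With three nodes the majority is two, so the cluster can lose any one node and still elect a master. Two nodes offer no such margin, which is why this tutorial uses three.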

Prerequisites

Before you begin, ensure that you have the following prerequisites in place:

  1. Linux Servers: You'll need three Linux servers. These can be virtual machines or physical servers. For this tutorial, we'll refer to them as node1, node2, and node3.
  2. Elasticsearch Installation: Install Elasticsearch on each server. You can follow the official Elasticsearch documentation for your specific Linux distribution to do this.
  3. Network Configuration: Ensure that your servers can communicate with each other over the network. You may need to configure firewalls or security groups to allow Elasticsearch traffic between the nodes.
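  4. Java: Elasticsearch runs on the JVM. Releases from 7.0 onward ship with a bundled OpenJDK, so a separate installation is usually unnecessary. Older releases need a compatible Java version (typically Java 8 or 11) installed on each server.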
  5. Elasticsearch Configuration: Edit the Elasticsearch configuration file (elasticsearch.yml) on each node to specify cluster settings. You should set the cluster name and specify the IP address or hostname of each node.
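For the network prerequisite, a quick reachability check can save time later. The sketch below is one possible approach, assuming the default Elasticsearch ports (9200 for the HTTP API, 9300 for node-to-node transport). The helper names are hypothetical.

```python
import socket

# Default Elasticsearch ports: 9200 serves the HTTP API, 9300 carries
# node-to-node transport traffic. Adjust if your installation differs.
ES_PORTS = (9200, 9300)


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def unreachable(hosts, ports=ES_PORTS):
    """List the (host, port) pairs that could not be reached."""
    return [(h, p) for h in hosts for p in ports if not port_open(h, p)]
```

Calling unreachable(["node2-ip-address", "node3-ip-address"]) from node1 should return an empty list once the firewall rules are in place. A refused connection can also simply mean that Elasticsearch is not running yet, so the check is most informative after the nodes have been started.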

Step 1: Configure Elasticsearch Cluster Settings

Edit the elasticsearch.yml configuration file on each node to configure the cluster settings. Here's a sample configuration for node1:

# node1 elasticsearch.yml
cluster.name: my-cluster
node.name: node1
network.host: node1-ip-address
discovery.seed_hosts: ["node1-ip-address", "node2-ip-address", "node3-ip-address"]
# Needed on Elasticsearch 7.0 and later when bootstrapping a brand-new cluster;
# remove it once the cluster has formed.
cluster.initial_master_nodes: ["node1", "node2", "node3"]

Repeat this configuration on node2 and node3, changing only node.name and network.host. The cluster.name, discovery.seed_hosts, and cluster.initial_master_nodes values must be identical on all three nodes.
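To avoid copy-and-paste mistakes across the three files, the configuration can be generated from a single inventory. The following is a minimal sketch, not an official tool: the NODES mapping and the render_config helper are hypothetical, and the cluster.initial_master_nodes line assumes Elasticsearch 7.0 or later on the first start of a new cluster.

```python
# Hypothetical inventory: node names mapped to placeholder addresses.
NODES = {
    "node1": "node1-ip-address",
    "node2": "node2-ip-address",
    "node3": "node3-ip-address",
}


def render_config(node_name: str, nodes: dict, cluster_name: str = "my-cluster") -> str:
    """Return the elasticsearch.yml text for one node of the cluster."""
    hosts = ", ".join(f'"{host}"' for host in nodes.values())
    names = ", ".join(f'"{name}"' for name in nodes)
    lines = [
        f"# {node_name} elasticsearch.yml",
        f"cluster.name: {cluster_name}",
        f"node.name: {node_name}",             # differs per node
        f"network.host: {nodes[node_name]}",   # differs per node
        f"discovery.seed_hosts: [{hosts}]",    # same on every node
        # Assumption: Elasticsearch 7.0+, bootstrapping a new cluster.
        f"cluster.initial_master_nodes: [{names}]",
    ]
    return "\n".join(lines) + "\n"


for name in NODES:
    print(render_config(name, NODES))
```

Only the node.name and network.host lines change between the three outputs. The discovery settings are rendered from the same inventory and therefore stay identical.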

Step 2: Start Elasticsearch on Each Node

Start Elasticsearch on each node using the following command:

sudo service elasticsearch start

Ensure that Elasticsearch starts without errors on all three nodes.

Step 3: Verify Cluster Status

You can use the following command on any node to check the cluster health status:

 curl -X GET "http://node1-ip-address:9200/_cluster/health"

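Replace node1-ip-address with the actual IP address or hostname of one of your nodes. The response is a JSON document. For a healthy three-node cluster, expect "status" to be "green" (or "yellow" if some replica shards are not yet assigned) and "number_of_nodes" to be 3. If security features are enabled, which is the default since Elasticsearch 8.0, use https and supply credentials.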
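If you want to check the health response in a script rather than by eye, something along these lines may help. It is a sketch under assumptions: health_problems is a hypothetical helper, and the sample string is a made-up fragment for illustration, not real cluster output. The field names status, number_of_nodes, and unassigned_shards are ones the cluster health API returns.

```python
import json


def health_problems(raw_json: str, expected_nodes: int = 3) -> list:
    """Return human-readable problems found in a _cluster/health response."""
    health = json.loads(raw_json)
    problems = []
    status = health.get("status")
    if status != "green":
        problems.append(f"status is {status!r}, expected 'green'")
    nodes = health.get("number_of_nodes", 0)
    if nodes < expected_nodes:
        problems.append(f"only {nodes} of {expected_nodes} nodes have joined")
    unassigned = health.get("unassigned_shards", 0)
    if unassigned > 0:
        problems.append(f"{unassigned} shard(s) unassigned")
    return problems


# Made-up response fragment, for illustration only.
sample = ('{"cluster_name": "my-cluster", "status": "yellow", '
          '"number_of_nodes": 2, "unassigned_shards": 1}')
for problem in health_problems(sample):
    print(problem)
```

An empty list means all three nodes have joined and every shard is assigned.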

Step 4: Configure Shard and Replica Settings

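Since version 7.0, new Elasticsearch indexes default to one primary shard and one replica (earlier versions defaulted to five primary shards). To ensure high availability, you can adjust these settings based on your requirements. For example, setting the number of replicas to two gives three copies of each shard (one primary plus two replicas), so on a three-node cluster every node holds a copy of every shard. The replica count can be changed at any time through the index settings API or specified when creating an index. The number of primary shards is fixed when the index is created.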
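As an illustration of changing the replica count on an existing index, the sketch below builds the corresponding update-settings request with Python's standard library. The index name my-index and the helper names are hypothetical, and the URL assumes an unsecured HTTP endpoint as used elsewhere in this tutorial.

```python
import json
import urllib.request


def replica_settings_request(base_url: str, index: str, replicas: int) -> urllib.request.Request:
    """Build (but do not send) a PUT /<index>/_settings request."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def copies_per_shard(replicas: int) -> int:
    """Total copies of each shard: one primary plus its replicas."""
    return 1 + replicas


# "my-index" is a placeholder index name.
req = replica_settings_request("http://node1-ip-address:9200", "my-index", 2)
print(req.get_method(), req.full_url, req.data.decode())
print("copies of each shard:", copies_per_shard(2))
```

Passing req to urllib.request.urlopen sends the request. Keep the replica count at or below the number of nodes minus one, since Elasticsearch does not place two copies of the same shard on one node and any extra replicas would stay unassigned.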

Step 5: Monitor and Maintain the Cluster

Set up monitoring and alerting for your Elasticsearch cluster to proactively detect issues and ensure its ongoing health. Tools like Kibana can help you visualize and manage cluster performance.
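Where a full monitoring stack is not yet in place, even a small polling script can surface status changes. The following is a rough sketch rather than a replacement for proper monitoring: fetch_status and watch are hypothetical helpers, and the alert callback simply prints by default.

```python
import json
import time
import urllib.request


def fetch_status(base_url: str, timeout: float = 5.0) -> str:
    """Read the "status" field from the cluster health endpoint."""
    with urllib.request.urlopen(f"{base_url}/_cluster/health", timeout=timeout) as resp:
        return json.load(resp)["status"]


def watch(fetch, polls: int, interval: float = 0.0, alert=print) -> str:
    """Call fetch() `polls` times and alert whenever the status changes."""
    last = None
    for i in range(polls):
        try:
            status = fetch()
        except OSError:
            # urllib's URLError is an OSError, so connection failures land here.
            status = "unreachable"
        if status != last:
            alert(f"cluster status: {last} -> {status}")
            last = status
        if i < polls - 1:
            time.sleep(interval)
    return last
```

For example, watch(lambda: fetch_status("http://node1-ip-address:9200"), polls=60, interval=30) would check every 30 seconds for roughly half an hour. In practice you would point the alert callback at email, chat, or a paging system.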

Regularly update Elasticsearch and your operating system to patch security vulnerabilities and improve performance.

Conclusion

You've successfully deployed a high-availability Elasticsearch cluster with three nodes.

This setup provides redundancy and fault tolerance, ensuring that your Elasticsearch cluster remains available even in the face of hardware failures or other issues.

Be sure to monitor your cluster's health and keep your Elasticsearch and operating system up to date for ongoing reliability and security.