Deployment considerations and recommendations
An Elasticsearch cluster is formed comprising all the available Data Node appliances, to support storing NTA records and making them available for search. The Elasticsearch cluster automatically balances the load across the nodes by redistributing data and routing queries.
About cluster size:
One-node cluster
- If the cluster consists of one node, that single node must do everything.
- A single-node cluster is not resilient.
- If the node fails, the cluster will stop working.
- Because there are no replicas in a one-node cluster, you cannot store your data redundantly.
- Because they are not resilient to any failures, we do not recommend using one-node clusters in production.
Two-node cluster
- The client requests will be balanced across both nodes in the cluster.
- A master election requires both nodes, so the election fails if either node is unavailable. The cluster therefore cannot reliably tolerate the loss of either node.
Because it is not resilient to failures, and the failure of one node can leave the cluster without a master (the safeguard against a "split brain" situation), we do not recommend deploying a two-node cluster in production.
You might expect that if either node fails then Elasticsearch can elect the remaining node as the master, but
it is impossible to tell the difference between the failure of a remote node and a mere loss of connectivity
between the nodes. If both nodes were capable of running independent elections, a loss of connectivity
would lead to a split-brain problem and therefore data loss. Elasticsearch avoids this and protects the
data by electing neither node as master until that node can be sure that it has the latest cluster state
and that there is no other master in the cluster. This could result in the cluster having no master until
connectivity is restored.
Without a master, the cluster is non-operational.
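The reasoning above can be illustrated with a minimal sketch. It assumes a simplified strict-majority voting model; the function name and node names are hypothetical, and this is not Elasticsearch's actual election code. It shows that when a two-node cluster is split by a connectivity loss, neither side holds a majority, so neither side can elect a master.

```python
def sides_able_to_elect(partition, cluster_size):
    """Return the sides of a network partition that hold a strict majority.

    Simplified model (an assumption for illustration): a side may elect a
    master only if it contains more than half of the master-eligible nodes.
    """
    quorum = cluster_size // 2 + 1  # strict majority of master-eligible nodes
    return [side for side in partition if len(side) >= quorum]


# Two-node cluster whose nodes can no longer reach each other.
two_node_split = [{"node-a"}, {"node-b"}]
print(sides_able_to_elect(two_node_split, cluster_size=2))  # prints []
```

Because two disjoint sides cannot both hold more than half of the nodes, this rule never yields two masters; the cost is that a two-node cluster may have no master at all until connectivity returns.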
Three-node cluster
- Each node is master-eligible so that any two of them can hold a master election without needing to communicate with the third node.
- This cluster will be resilient to the loss of any single node.
This is why we recommend that any production deployment default to 3 Data Nodes.
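The difference between the three cluster sizes can be worked out with a short calculation. This is a sketch under the assumption of a plain majority quorum over master-eligible nodes; the helper names are hypothetical and the model ignores details of how Elasticsearch manages its voting configuration.

```python
def quorum(master_eligible_nodes):
    """Smallest number of nodes that forms a strict majority."""
    return master_eligible_nodes // 2 + 1


def tolerated_failures(master_eligible_nodes):
    """Nodes that can be lost while a majority can still be formed."""
    return master_eligible_nodes - quorum(master_eligible_nodes)


for n in (1, 2, 3):
    print(f"{n} node(s): quorum={quorum(n)}, "
          f"tolerated failures={tolerated_failures(n)}")
# prints:
# 1 node(s): quorum=1, tolerated failures=0
# 2 node(s): quorum=2, tolerated failures=0
# 3 node(s): quorum=2, tolerated failures=1
```

Under this model, three is the smallest cluster size that survives the loss of a node, which matches the recommendation above.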
About hardware specifications:
- For virtualized appliances in VMware ESXi we recommend 2 x 1TB disks:
https://user.lastline.com/install-manuals/Data_Node_Installation_Manual.html#esxiinstallation
- For physical appliances we recommend adding 4 x 2TB disks:
https://user.lastline.com/install-manuals/Data_Node_Installation_Manual.html#hardware
- Installing 2 or 3 Data Nodes is recommended to balance the storage load and provide resiliency, according to our deployment considerations:
https://user.lastline.com/install-manuals/Data_Node_Installation_Manual.html#aboutdatnode
Additional references:
https://www.elastic.co/guide/en/elasticsearch/reference/current/high-availability-cluster-small-clusters.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-cluster.html