This article provides the steps to resolve a problem where the Elasticsearch service on a Data Node that is part of a cluster will not start because index and alias names are not unique.
Symptoms:
1. The automatic Lastline Test Appliance component check reports an error condition with the message "Unable to get Elasticsearch cluster status" in the Data Node appliance Monitoring Logs.
In the other scenario ("DanglingIndicesState"), the Elasticsearch status is reported as yellow and the output of lastline_test_appliance contains:
output: > SOFTWARE:
output: > WARNING: The Elasticsearch cluster status is yellow. All primary shards are active, but not all replica shards are active: performance and reliability may be degraded.
The same warning appears in the Monitoring Logs.
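To check the cluster status directly from the Data Node command line, a query such as the following can be used (shown as an example; the local endpoint and port are assumptions based on the _cat/nodes command used later in this article):
curl -s 'localhost:9200/_cluster/health?pretty'
The "status" field in the response reports green, yellow, or red.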
2. The Elasticsearch log file /var/log/elasticsearch/lldns/lldns.log contains entries like the ones below; the entries are long, so make sure you view each one in full:
java.lang.IllegalStateException: index and alias names need to be unique, but the following duplicates were found [.kibana (alias of [.kibana_2/fD7MMzp-RhupZGr1nq3P-g])] at org.elasticsearch.cluster.metadata.MetaData$Builder.build(MetaData.java:1118) ~[elasticsearch-6.8.9.jar:6.8.9]
or
[WARN ][o.e.g.DanglingIndicesState] [sderums7632-lldns] [[.kibana-reindexed-v6-reindexed-v6-reindexed-v6-reindexed-v6-reindexed-v6-reindexed-v6-reindexed-v6-reindexed-v6-reindexed-v6-reindexed-v6-reindexed-v6-reindexed-v6-reindexed-v6-reindexed-v6/rEtNOvZFRIWCgqtxpQPm7Q]] can not be imported as a dangling index, as index with same name already exists in cluster metadata
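Because these entries are very long, it can help to open the file in a pager that does not wrap lines, for example (a generic suggestion, not a Lastline-specific tool):
less -S /var/log/elasticsearch/lldns/lldns.log
Use the arrow keys to scroll horizontally and press q to quit.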
Note: The Data Node itself may be in the OK (green) state on the Appliance Overview page in the Portal.
A. Data Node Appliances
Execute the following steps on the Data Node you suspect to be affected by this issue, as the root user (via sudo su).
1. Determine if the system is in this state by searching for the error string in the elasticsearch log file.
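For example, a search such as the following covers both error variants shown above (the pattern is only a suggestion; adjust it to the entry you are seeing):
grep -E 'IllegalStateException|DanglingIndicesState' /var/log/elasticsearch/lldns/lldns.log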
2. In the output from the step above, take note of the string following the "/" character (in our case, fD7MMzp-RhupZGr1nq3P-g): it identifies a directory in the Elasticsearch storage.
3. Move the directory identified in step 2 from the Elasticsearch storage to the /tmp directory, for example as shown below:
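Example (the find command is a generic way to locate the directory; the actual storage path depends on the path.data setting in the Elasticsearch configuration of the appliance and is not assumed here):
find / -xdev -type d -name 'fD7MMzp-RhupZGr1nq3P-g' 2>/dev/null
mv <directory returned by the find command> /tmp/
Replace the UUID with the one noted in step 2 for your system.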
5. Identify the kibana index/indices
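For example, a query such as the following lists the Kibana index names (the .kibana* pattern and the h=index column selection are given only as an example):
curl -s 'localhost:9200/_cat/indices/.kibana*?h=index'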
Sample output:
.kibana_1
Sample output for DanglingIndicesState / duplicate indices
Execute command on the data node: curl -s localhost:9200/_cat/nodes
Sample output:
10.31.44.01 15 86 58 2.57 2.60 2.43 mdi * datanode01
10.31.44.02 15 86 58 2.57 2.60 2.43 mdi * datanode02
B. Manager Appliance
Execute the following steps on the Manager that controls the affected Data Nodes, as the root user (via sudo su).
If you have more than one Manager appliance in your account, you first need to identify which Manager the Data Nodes belong to. Copy the license of the Data Node and paste it into the Quick Search box above the Appliances page in your Portal. The Manager and the two Data Nodes associated with that Manager will appear in the list (assuming you have the minimum two-Data-Node cluster configured). This is the Manager on which you should execute the following steps.
Workaround: None
Without the Elasticsearch service running, data will not be indexed by this Data Node.