Pivotal Cloud Foundry Metrics - "Elasticsearch master reports indices with red status"

Article ID: 293535

Products

Operations Manager

Issue/Introduction

Symptoms:

After upgrading the PCF Metrics tile from 1.2 to 1.3.x, the smoke tests fail.

`curl localhost:9200/_cluster/health?pretty` reports a cluster health status of red.

Error Message:

Elasticsearch flow
/tmp/build/8bf04831/metrics-app-dev-release-bumped/src/github.com/pivotal-cf/metrics-data/cmd/smoke_tests/elasticsearch_test.go:52
Ingests logs from firehose into elasticsearch [It]
/tmp/build/8bf04831/metrics-app-dev-release-bumped/src/github.com/pivotal-cf/metrics-data/cmd/smoke_tests/elasticsearch_test.go:51
Never received app logs - something in the firehose -> elasticsearch flow is broken

Summarizing 1 Failure:
[Fail] Elasticsearch flow [It] Ingests logs from firehose into elasticsearch 
/tmp/build/8bf04831/metrics-app-dev-release-bumped/src/github.com/pivotal-cf/metrics-data/cmd/smoke_tests/elasticsearch_test.go:50

Environment

Cause

The Elasticsearch indices had been created and were still replicating when the upgrade ran, which left them in a corrupt state.

Resolution

The procedure in this KB is a last resort. Use it only if you cannot clear the Elasticsearch master "red" status by restarting the app or by following the other steps outlined in https://docs.pivotal.io/pcf-metrics/1-3/troubleshooting.html#smoke-test

DO NOT perform this procedure if many or all of the indices are in "red" status. It is meant for the case where only a few indices are corrupt and stuck in "red" status.
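
One way to gauge how many indices are affected before going further is to count them. The following is a minimal sketch, assuming the `_cat/indices?v` column layout shown in step 2 (health in column 1). `count_red` is a hypothetical helper name, not part of PCF Metrics or Elasticsearch, and this article does not define a threshold for "many"; that judgment is left to the operator.

```shell
# Hypothetical helper: summarize index health.
# Reads `_cat/indices?v` output on stdin and counts rows whose first
# column is a health value; the header row ("health ...") is ignored.
count_red() {
  awk '
    $1 == "green" || $1 == "yellow" || $1 == "red" { total++ }
    $1 == "red" { red++ }
    END { printf "%d red of %d indices\n", red, total }
  '
}

# On the elasticsearch_master node this could be used as:
#   curl -s 'localhost:9200/_cat/indices?v' | count_red
```

If the red count is a large share of the total, stop and do not continue with this procedure.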

Perform the following steps:

1.) SSH to the elasticsearch_master node:

$ bosh ssh elasticsearch_master/0

2.) Identify the indices with status red:

$ curl 'localhost:9200/_cat/indices?v' | sort

green open app_logs_1504677600 1 1 209948 0 35.8mb 17.9mb
green open app_logs_1504699200 1 1 0 0 318b 159b
green open app_logs_1504785600 1 1 0 0 318b 159b
green open app_logs_1504807200 1 1 0 0 318b 159b
health status index pri rep docs.count docs.deleted store.size pri.store.size
red open app_logs_1504720800 1 1
red open app_logs_1504742400 1 1
red open app_logs_1504764000 1
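
Because `sort` places the header row between the green and red rows, the listing can be hard to scan. As an alternative, the red rows can be filtered directly. This is a minimal sketch, assuming the column order shown above (health in column 1, index name in column 3). `red_indices` is a hypothetical helper name.

```shell
# Hypothetical helper: print only the names of indices with health "red".
# Reads `_cat/indices?v` output on stdin.
red_indices() {
  awk '$1 == "red" { print $3 }'
}

# On the elasticsearch_master node this could be used as:
#   curl -s 'localhost:9200/_cat/indices?v' | red_indices
```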

3.) Delete the indices with status red:

$ curl -XDELETE http://localhost:9200/app_logs_1504720800

Note: This has the potential to delete application log data. Do not run it if these logs are critical.
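
When more than one index is red, the delete command in step 3 has to be repeated for each one. The following is a minimal sketch of a loop that does this, assuming the index names are supplied one per line. `delete_indices` and the `CONFIRM` variable are hypothetical names introduced for illustration. By default it only prints what it would run, so the list can be reviewed first.

```shell
# Hypothetical helper: reads index names on stdin, one per line.
# Dry run by default: prints the DELETE command for each index.
# Set CONFIRM=yes to actually send the requests to Elasticsearch.
delete_indices() {
  while IFS= read -r index; do
    # Skip empty lines so a stray blank never produces a bare URL.
    [ -n "$index" ] || continue
    if [ "${CONFIRM:-no}" = "yes" ]; then
      curl -s -XDELETE "http://localhost:9200/${index}"
      echo
    else
      echo "would run: curl -XDELETE http://localhost:9200/${index}"
    fi
  done
}

# Example dry run with the red indices identified in step 2:
#   printf '%s\n' app_logs_1504720800 app_logs_1504742400 | delete_indices
```

Keeping the dry run as the default is a deliberate choice here, since a deleted index cannot be restored by this procedure.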
