AIOps - oimetricpublisher service reporting status as Failure after upgrade to 21.3.1

Article ID: 229125



Products

DX Operational Intelligence DX Application Performance Management

Issue/Introduction

After upgrading the DX Platform to 21.3.1, the oimetricpublisher service reports a status of Failure.

Environment

DX OI and DX APM 20.2.x upgraded to 21.3.1

Cause

The oimetricpublisher service is no longer used. The Failure status is caused by a defect in the upgrade process (DE521161), which leaves the obsolete service registered.

The manual steps to remove the oimetricpublisher service(s) are documented below.

This defect will be fixed in the next release.

Resolution

In Kubernetes/OpenShift:

1) Scale to 0 or remove the deployment(s) for oimetricpublisher

a) Find deployments named like apmservices-oimetricpublisher-*:

kubectl get deployments -n <namespace> | grep apmservices-oimetricpublisher-

b) Set replicas to 0 or remove the deployments:

kubectl scale --replicas=0 deployment apmservices-oimetricpublisher-001 -n<namespace>
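
If oimetricpublisher was scaled to several instances, the two commands above can be combined into a loop. The following is a minimal sketch, not part of the official procedure. The function names are hypothetical, and it assumes kubectl is already configured for the cluster and that `-o name` prints resources in the form <kind>/<name>.

```shell
# Hypothetical helper: reads `kubectl get deployments -o name` output on stdin
# and keeps only the oimetricpublisher deployments.
select_oimetricpublisher() {
  grep '/apmservices-oimetricpublisher-' || true
}

# Hypothetical wrapper: scales every matching deployment in namespace $1 to 0.
scale_down_oimetricpublisher() {
  ns="$1"
  kubectl get deployments -n "$ns" -o name \
    | select_oimetricpublisher \
    | while read -r dep; do
        kubectl scale --replicas=0 "$dep" -n "$ns"
      done
}
```

Usage: scale_down_oimetricpublisher <namespace>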


2) Get the internal token: open Secrets, select apmservices-public, reveal the secret, and copy the value of apm.security.internalToken from bootstrap.properties

a) Get the value of bootstrap.properties (dxi is the namespace in this example):

kubectl get secret apmservices-public -n dxi -o yaml

apiVersion: v1
data:
  bootstrap.properties: <token>
 
....

b) Base64-decode the value to reveal the human-readable content, for example:

cat <<EOF | base64 --decode
<paste the long bootstrap.properties value from step a) here>
EOF

c) The result looks like the following. Copy the apm.security.internalToken value:

apm.security.publicKey=/apmservices.sec/systempublic.pem
apm.security.privateKey=
apm.security.supportabilityToken=<token>
apm.security.internalToken=<token>
apm.security.tenantServiceMasterToken=
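
Steps a) to c) can also be done in one pipeline. This is a minimal sketch under stated assumptions: extract_internal_token is a hypothetical helper name, and the secret is assumed to store the whole bootstrap.properties file as a single base64 value, as shown above.

```shell
# Hypothetical helper: reads the base64-encoded bootstrap.properties value
# on stdin and prints only the apm.security.internalToken value.
extract_internal_token() {
  base64 --decode | sed -n 's/^apm\.security\.internalToken=//p'
}
```

Usage (the backslash escapes the dot in the key name for kubectl's jsonpath):

kubectl get secret apmservices-public -n dxi -o jsonpath='{.data.bootstrap\.properties}' | extract_internal_token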


3) Get the URL of apmservices-gateway from the ingress (Kubernetes) or Routes (OpenShift), e.g. http://apmservices-gateway.<example.com>/

In kubernetes:  kubectl get ingress -ndxi | grep apmservices-gateway
In openshift:  kubectl get routes -ndxi | grep apmservices-gateway

 

On command line:

1) Get the partition registrations for oimetricpublisher from the tenants service, using the internal token.

Replace <URL> and <TOKEN> in the template below with the values collected above. The call lists all node registrations; focus on the oimetricpublisher entries in the response:

curl -L -X GET '<URL>/tenants/node/fetch' -H 'Authorization: Bearer <TOKEN>'


Example:

curl -L -X GET 'http://apmservices-gateway.<example.com>/tenants/node/fetch' -H 'Authorization: Bearer <token>'


Response:
{"values":[{"serviceId":"oimetricpublisher","instanceId":"<token>","partitionId":1},{"serviceId":"tas",...
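
When the response contains many registrations, the oimetricpublisher instance IDs can be filtered out on the command line. The sketch below is an assumption-laden convenience, not a JSON parser: list_oimetricpublisher_instances is a hypothetical name, and it relies on each registration being a flat object with no nested braces, as in the response above. If jq is installed, jq -r '.values[] | select(.serviceId=="oimetricpublisher") | .instanceId' is a more robust alternative.

```shell
# Hypothetical helper: reads the /tenants/node/fetch response on stdin and
# prints the instanceId of every oimetricpublisher registration, one per line.
list_oimetricpublisher_instances() {
  grep -o '{[^{}]*"serviceId"[[:space:]]*:[[:space:]]*"oimetricpublisher"[^{}]*}' \
    | sed -n 's/.*"instanceId"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

Usage: curl -s -L -X GET '<URL>/tenants/node/fetch' -H 'Authorization: Bearer <TOKEN>' | list_oimetricpublisher_instances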


2) For each registered partition of oimetricpublisher (there can be more than one if oimetricpublisher was scaled up to multiple instances), call the deregister endpoint.

Replace <URL>, <TOKEN> and <INSTANCE_ID> in the template below.

IMPORTANT: Be careful to delete only the oimetricpublisher registrations!

curl -L -X POST '<URL>/tenants/node/deregister' -H 'Accept: application/json' -H 'Content-Type: application/json' -H 'Authorization: Bearer <TOKEN>' -d '{"serviceId": "oimetricpublisher","instanceId": "<INSTANCE_ID>"}'

 

Example:

curl -L -X POST 'http://apmservices-gateway.<example.com>/tenants/node/deregister' -H 'Accept: application/json' -H 'Content-Type: application/json' -H 'Authorization: Bearer <token>' -d '{"serviceId": "oimetricpublisher","instanceId": "<token>"}'


Response:
{"deleted":true}
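
With several registrations, the deregister call can be wrapped in a small function and run once per instance ID. This is a sketch only: the function names are hypothetical, the serviceId is hard-coded to oimetricpublisher so that no other registration can be removed by mistake, and a trailing slash on the URL is stripped to avoid a double slash in the request path.

```shell
# Hypothetical helper: builds the request body for one instance ID.
deregister_payload() {
  printf '{"serviceId": "oimetricpublisher","instanceId": "%s"}' "$1"
}

# Hypothetical wrapper: $1 = gateway URL, $2 = internal token, $3 = instance ID.
deregister_instance() {
  url="$1"; token="$2"; instance_id="$3"
  curl -L -X POST "${url%/}/tenants/node/deregister" \
    -H 'Accept: application/json' \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer ${token}" \
    -d "$(deregister_payload "$instance_id")"
}
```

Usage: deregister_instance '<URL>' '<TOKEN>' '<INSTANCE_ID>', repeated for each oimetricpublisher instance ID. Each call should return {"deleted":true}.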


3) Finally, run the curl command from step 1 again to confirm that no oimetricpublisher node is registered anymore

4) Log in to Cluster Management and verify that the oimetricpublisher services have been removed from the UI
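
The check in step 3 can be scripted instead of read by eye. A minimal sketch, assuming the response format shown earlier; no_oimetricpublisher_left is a hypothetical helper name.

```shell
# Hypothetical helper: reads the /tenants/node/fetch response on stdin and
# succeeds (exit status 0) only if no oimetricpublisher registration remains.
no_oimetricpublisher_left() {
  ! grep -q '"serviceId"[[:space:]]*:[[:space:]]*"oimetricpublisher"'
}
```

Usage: curl -s -L -X GET '<URL>/tenants/node/fetch' -H 'Authorization: Bearer <TOKEN>' | no_oimetricpublisher_left && echo "no oimetricpublisher registrations left"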

Additional Information

https://knowledge.broadcom.com/external/article/190815/dx-aiops-troubleshooting-common-issues-a.html

 

Note:

The oimetricpublisher component was used for dspintegrator processing, which has now been replaced by the 'anomalydetection' component. Since the dspintegrator is no longer included in 21.3, the oimetricpublisher component is no longer needed either.