
DX APM - UMA - Troubleshooting and Common issues


Article ID: 212472


Products

DX Application Performance Management
CA Application Performance Management SaaS
INTROSCOPE
CA Application Performance Management Agent (APM / Wily / Introscope)

Issue/Introduction

The following is a high-level list of techniques and suggestions to employ when troubleshooting UMA performance, display, and configuration issues.

 

 

Cause

 

 

Environment

DX APM Agent 20.2, SaaS

Resolution

A) Common issues


USE-CASE#1: Agent metric clamp reached

Open Metric View

Go to the Supportability Metrics: Custom Metric Host (Virtual) | Custom Metric Process (Virtual) | Custom Metric Agent (Virtual) | Agents

Locate your agent

Check if "Is Clamped" for agents is = 1 for UMA nodes, you can find below an example.

Check also the Metric Count
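
If the clamp is reached, the usual remediation is to reduce the number of reported metrics or raise the clamp. The following is a minimal sketch only: it assumes the clamp property for your agent version is introscope.agent.metricClamp and that UMA accepts agent profile properties as apmenv_ environment variables with underscores in place of dots (the same convention as the autoattach overrides shown later in this article). Verify the exact property name for your release before applying it.

        # Assumed mapping: apmenv_introscope_agent_metricClamp -> introscope.agent.metricClamp
        - name: apmenv_introscope_agent_metricClamp
          value: "10000"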

USE-CASE#2: We are able to get clusterinfo, nodes, and other performance metrics, but we are not able to get any performance data for the Java application

Checklist:

From the app-container-monitor pod log:

1) Check for possible memory issues

[INFO] [IntroscopeAgent.AutoAttach.Java.UnixDockerAttacher] Not enough free memory available on host to attach to unbounded container [ namespace/digital-factory-uat pod/logstash-sync-db-to-elk-5cc7445d7d-zxrd6 container/logstash-sync-db-to-elk id/56f7bfa2617f2e0f26aa72c8906ee95de9db813c9571ba272b689ae0d87ea310 ]. Skipping attach

identified as Java process in container [ namespace/digital-factory-uat pod/elasticsearch-master-2 container/elasticsearch id/a3004d2bdddd9e246dc9109614f4441d7ddff98716e73e0470afb9cfc2979a49 ]
3/23/21 09:00:19 AM GMT [INFO] [IntroscopeAgent.AutoAttach.Java.UnixDockerAttacher] Container a3004d2bdddd9e246dc9109614f4441d7ddff98716e73e0470afb9cfc2979a49 has lesser memory than configured free memory threshold of 50.0%, Skipping attach

Recommendation:

Lower the default free memory threshold from 50% to 25% by changing the value of the environment variable shown below:
        - name: apmenv_autoattach_free_memory_threshold
          value: "25.00"

When using the Operator you cannot change anything on the UMA side (the Operator reverts the change). In this case, set the annotation at the application pod or deployment level as shown below:

oc annotate pod <pod-name> ca.broadcom.com/autoattach.java.attach.overrides=autoattach.free.memory.threshold=20 -n <app-ns> --overwrite
oc annotate deployment <deployment-name> ca.broadcom.com/autoattach.java.attach.overrides=autoattach.free.memory.threshold=20 -n <app-ns> --overwrite
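
To confirm that the annotation was applied, you can read it back from the pod or deployment (generic oc commands with placeholder names, not taken from the article above):

# Verify the autoattach override annotation on the pod and on the deployment
oc get pod <pod-name> -n <app-ns> -o yaml | grep autoattach
oc get deployment <deployment-name> -n <app-ns> -o yaml | grep autoattach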


2) Check for a possible unsupported JVM

[INFO] [IntroscopeAgent.AutoAttach.Java.UnixDockerAttacher] Process 1 in container [ namespace/digital-factory-uat pod/payment-jobs-7dcd9fd4fc-klm9r container/payment id/636dbbc6da32ff484798b67760810b05027907e190a4d72cc72876d08756472e ] is an unsupported JVM. Skipping attach. JVMInfo: JVMInfo{ binaryPath='/usr/lib/jvm/java-1.8-openjdk/jre/bin/java', vendorName='IcedTea', vmName='OpenJDK 64-Bit Server VM', runtimeVersion='1.8.0_212-b04', specificationVersion='8' }

Recommendation:

Add the environment variable below to the podmonitor container (in the same section where the memory threshold variable above is set). This makes UMA attempt to attach the Java agent to containers that run unsupported JVMs. A sketch of the resulting env section follows the annotation commands below.

        - name: apmenv_autoattach_java_filter_jvms
          value: "false"

When using the Operator you cannot change anything on the UMA side (the Operator reverts the change). In this case, set the annotation at the application pod or deployment level as shown below:

oc annotate pod <pod-name> ca.broadcom.com/autoattach.java.attach.overrides=autoattach.java.filter.jvms=false -n <app-ns> --overwrite
oc annotate deployment <deployment-name> ca.broadcom.com/autoattach.java.attach.overrides=autoattach.java.filter.jvms=false -n <app-ns> --overwrite
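
For reference, a minimal sketch of how the env section of the app-container-monitor (podmonitor) container could look with both overrides in place; the container name and surrounding fields are assumptions, not an excerpt from a shipped UMA manifest:

      containers:
        - name: podmonitor
          env:
            # Attach even when container free memory is below the 50% default
            - name: apmenv_autoattach_free_memory_threshold
              value: "25.00"
            # Attempt attach on JVMs that UMA does not list as supported
            - name: apmenv_autoattach_java_filter_jvms
              value: "false"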


3) Check whether the Java agent cannot be injected because of a permission issue

A non-root user is not able to create a new directory in the pod to copy the Java agent into.

Recommendation:

"exec" into the container and then create a folder like /opt (or anything else) and then use the below annotation so Java agent is deployed in that folder:

kubectl annotate pod <application podname> ca.broadcom.com/autoattach.java.attach.overrides=autoattach.java.agent.deps.directory=/opt

If that works, modify your Docker application image(s) to include a writable directory that UMA can use to inject the agent.
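
Putting the steps together, a sketch of the workaround with placeholder names (the -n and --overwrite flags are additions for completeness):

# Create a writable directory inside the running container
kubectl exec -it <application podname> -n <app-ns> -- mkdir -p /opt

# Point UMA at that directory for the Java agent dependencies
kubectl annotate pod <application podname> -n <app-ns> ca.broadcom.com/autoattach.java.attach.overrides=autoattach.java.agent.deps.directory=/opt --overwrite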


4) Check if the issue is related to Java itself

9/07/21 06:33:09 AM GMT [INFO] [IntroscopeAgent.AutoAttach.Java.UnixDockerEnricher] Process 1 in container [ namespace/tams-test pod/tams--437- id/8cc8bc8221f6e0aa15873ea8e582158c083a4a1781e7295c3ede34ea7d2e6f7f ] could not get jvm information. Skipping attach

Recommendation:

"exec" into the pod and try to execute java to make sure it runs successfully (see the sketch after this step). A failure to run java is the reason for the message above and explains why the Java agent could not be added to the container.

In this specific case the solution was to remove the JAVA_TOOL_OPTIONS environment variable. Contact your application team to fix such Java issues.
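
A quick way to check from outside the pod, with placeholder names:

# Run java inside the container to confirm it starts cleanly
kubectl exec -it <application podname> -n <app-ns> -- java -version

# If java fails, check whether JAVA_TOOL_OPTIONS is set and is causing the failure
kubectl exec -it <application podname> -n <app-ns> -- env | grep JAVA_TOOL_OPTIONS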

 

USE-CASE#3: Missing metrics from UMA agent

Checklist:

1)

20-05-2021 10:37:56 [pool-5-thread-8] ERROR c.c.a.b.s.OpenshiftClusterCrawlerService.watchDeploymentConfigs - error occurred in watchDeploymentConfigs, null

Exception in thread "OkHttp Dispatcher" java.lang.OutOfMemoryError: unable to create new native thread
     at java.lang.Thread.start0(Native Method)
     at java.lang.Thread.start(Thread.java:717)

Solution:

Insufficient memory was allocated to the clusterinfo Java process. Increase the maximum heap to 1024m, as shown in the line below, in the UMA YAML file and redeploy UMA. The line below is part of the clusterinfo deployment configuration in the YAML file. Increasing the memory should resolve the issue.

command: ["/usr/local/openshift/apmia/jre/bin/java", "-Xms64m","-Xmx1024m", "-Dlogging.config=file:/usr/local/openshift/logback.xml", "-jar", "/clusterinfo-1.0.jar"]
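
After redeploying, you can confirm that the clusterinfo pod picked up the new heap setting (the caapm namespace is assumed here, as in the log-collection commands further below):

# Locate the clusterinfo pod and check the -Xmx value in its container command
kubectl -n caapm get pods | grep clusterinfo
kubectl -n caapm get pod <clusterinfo-pod-name> -o yaml | grep Xmx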


2) 

oc logs pod/container-monitor-7dcdbc5fb8-6vcvq

5/21/21 12:48:20 PM UTC [ERROR] [IntroscopeAgent.GraphSender] error occurred while sending graph to EM, null
java.lang.NullPointerException
     at com.ca.apm.clusterdatareporter.K8sMetaDataGraphAttributeDecorator.getGraph(K8sMetaDataGraphAttributeDecorator.java:107)
     at java.lang.Iterable.forEach(Iterable.java:75)
     at com.ca.apm.clusterdatareporter.K8sMetaDataGraphAttributeDecorator.getGraph(K8sMetaDataGraphAttributeDecorator.java:93)
     at com.ca.apm.clusterdatareporter.K8sMetadataGraphHelper$GraphSender.sendGraph(K8sMetadataGraphHelper.java:601)
     at com.ca.apm.clusterdatareporter.K8sMetadataGraphHelper$GraphSender.sendGraphInBatches(K8sMetadataGraphHelper.java:580)
     at com.ca.apm.clusterdatareporter.K8sMetadataGraphHelper$GraphSender.lambda$1(K8sMetadataGraphHelper.java:548)
     at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
     at java.lang.Thread.run(Thread.java:748)

Reason:

If you are using a 10.7 EM, this error can be ignored; there is no loss of functionality.

 

3) The error below is reported continuously, every 2 minutes:

5/25/21 08:48:20 AM UTC [ERROR] [IntroscopeAgent.GraphSender] error occurred while sending graph to EM, null
java.lang.NullPointerException

Solution:

If you are using SaaS APM, you need to set agentManager_version to an empty value (i.e. ""). You can do this by setting the parameter agentManager_version: "" in the YAML.

NOTE: If you are using an on-premise APM EM 10.7, you need to set agentManager_version to "10.7", as shown in the example below. This is required to allow UMA to connect to APM EM 10.7. This property is the equivalent of "introscope.agent.connection.compatibility.version" in the Java agent.
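
A minimal sketch of both variants of the parameter in the UMA YAML:

# SaaS APM: leave the version empty
agentManager_version: ""

# On-premise APM EM 10.7: pin the compatibility version
agentManager_version: "10.7"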

 

USE-CASE#4: Liveness probe failed: find: /tmp/apmia-health/extensions/Docker-health.txt: No such file or directory

Symptoms: you do not see any problem with connected agents, but you notice that many app-container-monitor pods are reporting the message above.

This is a known issue fixed in UMA 21.4 and later versions.

Recommendation: upgrade to UMA 21.4 or later.

 

B) What diagnostic files should I gather for Broadcom Support?

Collect logs from the following pods:

- app-container-monitor-* (for each node)
- cluster-performance-prometheus-*
- clusterinfo-*
- container-monitor-*

Here is an example of the commands (if you are using OpenShift, you can use the "oc" command):

kubectl logs <app-container-pod-name> --all-containers -n caapm
kubectl logs <cluster-performance-prometheus-pod-name> --all-containers -n caapm
kubectl logs <clusterinfo-pod-name> --all-containers -n caapm
kubectl logs <container-monitor-pod-name> --all-containers -n caapm

NOTE: If the issue is related to the Java agent not getting injected as expected, the most important log to collect is from the app-container-monitor pod. There is one app-container-monitor pod on each Kubernetes node, so make sure to collect the log from the node where the issue is happening (the node where your application pod is running); the sketch below shows one way to find it.
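
One way to find the correct app-container-monitor pod, assuming UMA runs in the caapm namespace as in the commands above:

# Find the node your application pod is running on
kubectl get pod <application podname> -n <app-ns> -o wide

# List the UMA monitoring pods with their nodes and pick the app-container-monitor pod on that same node
kubectl get pods -n caapm -o wide | grep app-container-monitor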

Additional Information

https://techdocs.broadcom.com/us/en/ca-enterprise-software/it-operations-management/dx-apm-saas/SaaS/implementing-agents/Universal-Monitoring-Agent/Install-the-Universal-Monitoring-Agent/Install-and-Configure-UMA-for-Kubernetes,-Google-Kubernetes-Engine,-and-Azure-Kubernetes-Service-Monitoring.html

Attachments