The deployment agent is constantly restarting.
It is unclear how to debug why this is happening and whether the parameters in use need tuning.
The container logfile doesn't show anything obvious and gets overwritten each time the container restarts.
Details below:
sh-4.2$ uname -a
Linux container-monitor-d5dcbbc66-dw2n7 3.10.0-1160.31.1.el7.x86_64 #1 SMP Wed May 26 20:18:08 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
The pod has restarted 99 times in the 9 days since it was created:
container-monitor-d5dcbbc66-dw2n7 1/1 Running 99 9d 22.249.87.61
These are the parameters used (excerpt):
Restart Count: 99
Limits:
  cpu: 2
  memory: 1G
Requests:
  cpu: 200m
  memory: 300Mi
Liveness: http-get http://:8888/healthz delay=60s timeout=1s period=60s #success=1 #failure=3
Environment:
  MIN_HEAP_VAL_IN_MB: 512
  MAX_HEAP_VAL_IN_MB: 1024
sh-4.2$ ps -ef | grep -i wily
uma 58 55 99 08:16 ? 02:10:18 /usr/local/openshift/apmia/jre/bin/java -server -classpath /usr/local/openshift/apmia/lib/* -Xms512m -Xmx1024m -XX:ErrorFile=logs/jvm_error.%p.log -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=logs/ com.wily.introscope.agent.uma.UnifiedMonitoringAgent
uma 917 899 0 10:22 ? 00:00:00 grep -i wily
Suggested diagnostic steps (all of this information can be shared with support when raising a case):
Run oc get all to get background on the deployment.
Focus on pods that have restarted many times, and note the exact container name for use in subsequent commands:
pod/clusterinfo-6f756ccd5c-
pod/container-monitor-
pod/app-container-monitor-
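Once a pod name is known, the state of the previous container instance can be collected before the next restart hides it. The following is a minimal sketch, not an official procedure: the function name collect_restart_info is hypothetical, and it assumes the agent writes useful output to stdout/stderr so that oc logs --previous can retrieve it.

```shell
#!/bin/sh
# Hypothetical helper: gather restart diagnostics for a single pod.
# Usage: collect_restart_info <pod-name> [container-name]
collect_restart_info() {
    pod="$1"
    container="${2:-}"

    # Events, last container state, exit code and probe settings.
    oc describe pod "$pod"

    # Output of the previous container instance (the one that was restarted).
    if [ -n "$container" ]; then
        oc logs "$pod" -c "$container" --previous
    else
        oc logs "$pod" --previous
    fi

    # Recorded termination reason of the previous instance, if any.
    oc get pod "$pod" -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'
}

# Example call (pod name taken from the output above):
# collect_restart_info container-monitor-d5dcbbc66-dw2n7 > restart-info.txt 2>&1
```

A termination reason of OOMKilled would point at the container memory limit, while liveness probe failures in the events would point at the probe settings instead.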
Release : 20.2
The memory limit of the container and the heap of the agent process inside it were both raised:
Limits:
  cpu: 2
  memory: 2G
Environment:
  MIN_HEAP_VAL_IN_MB: 1024
  MAX_HEAP_VAL_IN_MB: 2048
This greatly reduced the number of restarts. More heap can be allocated if it is available or desired.
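When choosing these values it may help to compare the container memory limit with the maximum heap in bytes. In Kubernetes quantities the suffix G is decimal (10^9 bytes) and Gi is binary (2^30 bytes), whereas the JVM option -Xmx1024m means 1024 x 2^20 bytes. The sketch below only does that arithmetic for the settings shown in this article. The function names are hypothetical, only a few suffixes are handled, and the result does not by itself confirm the cause of the restarts.

```shell
#!/bin/sh
# Convert a Kubernetes memory quantity to bytes (handles Gi, Mi, G, M only).
to_bytes() {
    case "$1" in
        *Gi) echo $(( ${1%Gi} * 1024 * 1024 * 1024 )) ;;
        *Mi) echo $(( ${1%Mi} * 1024 * 1024 )) ;;
        *G)  echo $(( ${1%G} * 1000 * 1000 * 1000 )) ;;
        *M)  echo $(( ${1%M} * 1000 * 1000 )) ;;
        *)   echo "$1" ;;   # assume a plain byte count
    esac
}

# Print (container limit - max heap) in bytes.
# $1 = memory limit, e.g. 1G; $2 = MAX_HEAP_VAL_IN_MB, which becomes -Xmx<N>m.
heap_headroom() {
    limit_bytes=$(to_bytes "$1")
    heap_bytes=$(( $2 * 1024 * 1024 ))
    echo $(( limit_bytes - heap_bytes ))
}

heap_headroom 1G 1024   # original settings: prints -73741824
heap_headroom 2G 2048   # raised settings:   prints -147483648
```

A negative result means the maximum heap alone is larger than the container limit, before any non-heap JVM memory is counted. If that is a concern, a limit written with Gi, or a lower MAX_HEAP_VAL_IN_MB, leaves more headroom.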