Upgraded to 23.3 one week ago.
The apmservices-nass pod description shows the container was terminated with reason OOMKilled.
apmservices-nass-01 pod restarted after 6 hours -- events do not indicate Liveness/Readiness probe failures.
apmservices-nass-02 pod restarted after 13 hours but events indicate Liveness/Readiness probe failures.
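To separate OOM kills from probe-driven restarts, the last termination reason can be read from the pod status. A minimal Python sketch, assuming the input is the JSON from `kubectl get pod <name> -o json`; the helper name `oom_killed_containers`, the container name `nass`, and the sample data are hypothetical and for illustration only:

```python
import json


def oom_killed_containers(pod):
    """Return names of containers whose last termination reason was OOMKilled."""
    names = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        # lastState.terminated is absent if the container never restarted.
        terminated = cs.get("lastState", {}).get("terminated") or {}
        if terminated.get("reason") == "OOMKilled":
            names.append(cs["name"])
    return names


# Illustrative sample only, not output from the affected cluster.
sample = json.loads("""
{
  "status": {
    "containerStatuses": [
      {
        "name": "nass",
        "restartCount": 1,
        "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}
      }
    ]
  }
}
""")

print(oom_killed_containers(sample))
```

A pod restarted only because of failed liveness probes would normally not report OOMKilled as the reason here, so an empty list for apmservices-nass-02 would point at the probes instead.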
Various other pods report errors:
ERROR c.c.a.r.nass.NassReactiveClientBase - com.ca.apm.common.api.ServicesException: 500,2102,-: No instances for partition (nass, 2), -
com.ca.apm.common.api.ServicesException: 500,2102,-: No instances for partition (nass, 2), -
Increasing the pod memory limit and changing the node OS kernel parameter for Transparent Huge Pages (THP) to “madvise” has helped.
With the memory limit left at the default, the OOMKilled terminations occurred more frequently.
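To verify the THP setting on a node, the active mode can be read from the kernel's sysfs file, where the active value is the one in square brackets. A minimal Python sketch, assuming a Linux node that exposes `/sys/kernel/mm/transparent_hugepage/enabled`; the helper name `active_thp_mode` is hypothetical:

```python
from pathlib import Path

# Standard sysfs location of the THP mode on Linux.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(text):
    """Return the bracketed (active) mode, e.g. 'madvise' from 'always [madvise] never'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


if __name__ == "__main__":
    if THP_ENABLED.exists():
        print(active_thp_mode(THP_ENABLED.read_text()))
    else:
        print("THP sysfs file not found on this host")
```

Writing `madvise` to the same file as root changes the mode at runtime; how to make the change persist across reboots depends on the node OS and is not covered here.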