One of our collectors keeps dying. The process starts successfully and runs for a few minutes, then crashes. There are no errors in the logs before the crash. I have tried starting it with both EMCtrl.sh and Watchdog.sh. With Watchdog, restarts fail because the port is reported as in use, yet netstat shows that port 5001 is not in use.
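As a cross-check on the netstat result, one option is to try binding to the port directly. The following is a minimal sketch, not part of the product tooling; the function name `port_is_free` is hypothetical, and the port number 5001 is taken from the description above.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            # bind() raises OSError (typically EADDRINUSE) when another
            # socket already holds the address, or when the caller lacks
            # permission to bind to it.
            probe.bind((host, port))
        except OSError:
            return False
        return True


if __name__ == "__main__":
    # 5001 is the port mentioned in the problem description.
    print("port 5001 free:", port_is_free(5001))
```

If this reports the port as free while Watchdog still says it is in use, the port itself is probably not the real cause and the host's resources are worth checking next.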
Environment
Release : 10.7.0
Component : APM Agents
Resolution
Our Linux admin found the cause. A Satellite process was using up all of the swap space because of some communication between it and the Satellite server. He restarted the service, which freed the swap space. We then restarted the controller, and it has been running for over an hour. We will continue to monitor, but this appears to have corrected the issue.
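To spot this kind of swap exhaustion earlier, the swap totals and the largest per-process swap users can be read from `/proc` on Linux. The sketch below assumes a Linux host that exposes `/proc/meminfo` and the `VmSwap` field in `/proc/<pid>/status`; the function names are hypothetical and it is not part of the product.

```python
from pathlib import Path


def parse_kb_field(text: str, field: str) -> int:
    """Return the kB value from a 'Field:   123 kB' line, or 0 if absent."""
    for line in text.splitlines():
        if line.startswith(field + ":"):
            return int(line.split()[1])
    return 0


def parse_name(text: str) -> str:
    """Return the process name from the 'Name:' line of a status file."""
    for line in text.splitlines():
        if line.startswith("Name:"):
            return line.split(":", 1)[1].strip()
    return "?"


def swap_summary(meminfo_text: str) -> tuple[int, int]:
    """Return (total_kb, used_kb) computed from /proc/meminfo content."""
    total = parse_kb_field(meminfo_text, "SwapTotal")
    free = parse_kb_field(meminfo_text, "SwapFree")
    return total, total - free


def top_swap_users(proc_root: str = "/proc", limit: int = 5):
    """Return up to `limit` (swap_kb, pid, name) tuples, largest first."""
    usage = []
    for status in Path(proc_root).glob("[0-9]*/status"):
        try:
            text = status.read_text()
        except OSError:
            continue  # process exited or is not readable by this user
        swap_kb = parse_kb_field(text, "VmSwap")
        if swap_kb > 0:
            usage.append((swap_kb, int(status.parent.name), parse_name(text)))
    return sorted(usage, reverse=True)[:limit]


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():
        total_kb, used_kb = swap_summary(meminfo.read_text())
        print(f"swap used: {used_kb} kB of {total_kb} kB")
        for swap_kb, pid, name in top_swap_users():
            print(f"{swap_kb:>10} kB  pid {pid}  {name}")
```

Run as a user that can read other processes' status files (typically root); otherwise some processes may be skipped silently.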