Console Slowness / Disconnected Agents After Replacing Agent SSL Certificate

Article ID: 286506

Products

Carbon Black App Control (formerly Cb Protection)

Issue/Introduction

  • Agent Communication Certificate (System Configuration > Security) recently replaced
  • Console Slowness
  • Unstable Agent Connections
  • Server High Debug Logs show frequent hits for "METHOD_SECURE_MESSAGE":
    ParseMessageHeader(): [action: METHOD_SECURE_MESSAGE] Session GUID From Message Header [0000-00-00]
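
To gauge how widespread the symptom is, the hits can be counted in an exported log. The following is a minimal sketch: it matches only the marker text quoted above, and the function names and log path are hypothetical, since the log location and line layout depend on the environment.

```python
import re

# Marker text taken from the log line quoted above. Nothing else about the
# log format (timestamps, prefixes) is assumed.
SECURE_MSG = re.compile(r"\[action: METHOD_SECURE_MESSAGE\]")


def count_secure_message_hits(lines):
    """Count the lines that contain the METHOD_SECURE_MESSAGE marker."""
    return sum(1 for line in lines if SECURE_MSG.search(line))


def count_hits_in_file(path):
    """Count marker hits in a log file; `path` is supplied by the caller."""
    # errors="replace" keeps the scan going if the log has stray bytes.
    with open(path, encoding="utf-8", errors="replace") as fh:
        return count_secure_message_hits(fh)
```

A count that climbs quickly right after the staging period ends would be consistent with many Agents still lacking the new TrustedCertList.pem file.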

Environment

  • App Control Server: 8.7.0 - 8.10.0
  • App Control Agent: 8.7 and Higher

Cause

When a new Agent Communication Certificate is generated or imported, it is staged for 60 minutes by default (Server 8.9.4-8.10.0). During this time, Agents continue communicating with the Server using the old, trusted certificate, but are told by the Server to download the new one (the TrustedCertList.pem file).

Once the staging period elapses, the Server marks the new Communication Certificate as active. Agents that have already downloaded it continue communicating without any issue. Agents that have not downloaded it must have their traffic manually decrypted by the Server (METHOD_SECURE_MESSAGE) until they receive the updated file. This manual decryption is resource intensive, and having many Agents on this route can cause extreme Console slowness and Agent connection failures.

Resolution

Server 8.10.2 (EP-20097) handles this situation more gracefully, and upgrading should reduce the chance of it occurring.

If you are currently encountering this issue, there are two known workarounds:

  1. Manually import the TrustedCertList.pem file on each device.
  2. Throttle the Agent traffic (port 41002) to limit the number of connected devices, gradually increasing and allowing more subnets as they receive the TrustedCertList.pem file.
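
The gradual rollout in the second workaround can be planned ahead of time. The sketch below only computes cumulative allow-lists of subnets, one per wave. The function name, the subnets and the wave size are illustrative assumptions, and how the allow-list is enforced on port 41002 (firewall, load balancer, etc.) is environment specific and not shown.

```python
import ipaddress


def rollout_waves(subnets, per_wave):
    """Return cumulative allow-lists, adding `per_wave` subnets each wave.

    Each wave contains every subnet from the previous waves plus the next
    batch, so Agents that already received TrustedCertList.pem stay allowed.
    """
    if per_wave < 1:
        raise ValueError("per_wave must be at least 1")
    # ip_network validates each entry and raises ValueError on bad input.
    nets = [ipaddress.ip_network(s) for s in subnets]
    waves = []
    for end in range(per_wave, len(nets) + per_wave, per_wave):
        waves.append(nets[:end])
    return waves


# Hypothetical example subnets; replace with the site's own ranges.
for number, allowed in enumerate(
    rollout_waves(["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"], per_wave=2),
    start=1,
):
    print(number, [str(net) for net in allowed])
```

Move to the next wave only once the Agents in the current one have received the TrustedCertList.pem file and Console responsiveness is acceptable.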

For further assistance please contact Carbon Black Support.

Additional Information

To help avoid this, prior to replacing the Communication Certificate customers can:

  1. Change the CertificateDelaySwapMinutes to a higher value (Server versions 8.9.4+). 
    1. Log in to the Console and navigate to https://<ServerName>/shepherd_config.php
    2. Find the Property CertificateDelaySwapMinutes and increase the Value
      • For instance, a week would be the Value 10080
      • This value should not exceed the existing Communication Certificate expiration timeframe.
  2. Replace the Communication Certificate at least 7 days (or whatever CertificateDelaySwapMinutes is set to) before the existing certificate expires, to take advantage of the longer staging period.
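
The value for CertificateDelaySwapMinutes is plain arithmetic (days × 24 × 60), and the staging window should end before the existing certificate expires. A small sketch of both checks follows; the function names and dates are illustrative and not part of the product.

```python
from datetime import datetime, timedelta

MINUTES_PER_DAY = 24 * 60


def staging_minutes(days):
    """Convert a staging period in days to a CertificateDelaySwapMinutes value."""
    return days * MINUTES_PER_DAY


def staging_fits_before_expiry(delay_minutes, replace_at, cert_expires_at):
    """True if the staging window ends on or before the old certificate expires."""
    return replace_at + timedelta(minutes=delay_minutes) <= cert_expires_at


# One week, matching the example value above.
week = staging_minutes(7)
print(week)  # 10080

# Illustrative dates: replacing 9 days before expiry leaves room for a 7-day window.
print(staging_fits_before_expiry(week, datetime(2024, 1, 1), datetime(2024, 1, 10)))  # True
# Replacing only 4 days before expiry does not.
print(staging_fits_before_expiry(week, datetime(2024, 1, 1), datetime(2024, 1, 5)))  # False
```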

NOTE:

  • There is a separate issue (fixed in Server 8.10.2) where Agents remain disconnected for the duration of CertificateDelaySwapMinutes. This only occurs when the certificate has already expired.
  • Increasing CertificateDelaySwapMinutes after replacing the Communication Certificate, or after it has already expired, will not help. It must be done beforehand.
  • If the previous certificate had expired, make sure that the IIS certificate has been updated correctly; otherwise the Agents may fail to download the new certificate.