Troubleshoot false-positive BGPSessionDown alarms caused by an SNMPv3 exception

Article ID: 394594

Products

VMware Smart Assurance Network Observability

Issue/Introduction

This article explains how to trace the symptoms of false-positive BGPSession Down alarms generated by the Smarts NPM Domain Manager when the underlying cause is an SNMPv3 exception.

Environment

Smarts 10.1.X, 24.3.X

 

Cause

Step-1: Identify the endpoint whose Established property is FALSE.

Run the following command against the BGPSession instance:

dmctl -s <NPM-BGP Domain> get BGPSession::<BGPSession instance name>
Example output, used as the reference for the rest of this article:
Properties of BGPSession::BGP-ADJ-<Endpoint-Device1>/X.X.X.X<--><Endpoint-Device2>/Y.Y.Y.Y:
                   DisplayName = BGP-ADJ-<Endpoint-Device1>/X.X.X.X<--><Endpoint-Device2>/Y.Y.Y.Y [65066<-->65066]
                     Endpoint1 = BGPProtocolEndpoint::BGP-EP-<Endpoint-Device2>/Y.Y.Y.Y
          Endpoint1DisplayName = BGP-EP-<Endpoint-Device2>/Y.Y.Y.Y-><Endpoint-Device1>/X.X.X.X [65066<-->65066] [Vl800]
          Endpoint1Established = FALSE
                     Endpoint2 = BGPProtocolEndpoint::BGP-EP-<Endpoint-Device1>/X.X.X.X
          Endpoint2DisplayName = BGP-EP-<Endpoint-Device1>/X.X.X.X-><Endpoint-Device2>/Y.Y.Y.Y [65066<-->65066] [Vl800]
          Endpoint2Established = TRUE
                   Established = FALSE
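
When many sessions need review, the Step-1 check can be scripted. The following is a minimal Python sketch, assuming the dmctl output has been saved as text in the "Name = Value" layout shown above. The helper names parse_properties and endpoints_not_established, and the device names dev1 and dev2, are hypothetical and not part of Smarts.

```python
def parse_properties(text: str) -> dict:
    """Parse 'Name = Value' lines from saved dmctl output into a dict."""
    props = {}
    for line in text.splitlines():
        # Skip the 'Properties of ...' header and any line without ' = '.
        if " = " not in line:
            continue
        name, value = line.split(" = ", 1)
        props[name.strip()] = value.strip()
    return props


def endpoints_not_established(props: dict) -> list:
    """Return the endpoint instances whose Established flag is FALSE."""
    failing = []
    for key in ("Endpoint1", "Endpoint2"):
        if props.get(key + "Established") == "FALSE":
            failing.append(props.get(key))
    return failing


# Sample text modelled on the output above; dev1/dev2 are placeholder names.
sample = """\
Properties of BGPSession::BGP-ADJ-dev1/X.X.X.X<-->dev2/Y.Y.Y.Y:
           Endpoint1 = BGPProtocolEndpoint::BGP-EP-dev2/Y.Y.Y.Y
Endpoint1Established = FALSE
           Endpoint2 = BGPProtocolEndpoint::BGP-EP-dev1/X.X.X.X
Endpoint2Established = TRUE
         Established = FALSE
"""

print(endpoints_not_established(parse_properties(sample)))
```

The returned instance names are the ones to inspect in the next step.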
 
Step-2: Review the PeerState of each endpoint that reports Established = FALSE.
In the example above, that endpoint is BGPProtocolEndpoint::BGP-EP-<Endpoint-Device2>/Y.Y.Y.Y. Its properties, retrieved with the command below, show that polling this endpoint fails with an SNMP exception (PeerState = SNMP_EXCEPTION, PeerStatePollingFail = TRUE).

Run the following command against the BGPProtocolEndpoint instance:

dmctl -s <NPM-BGP Domain> get BGPProtocolEndpoint::<BGPProtocolEndpoint instance name>
Properties of BGPProtocolEndpoint::BGP-EP-<Endpoint-Device2>/Y.Y.Y.Y:
                     Established = FALSE
                            Peer = BGPProtocolEndpoint::BGP-EP-<Endpoint-Device1>/X.X.X.X
                 PeerAdminStatus = <unknown>
                 PeerDisplayName = BGP-EP-<Endpoint-Device1>/X.X.X.X-><Endpoint-Device2>/Y.Y.Y.Y [65066<-->65066] [Vl800]
                       PeerState = SNMP_EXCEPTION
            PeerStatePollingFail = TRUE
                      PeerSystem = Router::<Endpoint-Device1>
                    PollingIndex = Y.Y.Y.Y
               PreviousPeerState = SNMP_EXCEPTION
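
The Step-2 reasoning can be expressed as a small check. This Python sketch assumes the endpoint properties are already available as a dictionary of strings with the property names shown above. The function name is_polling_failure is hypothetical, and the rule it encodes is only the pattern described in this article, not an official Smarts classification.

```python
def is_polling_failure(props: dict) -> bool:
    """True when the endpoint is not established and the SNMP poll itself failed."""
    return (
        props.get("Established") == "FALSE"
        and props.get("PeerState") == "SNMP_EXCEPTION"
        and props.get("PeerStatePollingFail") == "TRUE"
    )


# Values copied from the example output above.
endpoint = {
    "Established": "FALSE",
    "PeerState": "SNMP_EXCEPTION",
    "PeerStatePollingFail": "TRUE",
    "PreviousPeerState": "SNMP_EXCEPTION",
}

print(is_polling_failure(endpoint))
```

A result of True suggests the alarm reflects a polling problem rather than the real BGP peer state, which is the case this article covers.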
 
Step-3: To investigate the exception, capture the SNMP traffic between the Smarts NPM server and <Endpoint-Device2> with tcpdump.

Run the following command on the Smarts host:

sudo tcpdump -i <Smarts-Host interface name> -v -s 0 -w tcpdump.pcap host <Remote Device> and port <Port>
The capture shows that the device drops the SNMP requests and answers with a report PDU instead of the requested values, which is why Smarts records an SNMP exception.
 
Example export of the captured data:
=====================================================SNMP REQUEST FROM SMARTS ====================================================================
No.     Time           Source                      Destination                       Protocol Length Info
    1 0.000000       Smarts_Host-#.#.#.#         Endpoint-Device2-#.#.#.#           SNMP     377    get-request 1.3.6.1.2.1.15.3.1.3.Y.Y.Y.Y 1.3.6.1.2.1.15.3.1.9.Y.Y.Y.Y 1.3.6.1.2.1.15.3.1.14.Y.Y.Y.Y 1.3.6.1.2.1.15.3.1.10.Y.Y.Y.Y 1.3.6.1.2.1.15.3.1.8.Y.Y.Y.Y 1.3.6.1.2.1.15.3.1.6.Y.Y.Y.Y 1.3.6.1.2.1.15.3.1.5.Y.Y.Y.Y 1.3.6.1.2.1.15.3.1.4.Y.Y.Y.Y 1.3.6.1.2.1.15.3.1.2.Y.Y.Y.Y 1.3.6.1.2.1.15.3.1.1.Y.Y.Y.Y
Frame 1: 377 bytes on wire (3016 bits), 377 bytes captured (3016 bits)
Ethernet II, Src: VMware_b3:36:f7 (00:50:56:b3:36:f7), Dst: All-HSRP-routers_02 (00:00:0c:07:ac:02)
Internet Protocol Version 4, Src: Smarts_Host-#.#.#.#, Dst: Endpoint-Device2-#.#.#.#
User Datagram Protocol, Src Port: 38879, Dst Port: 161
Simple Network Management Protocol
  msgVersion: snmpv3 (3)
  msgGlobalData
  msgAuthoritativeEngineID: SNMPv3-EngineID
  msgAuthoritativeEngineBoots: 8
  msgAuthoritativeEngineTime: 21694613
  msgUserName: SNMPv3-Username
  msgAuthenticationParameters: SNMPv3-msgAuthenticationParameters
  msgPrivacyParameters: SNMPv3-msgPrivacyParameters
  msgData: encryptedPDU (1)
      encryptedPDU […]: <PDU>
          Decrypted ScopedPDU […]: <PDU>
              contextEngineID: SNMPv3-EngineID
              contextName: <MISSING>
              data: get-request (0)
                  get-request
                      request-id: 18431196
                      error-status: noError (0)
                      error-index: 0
                      variable-bindings: 10 items
                          .............
                          1.3.6.1.2.1.15.3.1.2.Y.Y.Y.Y: Value (Null)
                              Object Name: 1.3.6.1.2.1.15.3.1.2.Y.Y.Y.Y (iso.3.6.1.2.1.15.3.1.2.Y.Y.Y.Y)
                              Value (Null)              → A Null value is expected here because this is the request
==================================================================================================================================================

=====================================================RESPONSE FROM THE DEVICE ====================================================================
No.     Time           Source                            Destination                 Protocol Length Info
    2 0.035951       Endpoint-Device2-#.#.#.#           Smarts_Host-#.#.#.#         SNMP     168    report 1.3.6.1.6.3.15.1.1.5.0
Frame 2: 168 bytes on wire (1344 bits), 168 bytes captured (1344 bits)
Ethernet II, Src: VMware_b3:94:10 (00:50:56:b3:94:10), Dst: VMware_b3:36:f7 (00:50:56:b3:36:f7)
Internet Protocol Version 4, Src: Endpoint-Device2-#.#.#.#, Dst: Smarts_Host-#.#.#.#
User Datagram Protocol, Src Port: 161, Dst Port: 38879
Simple Network Management Protocol
  msgVersion: snmpv3 (3)
  msgGlobalData
  msgAuthoritativeEngineID: SNMPv3-EngineID
  msgAuthoritativeEngineBoots: 8
  msgAuthoritativeEngineTime: 21694614
  msgUserName: SNMPv3-Username
  msgAuthenticationParameters: <MISSING>
  msgPrivacyParameters: <MISSING>
  msgData: plaintext (0)
      plaintext
          contextEngineID: SNMPv3-EngineID
          contextName: <MISSING>
          data: report (8)
              report
                  request-id: 2147483647
                  error-status: noError (0)
                  error-index: 0
                  variable-bindings: 1 item
                      1.3.6.1.6.3.15.1.1.5.0: 543981
                          Object Name: 1.3.6.1.6.3.15.1.1.5.0 (iso.3.6.1.6.3.15.1.1.5.0)
                          Value (Counter32): 543981       → This counter increments for every request the SNMP agent rejects because of a wrong digest
==================================================================================================================================================
 
In the device's response, the OID 1.3.6.1.6.3.15.1.1.5.0 is usmStatsWrongDigests: the total number of packets received by the SNMP engine that were dropped because they did not contain the expected digest value.
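
The report OID identifies which USM check failed, so it helps to translate it when reading a capture. Below is a minimal Python sketch that maps the usmStats counters defined in RFC 3414 to their names and pulls the OID out of an Info column like the one in the export above. The function name explain_report and the short descriptions are illustrative wording, not text from Smarts or the device.

```python
import re

USM_STATS_PREFIX = "1.3.6.1.6.3.15.1.1."

# usmStats counters from the SNMP-USER-BASED-SM-MIB (RFC 3414).
USM_STATS = {
    1: ("usmStatsUnsupportedSecLevels", "requested security level not available"),
    2: ("usmStatsNotInTimeWindows", "engine boots/time outside the time window"),
    3: ("usmStatsUnknownUserNames", "user name not known to the agent"),
    4: ("usmStatsUnknownEngineIDs", "engine ID not known to the agent"),
    5: ("usmStatsWrongDigests", "authentication digest did not match"),
    6: ("usmStatsDecryptionErrors", "message could not be decrypted"),
}

# Matches 'report <OID>' with or without a leading dot on the OID.
REPORT_RE = re.compile(r"\breport\s+\.?((?:\d+\.)+\d+)")


def explain_report(info: str):
    """Return (name, meaning) for a USM report line, or None if not recognised."""
    match = REPORT_RE.search(info)
    if not match or not match.group(1).startswith(USM_STATS_PREFIX):
        return None
    # The first sub-identifier after the prefix selects the counter.
    counter = int(match.group(1)[len(USM_STATS_PREFIX):].split(".")[0])
    return USM_STATS.get(counter)


print(explain_report("SNMP 168 report 1.3.6.1.6.3.15.1.1.5.0"))
```

Lines that are not report PDUs, such as the get-request in frame 1, return None.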

Resolution

According to the Cisco administration guidelines, usmStatsWrongDigests is incremented when the specified password is not correct. The agent reports this error with its first varbind containing the OID .1.3.6.1.6.3.15.1.1.5.0.
The digest is derived from the authentication key, so check the Auth protocol and password first, then confirm that the Priv password is also configured correctly.
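
To illustrate why a wrong password produces a wrong digest, the sketch below derives a localized USM key from a password as described in RFC 3414 (appendix A.2) and computes a 96-bit HMAC with it. It is a simplified Python illustration and not how Smarts or the device is implemented: the message bytes are a placeholder, a real agent computes the digest over the whole SNMPv3 message with the authentication parameters zeroed, and only the MD5 and SHA-1 variants are covered. The password and engine ID are the RFC 3414 test-vector values, and the function names are hypothetical.

```python
import hashlib
import hmac


def localized_key(password: str, engine_id: bytes, algo: str = "sha1") -> bytes:
    """Derive a localized USM key from a password (RFC 3414, appendix A.2)."""
    pw = password.encode()
    if not pw:
        raise ValueError("password must not be empty")
    # Hash the password repeated out to exactly 1,048,576 bytes.
    reps, rem = divmod(1048576, len(pw))
    ku = hashlib.new(algo, pw * reps + pw[:rem]).digest()
    # Localize the key to one agent by mixing in its engine ID.
    return hashlib.new(algo, ku + engine_id + ku).digest()


def auth_digest(key: bytes, message: bytes, algo: str = "sha1") -> bytes:
    """HMAC over the message, truncated to 96 bits (12 bytes)."""
    return hmac.new(key, message, algo).digest()[:12]


engine_id = bytes.fromhex("000000000000000000000002")  # RFC 3414 test-vector engine ID
message = b"placeholder for the serialized SNMPv3 message"  # not a real PDU

good = auth_digest(localized_key("maplesyrup", engine_id), message)
bad = auth_digest(localized_key("maplesyrop", engine_id), message)  # one character off
print(good.hex(), bad.hex(), good != bad)
```

A single wrong character in the Auth password, or a mismatch in the Auth protocol (MD5 versus SHA), changes the key and therefore the digest, so the agent discards the request and increments usmStatsWrongDigests.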

Work with the device administration team to confirm that the SNMPv3 credentials configured in Smarts are valid, and correct them where needed.

If the above symptoms do not match, and yet you receive false BGPSession Down alarms, please raise a case with Technical Support for further troubleshooting.