How to monitor NIC_Card status, NIC Error Utilization in OpenStack
Article ID: 263990

Products

  • DX Unified Infrastructure Management (Nimsoft / UIM)
  • CA Unified Infrastructure Management On-Premise (Nimsoft / UIM)
  • CA Unified Infrastructure Management SaaS (Nimsoft / UIM)

Issue/Introduction

How can NIC card status, NIC errors, and NIC utilization be monitored in OpenStack? The metrics of interest are:

  • BR NIC errors
  • SR-IOV network utilization
  • TAP NIC errors

Environment

  • Release: 20.4
  • openstack probe v1.38 

Resolution

nic_monitor probe

How to monitor network interfaces via nic_monitor probe?
https://knowledge.broadcom.com/external/article?articleId=259532

The following URL lists the metrics that the nic_monitor probe currently supports:

https://techdocs.broadcom.com/us/en/ca-enterprise-software/it-operations-management/ca-unified-infrastructure-management-probes/GA/monitoring/systems-and-service-response/nic_monitor-(Network-Interface-Performance-Monitoring)/nic_monitor-(Network-Interface-Performance-Monitoring)-Metrics.html 

Note that the nic_monitor probe can only monitor interfaces locally, so a robot must be installed on the system whose interfaces you want to monitor.

OpenStack

OpenStack network troubleshooting and monitoring appears to rely on running local commands (e.g., ping, tcpdump, tracert/traceroute) and interpreting their output.

https://docs.openstack.org/operations-guide/ops-network-troubleshooting.html#finding-a-failure-in-the-path 

The nexec probe, which runs commands and scripts, might also be usable for collecting this kind of output, but it may be locked down by your security policy.
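
As an illustration of the kind of local script such an approach could run, here is a minimal sketch in Python. It assumes a Linux host where interface state and counters are exposed under /sys/class/net, and it assumes that the bridge and TAP interfaces of interest have names starting with "br" or "tap"; both the prefixes and the function names are hypothetical and should be adapted to your environment. It is not part of any UIM probe.

```python
#!/usr/bin/env python3
"""Print link state and error counters for local network interfaces."""
from pathlib import Path

# Standard Linux sysfs location for network interfaces.
SYSFS_NET = Path("/sys/class/net")
# Per-interface counters found under <iface>/statistics/ on Linux.
COUNTERS = ("rx_errors", "tx_errors", "rx_dropped", "tx_dropped")


def read_text(path, default=""):
    """Return the stripped file contents, or `default` if unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return default


def interface_stats(iface_dir):
    """Collect operstate and error counters for one interface directory."""
    stats = {
        "name": iface_dir.name,
        "operstate": read_text(iface_dir / "operstate", "unknown"),
    }
    for counter in COUNTERS:
        raw = read_text(iface_dir / "statistics" / counter, "0")
        stats[counter] = int(raw) if raw.isdigit() else 0
    return stats


def collect(root=SYSFS_NET, prefixes=("br", "tap")):
    """Return stats for interfaces whose names start with a given prefix.

    The default prefixes are an assumption about interface naming.
    """
    if not root.is_dir():
        return []
    return [
        interface_stats(entry)
        for entry in sorted(root.iterdir())
        if entry.name.startswith(prefixes)
    ]


if __name__ == "__main__":
    for s in collect():
        counters = " ".join(f"{c}={s[c]}" for c in COUNTERS)
        print(f"{s['name']} state={s['operstate']} {counters}")
```

The script prints one line per matching interface, which keeps the output easy to parse if it is later fed into an alarm or QoS workflow.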

openstack probe

The openstack probe v1.38 has not been updated since 2019, but you can still try it.
http://support.nimsoft.com/

The openstack probe supports monitoring the following incoming and outgoing network usage statistics of the instances:

- Bytes and Bytes per Second
- Packets and Packets per Second

OpenStack SNMP Support

If the OpenStack SNMP driver is implemented, you may be able to use the snmpcollector probe (or, as a simpler approach, snmpget) to monitor the device status, utilization, and errors you mentioned. Check the OpenStack documentation for monitoring options, including SNMP.

https://docs.openstack.org/ironic/queens/admin/drivers/snmp.html
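
If you poll the values yourself (for example with snmpget), utilization and error rates have to be derived from two counter samples. The following Python sketch shows that arithmetic. It assumes the target exposes the standard IF-MIB interface table and that the 32-bit counters are used; the function names and sample values are hypothetical, and whether these OIDs are available depends on the SNMP agent in your environment.

```python
# Standard IF-MIB ifTable column OIDs; append ".<ifIndex>" when polling.
IF_MIB_OIDS = {
    "ifSpeed": "1.3.6.1.2.1.2.2.1.5",        # bits per second
    "ifOperStatus": "1.3.6.1.2.1.2.2.1.8",   # 1=up, 2=down, 3=testing
    "ifInOctets": "1.3.6.1.2.1.2.2.1.10",
    "ifInErrors": "1.3.6.1.2.1.2.2.1.14",
    "ifOutOctets": "1.3.6.1.2.1.2.2.1.16",
    "ifOutErrors": "1.3.6.1.2.1.2.2.1.20",
}

COUNTER32_MODULUS = 2 ** 32  # Counter32 values wrap back to 0 here


def counter_delta(previous, current, modulus=COUNTER32_MODULUS):
    """Difference between two samples, allowing for one counter wrap."""
    return (current - previous) % modulus


def utilization_percent(octets_prev, octets_now, interval_s, if_speed_bps):
    """Percent of link capacity used in one direction over the interval."""
    if interval_s <= 0 or if_speed_bps <= 0:
        return 0.0
    bits = counter_delta(octets_prev, octets_now) * 8
    return 100.0 * bits / (interval_s * if_speed_bps)


def errors_per_second(errors_prev, errors_now, interval_s):
    """Average number of errors per second over the interval."""
    if interval_s <= 0:
        return 0.0
    return counter_delta(errors_prev, errors_now) / interval_s


if __name__ == "__main__":
    # Hypothetical samples taken 10 seconds apart on a 100 Mbit/s link.
    print(utilization_percent(1_000_000, 13_500_000, 10, 100_000_000))
    print(errors_per_second(40, 45, 10))
```

On fast links the 32-bit octet counters can wrap more than once between polls, so a short polling interval (or the 64-bit ifXTable counters, where the agent provides them) gives more reliable results.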