Troubleshooting Alarms and Performance Issues in NSX Application Platform

Article ID: 366906

Products

VMware vDefend Firewall VMware vDefend Firewall with Advanced Threat Prevention

Issue/Introduction

Alarms and performance issues in NSX Application Platform (NAPP) can arise from resource contention, configuration errors, software bugs, or infrastructure problems. This article explains how to identify, understand, and resolve these alarms and performance issues so that the platform remains stable and reliable.

Environment

This KB applies to environments running NSX Application Platform (NAPP).

Cause

Refer to the Resolution section of this article and look up the alarm by its Summary.

Resolution

1.1. Cluster Alarms:

1. Cluster CPU high alarm.

Component Name: nsx_application_platform_health.cluster_cpu_usage_high
Summary: NSX Application Platform cluster CPU usage is high.
Description: The CPU usage of NSX Application Platform cluster {napp_cluster_id} is above the high threshold value of {system_usage_threshold}%.
Recommended Action: In the NSX UI, navigate to System | NSX Application Platform | Core Services and check the System Load field of individual services to see which service is under pressure. See if the load can be reduced. If more computing power is required, click on the Scale Out button to request more resources.
Release Introduced: 3.2.0

2. Cluster CPU very high alarm.

Component Name: nsx_application_platform_health.cluster_cpu_usage_very_high
Summary: NSX Application Platform cluster CPU usage is very high.
Description: The CPU usage of NSX Application Platform cluster {napp_cluster_id} is above the very high threshold value of {system_usage_threshold}%.
Recommended Action: In the NSX UI, navigate to System | NSX Application Platform | Core Services and check the System Load field of individual services to see which service is under pressure. See if the load can be reduced. If more computing power is required, click on the Scale Out button to request more resources.
Release Introduced: 3.2.0

3. Cluster memory high alarm.

Component Name: nsx_application_platform_health.cluster_memory_usage_high
Summary: NSX Application Platform cluster memory usage is high
Description: The memory usage of NSX Application Platform cluster {napp_cluster_id} is above the high threshold value of {system_usage_threshold}%
Recommended Action: In the NSX UI, navigate to System | NSX Application Platform | Core Services and check the Memory field of individual services to see which service is under pressure. See if the load can be reduced. If more memory is required, click on the Scale Out button to request more resources.
Release Introduced: 3.2.0

4. Cluster memory very high alarm.

Component Name: nsx_application_platform_health.cluster_memory_usage_very_high
Summary: NSX Application Platform cluster memory usage is very high
Description: The memory usage of NSX Application Platform cluster {napp_cluster_id} is above the very high threshold value of {system_usage_threshold}%
Recommended Action: In the NSX UI, navigate to System | NSX Application Platform | Core Services and check the Memory field of individual services to see which service is under pressure. See if the load can be reduced. If more memory is required, click on the Scale Out button to request more resources.
Release Introduced: 3.2.0

5. Cluster disk usage high alarm.

Component Name: nsx_application_platform_health.cluster_disk_usage_high
Summary: NSX Application Platform cluster disk usage is high
Description: The disk usage of NSX Application Platform cluster {napp_cluster_id} is above the high threshold value of {system_usage_threshold}%.
Recommended Action: In the NSX UI, navigate to System | NSX Application Platform | Core Services and check the Storage field of individual services to see which service is under pressure. See if the load can be reduced. If more disk storage is required, click on the Scale Out button to request more resources. If data storage service is under strain, another way is to click on the Scale Up button to increase disk size.
Release Introduced: 3.2.0

6. Cluster disk usage very high alarm.

Component Name: nsx_application_platform_health.cluster_disk_usage_very_high
Summary: NSX Application Platform cluster disk usage is very high
Description: The disk usage of NSX Application Platform cluster {napp_cluster_id} is above the very high threshold value of {system_usage_threshold}%.
Recommended Action: In the NSX UI, navigate to System | NSX Application Platform | Core Services and check the Storage field of individual services to see which service is under pressure. See if the load can be reduced. If more disk storage is required, click on the Scale Out button to request more resources. If data storage service is under strain, another way is to click on the Scale Up button to increase disk size.
Release Introduced: 3.2.0

7. Cluster status degraded alarm.

Component Name: nsx_application_platform_health.napp_status_degraded
Summary: NSX Application Platform cluster overall status is degraded
Description: NSX Application Platform cluster {napp_cluster_id} overall status is degraded
Recommended Action: Get more information from alarms of nodes and services.
KB Link : https://knowledge.broadcom.com/external/article?articleNumber=378723
Release Introduced: 3.2.0

8. Cluster status down alarm.

Component Name: nsx_application_platform_health.napp_status_down
Summary: NSX Application Platform cluster overall status is down
Description: NSX Application Platform cluster {napp_cluster_id} overall status is down.
Recommended Action: Get more information from alarms of nodes and services.
Release Introduced: 3.2.0

9. Flow storage growth high

Component Name: nsx_application_platform_health.flow_storage_growth_high
Summary: Analytics and Data Storage disk usage is growing faster than expected
Description: Analytics and Data Storage disks are expected to be full in {predicted_full_period} days, less than current data retention period {current_retention_period} days.
Recommended Action: Connect fewer transport nodes or set narrower private IP ranges to reduce the number of unique flows. Filter out broadcast and/or multicast flows. Scale out the Analytics and Data Storage services to get more storage.
KB Link : https://knowledge.broadcom.com/external/article/319828/nsx-application-platform-napp-disk-usage.html
Release Introduced: 4.1.1
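
The Flow storage growth high alarm compares a predicted time-to-full with the current data retention period. The sketch below illustrates that comparison with a simple linear projection. It is only an illustration: the function names, the linear-growth assumption, and the sample numbers are hypothetical, and the calculation NSX Application Platform actually performs may differ.

```python
def predicted_full_days(capacity_gb: float, used_gb: float, growth_gb_per_day: float) -> float:
    """Days until the disk is full, assuming constant (linear) daily growth."""
    if growth_gb_per_day <= 0:
        # Usage is flat or shrinking, so under this model the disk never fills.
        return float("inf")
    return max(capacity_gb - used_gb, 0.0) / growth_gb_per_day


def growth_is_too_fast(predicted_days: float, retention_days: float) -> bool:
    """True when the disk would fill before the retention period elapses."""
    return predicted_days < retention_days


# Hypothetical numbers: 1000 GB disk, 700 GB used, growing 20 GB per day,
# with a 30-day retention period.
days = predicted_full_days(1000, 700, 20)
print(days, growth_is_too_fast(days, retention_days=30))
```

With these sample values the disk would fill in 15 days, which is shorter than the 30-day retention period, so the condition described by the alarm would hold.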

1.2. Node Alarms:

1. Node CPU high alarm.

Component Name: nsx_application_platform_health.node_cpu_usage_high
Summary: NSX Application Platform node CPU usage is high
Description: The CPU usage of NSX Application Platform node {napp_node_name} is above the high threshold value of {system_usage_threshold}%.
Recommended Action: In the NSX UI, navigate to System | NSX Application Platform | Core Services and check the System Load field of individual services to see which service is under pressure. See if load can be reduced. If only a small minority of the nodes have high CPU usage, by default, Kubernetes will reschedule services automatically. If most nodes have high CPU usage and load cannot be reduced, click on the Scale Out button to request more resources.
Release Introduced: 3.2.0

2. Node CPU very high alarm.

Component Name: nsx_application_platform_health.node_cpu_usage_very_high
Summary: NSX Application Platform node CPU usage is very high
Description: The CPU usage of NSX Application Platform node {napp_node_name} is above the very high threshold value of {system_usage_threshold}%.
Recommended Action: In the NSX UI, navigate to System | NSX Application Platform | Core Services and check the System Load field of individual services to see which service is under pressure. See if load can be reduced. If only a small minority of the nodes have high CPU usage, by default, Kubernetes will reschedule services automatically. If most nodes have high CPU usage and load cannot be reduced, click on the Scale Out button to request more resources.
Release Introduced: 3.2.0

3. Node memory high alarm.

Component Name: nsx_application_platform_health.node_memory_usage_high
Summary: NSX Application Platform node memory usage is high
Description: The memory usage of NSX Application Platform node {napp_node_name} is above the high threshold value of {system_usage_threshold}%.
Recommended Action: In the NSX UI, navigate to System | NSX Application Platform | Core Services and check the Memory field of individual services to see which service is under pressure. See if load can be reduced. If only a small minority of the nodes have high Memory usage, by default, Kubernetes will reschedule services automatically. If most nodes have high Memory usage and load cannot be reduced, click on the Scale Out button to request more resources.
Release Introduced: 3.2.0

4. Node memory very high alarm.

Component Name: nsx_application_platform_health.node_memory_usage_very_high
Summary: NSX Application Platform node memory usage is very high
Description: The memory usage of NSX Application Platform node {napp_node_name} is above the very high threshold value of {system_usage_threshold}%.
Recommended Action: In the NSX UI, navigate to System | NSX Application Platform | Core Services and check the Memory field of individual services to see which service is under pressure. See if load can be reduced. If only a small minority of the nodes have high Memory usage, by default, Kubernetes will reschedule services automatically. If most nodes have high Memory usage and load cannot be reduced, click on the Scale Out button to request more resources.
Release Introduced: 3.2.0

5. Node disk usage high alarm.

Component Name: nsx_application_platform_health.node_disk_usage_high
Summary: NSX Application Platform node disk usage is high
Description: The disk usage of NSX Application Platform node {napp_node_name} is above the high threshold value of {system_usage_threshold}%.
Recommended Action: In the NSX UI, navigate to System | NSX Application Platform | Core Services and check the Storage field of individual services to see which service is under pressure. Clean up unused data or logs to free up disk resources and see if the load can be reduced. If more disk storage is required, Scale Out the service under pressure. If the data storage service is under strain, you can alternatively click on the Scale Up button to increase disk size.
KB Link: https://knowledge.broadcom.com/external/article/378730
Release Introduced: 3.2.0

6. Node disk usage very high alarm.

Component Name: nsx_application_platform_health.node_disk_usage_very_high
Summary: NSX Application Platform node disk usage is very high
Description: The disk usage of NSX Application Platform node {napp_node_name} is above the very high threshold value of {system_usage_threshold}%.
Recommended Action: In the NSX UI, navigate to System | NSX Application Platform | Core Services and check the Storage field of individual services to see which service is under pressure. Clean up unused data or logs to free up disk resources and see if the load can be reduced. If more disk storage is required, Scale Out the service under pressure. If the data storage service is under strain, you can alternatively click on the Scale Up button to increase disk size.
KB Link: https://knowledge.broadcom.com/external/article/378730
Release Introduced: 3.2.0

7. Node status degraded alarm.

Component Name: nsx_application_platform_health.node_status_degraded
Summary: NSX Application Platform node status is degraded.
Description: NSX Application Platform node {napp_node_name} is degraded.
Recommended Action: In the NSX UI, navigate to System | NSX Application Platform | Resources to check which node is degraded. Check network, memory and CPU usage of the node. Reboot the node if it is a worker node.
Release Introduced: 3.2.0

8. Node status down alarm.

Component Name: nsx_application_platform_health.node_status_down
Summary: NSX Application Platform node status is down
Description: NSX Application Platform node {napp_node_name} is not running
Recommended Action: In the NSX UI, navigate to System | NSX Application Platform | Resources to check which node is down. Check network, memory and CPU usage of the node. Reboot the node if it is a worker node.
Release Introduced: 3.2.0
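
The node CPU and memory alarms above distinguish between a small minority of busy nodes (where Kubernetes reschedules services by default) and most nodes being busy (where Scale Out is recommended). The following is a minimal sketch of that decision. The function name, the node names, and the "more than half" cut-off for "most nodes" are assumptions made for illustration; they are not defined by NSX Application Platform.

```python
def suggest_action(node_usage_pct: dict, high_threshold_pct: float) -> str:
    """Suggest a next step from per-node usage (CPU or memory, in percent)."""
    if not node_usage_pct:
        return "no data"
    # Nodes whose usage is above the high threshold.
    hot = [name for name, pct in node_usage_pct.items() if pct > high_threshold_pct]
    if not hot:
        return "no action"
    # "Most nodes" is interpreted here as more than half; this cut-off is an assumption.
    if len(hot) * 2 > len(node_usage_pct):
        return "reduce load or scale out"
    return "wait for Kubernetes to reschedule"


# Hypothetical readings for three worker nodes and an 80% high threshold.
print(suggest_action({"worker-1": 92, "worker-2": 40, "worker-3": 35}, 80))
print(suggest_action({"worker-1": 92, "worker-2": 88, "worker-3": 35}, 80))
```

In the first call only one of three nodes is above the threshold, so the sketch suggests waiting for rescheduling; in the second call two of three nodes are above it, so it suggests reducing load or scaling out.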

1.3. Data Storage Alarms:

1. Data Storage service CPU high alarm.

Component Name: nsx_application_platform_health.datastore_cpu_usage_high
Summary: Data Storage service CPU usage is high
Description: The CPU usage of Data Storage service is above the high threshold value of {system_usage_threshold}%.
Recommended Action: Scale out all services or the Data Storage service.
Release Introduced: 3.2.0

2. Data Storage service CPU very high alarm.

Component Name: nsx_application_platform_health.datastore_cpu_usage_very_high
Summary: Data Storage service CPU usage is very high
Description: The CPU usage of Data Storage service is above the very high threshold value of {system_usage_threshold}%.
Recommended Action: Scale out all services or the Data Storage service.
Release Introduced: 3.2.0

3. Data Storage service memory high alarm.

Component Name: nsx_application_platform_health.datastore_memory_usage_high
Summary: Data Storage service memory usage is high
Description: The memory usage of Data Storage service is above the high threshold value of {system_usage_threshold}%
Recommended Action: Scale out all services or the Data Storage service.
Release Introduced: 3.2.0

4. Data Storage service memory very high alarm.

Component Name: nsx_application_platform_health.datastore_memory_usage_very_high
Summary: Data Storage service memory usage is very high
Description: The memory usage of Data Storage service is above the very high threshold value of {system_usage_threshold}%
Recommended Action: Scale out all services or the Data Storage service.
Release Introduced: 3.2.0

5. Data Storage service disk usage high alarm.

Component Name: nsx_application_platform_health.datastore_disk_usage_high
Summary: Data Storage service disk usage is high
Description: The disk usage of Data Storage service is above the high threshold value of {system_usage_threshold}%.
Recommended Action: Scale out or scale up the data storage service
Release Introduced: 3.2.0

6. Data Storage service disk usage very high alarm.

Component Name: nsx_application_platform_health.datastore_disk_usage_very_high
Summary: Data Storage service disk usage is very high
Description: The disk usage of Data Storage service is above the very high threshold value of {system_usage_threshold}%.
Recommended Action: Scale out or scale up the data storage service
Release Introduced: 3.2.0

1.4. Messaging Service Alarms:

1. Messaging service CPU high alarm.

Component Name: nsx_application_platform_health.messaging_cpu_usage_high
Summary: Messaging service CPU usage is high.
Description: The CPU usage of Messaging service is above the high threshold value of {system_usage_threshold}%.
Recommended Action: Scale out all services or the Messaging service
Release Introduced: 3.2.0

2. Messaging service CPU very high alarm.

Component Name: nsx_application_platform_health.messaging_cpu_usage_very_high
Summary: Messaging service CPU usage is very high.
Description: The CPU usage of Messaging service is above the very high threshold value of {system_usage_threshold}%.
Recommended Action: Scale out all services or the Messaging service
Release Introduced: 3.2.0

3. Messaging service memory high alarm.

Component Name: nsx_application_platform_health.messaging_memory_usage_high
Summary: Messaging service memory usage is high
Description: The memory usage of Messaging service is above the high threshold value of {system_usage_threshold}%.
Recommended Action: Scale out all services or the Messaging service.
Release Introduced: 3.2.0

4. Messaging service memory very high alarm.

Component Name: nsx_application_platform_health.messaging_memory_usage_very_high
Summary: Messaging service memory usage is very high
Description: The memory usage of Messaging service is above the very high threshold value of {system_usage_threshold}%.
Recommended Action: Scale out all services or the Messaging service.
Release Introduced: 3.2.0

5. Messaging service disk usage high alarm.

Component Name: nsx_application_platform_health.messaging_disk_usage_high
Summary: Messaging service disk usage is high
Description: The disk usage of Messaging service is above the high threshold value of {system_usage_threshold}%.
Recommended Action: Clean up files not needed. Scale out all services or the Messaging service
Release Introduced: 3.2.0

6. Messaging service disk usage very high alarm.

Component Name: nsx_application_platform_health.messaging_disk_usage_very_high
Summary: Messaging service disk usage is very high
Description: The disk usage of Messaging service is above the very high threshold value of {system_usage_threshold}%.
Recommended Action: Clean up files not needed. Scale out all services or the Messaging service
Release Introduced: 3.2.0

1.5. Analytics Service Alarms:

1. Analytics service CPU high alarm.

Component Name: nsx_application_platform_health.analytics_cpu_usage_high
Summary: Analytics service CPU usage is high.
Description: The CPU usage of Analytics service is above the high threshold value of {system_usage_threshold}%
Recommended Action: Scale out all services or the Analytics service.
Release Introduced: 3.2.0

2. Analytics service CPU very high alarm.

Component Name: nsx_application_platform_health.analytics_cpu_usage_very_high
Summary: Analytics service CPU usage is very high.
Description: The CPU usage of Analytics service is above the very high threshold value of {system_usage_threshold}%
Recommended Action: Scale out all services or the Analytics service.
Release Introduced: 3.2.0

3. Analytics service memory high alarm.

Component Name: nsx_application_platform_health.analytics_memory_usage_high
Summary: Analytics service memory usage is high
Description: The memory usage of Analytics service is above the high threshold value of {system_usage_threshold}%.
Recommended Action: Scale out all services or the Analytics service
Release Introduced: 3.2.0

4. Analytics service memory very high alarm.

Component Name: nsx_application_platform_health.analytics_memory_usage_very_high
Summary: Analytics service memory usage is very high
Description: The memory usage of Analytics service is above the very high threshold value of {system_usage_threshold}%.
Recommended Action: Scale out all services or the Analytics service
Release Introduced: 3.2.0

5. Analytics service disk usage high alarm.

Component Name: nsx_application_platform_health.analytics_disk_usage_high
Summary: Analytics service disk usage is high.
Description: The disk usage of Analytics service is above the high threshold value of {system_usage_threshold}%.
Recommended Action: Clean up files not needed. Scale out all services or the Analytics service.
Release Introduced: 3.2.0

6. Analytics service disk usage very high alarm.

Component Name: nsx_application_platform_health.analytics_disk_usage_very_high
Summary: Analytics service disk usage is very high
Description: The disk usage of Analytics service is above the very high threshold value of {system_usage_threshold}%
Recommended Action: Scale out all services or the Analytics service
Release Introduced: 3.2.0

1.6. Config DB service:

1. Config DB service CPU high alarm.

Component Name: nsx_application_platform_health.configuration_db_cpu_usage_high
Summary: Configuration Database service CPU usage is high.
Description: The CPU usage of Configuration Database service is above the high threshold value of {system_usage_threshold}%.
Recommended Action: Scale out all services.
Release Introduced: 3.2.0

2. Config DB service CPU very high alarm.

Component Name: nsx_application_platform_health.configuration_db_cpu_usage_very_high
Summary: Configuration Database service CPU usage is very high.
Description: The CPU usage of Configuration Database service is above the very high threshold value of {system_usage_threshold}%.
Recommended Action: Scale out all services.
Release Introduced: 3.2.0

3. Config DB service memory high alarm.

Component Name: nsx_application_platform_health.configuration_db_memory_usage_high
Summary: Configuration Database service memory usage is high
Description: The memory usage of Configuration Database service is above the high threshold value of {system_usage_threshold}%.
Recommended Action: Scale out all services
Release Introduced: 3.2.0

4. Config DB service memory very high alarm.

Component Name: nsx_application_platform_health.configuration_db_memory_usage_very_high
Summary: Configuration Database service memory usage is very high.
Description: The memory usage of Configuration Database service is above the very high threshold value of {system_usage_threshold}%.
Recommended Action: Scale out all services.
Release Introduced: 3.2.0

5. Config DB service disk usage high alarm.

Component Name: nsx_application_platform_health.configuration_db_disk_usage_high
Summary: Configuration Database service disk usage is high
Description: The disk usage of Configuration Database service is above the high threshold value of {system_usage_threshold}%
Recommended Action: Clean up files not needed. Scale out all services.
Release Introduced: 3.2.0

6. Config DB service disk usage very high alarm.

Component Name: nsx_application_platform_health.configuration_db_disk_usage_very_high
Summary: Configuration Database service disk usage is very high
Description: The disk usage of Configuration Database service is above the very high threshold value of {system_usage_threshold}%.
Recommended Action: Clean up files not needed. Scale out all services.
Release Introduced: 3.2.0

1.7. Metrics Alarms:

1. Metrics service CPU high alarm.

Component Name: nsx_application_platform_health.metrics_cpu_usage_high
Summary: Metrics service CPU usage is high
Description: The CPU usage of Metrics service is above the high threshold value of {system_usage_threshold}%.
Recommended Action: Scale out all services.
Release Introduced: 3.2.0

2. Metrics service CPU very high alarm.

Component Name: nsx_application_platform_health.metrics_cpu_usage_very_high
Summary: Metrics service CPU usage is very high
Description: The CPU usage of Metrics service is above the very high threshold value of {system_usage_threshold}%.
Recommended Action: Scale out all services
Release Introduced: 3.2.0

3. Metrics service memory high alarm.

Component Name: nsx_application_platform_health.metrics_memory_usage_high
Summary: Metrics service memory usage is high
Description: The memory usage of Metrics service is above the high threshold value of {system_usage_threshold}%
Recommended Action: Scale out all services
Release Introduced: 3.2.0

4. Metrics service memory very high alarm.

Component Name: nsx_application_platform_health.metrics_memory_usage_very_high
Summary: Metrics service memory usage is very high
Description: The memory usage of Metrics service is above the very high threshold value of {system_usage_threshold}%.
Recommended Action: Scale out all services
Release Introduced: 3.2.0

5. Metrics service disk usage high alarm.

Component Name: nsx_application_platform_health.metrics_disk_usage_high
Summary: Metrics service disk usage is high.
Description: The disk usage of Metrics service is above the high threshold value of {system_usage_threshold}%.
Recommended Action: Follow the steps at https://knowledge.broadcom.com/external/article?legacyId=93274
Release Introduced: 3.2.0

6. Metrics service disk usage very high alarm.

Component Name: nsx_application_platform_health.metrics_disk_usage_very_high
Summary: Metrics service disk usage is very high.
Description: The disk usage of Metrics service is above the very high threshold value of {system_usage_threshold}%
Recommended Action: Follow the steps at https://knowledge.broadcom.com/external/article?legacyId=93274
Release Introduced: 3.2.0

1.8. Platform Alarms:

1. Platform service CPU high alarm.

Component Name: nsx_application_platform_health.platform_cpu_usage_high
Summary: Platform Services service CPU usage is high
Description: The CPU usage of Platform Services service is above the high threshold value of {system_usage_threshold}%.
Recommended Action: Scale out all services
Release Introduced: 3.2.0

2. Platform service CPU very high alarm.

Component Name: nsx_application_platform_health.platform_cpu_usage_very_high
Summary: Platform Services service CPU usage is very high
Description: The CPU usage of Platform Services service is above the very high threshold value of {system_usage_threshold}%.
Recommended Action: Scale out all services
Release Introduced: 3.2.0

3. Platform service memory high alarm.

Component Name: nsx_application_platform_health.platform_memory_usage_high
Summary: Platform Services service memory usage is high
Description: The memory usage of Platform Services service is above the high threshold value of {system_usage_threshold}%.
Recommended Action: Scale out all services
Release Introduced: 3.2.0

4. Platform service memory very high alarm.

Component Name: nsx_application_platform_health.platform_memory_usage_very_high
Summary: Platform Services service memory usage is very high
Description: The memory usage of Platform Services service is above the very high threshold value of {system_usage_threshold}%
Recommended Action: Scale out all services
Release Introduced: 3.2.0

5. Platform service disk usage high alarm.

Component Name: nsx_application_platform_health.platform_disk_usage_high
Summary: Platform Services service disk usage is high
Description: The disk usage of Platform Services service is above the high threshold value of {system_usage_threshold}%.
Recommended Action: Invoke the command to get disk usage : `napp-k exec -it $(napp-k get pods | grep cluster | cut -d ' ' -f 1) -c cluster-api -- sh -c 'kubectl df-pv'` Clean up files not needed. Scale out all services.
Release Introduced: 3.2.0

6. Platform service disk usage very high alarm.

Component Name: nsx_application_platform_health.platform_disk_usage_very_high
Summary: Platform Services service disk usage is very high
Description: The disk usage of Platform Services service is above the very high threshold value of {system_usage_threshold}%.
Recommended Action: Clean up files not needed. Scale out all services.
Release Introduced: 3.2.0

1.9. Service Status Alarm:

1. Service status degraded alarm.

Component Name: nsx_application_platform_health.service_status_degraded
Summary: Service status is degraded.
Description: Service {napp_service_name} is degraded. The service may still be able to reach a quorum while pods associated with {napp_service_name} are not all stable. Resources consumed by these unstable pods may be released.
Recommended Action: In the NSX UI, navigate to System | NSX Application Platform | Core Services to check which service is degraded. Invoke the NSX API GET /napp/api/v1/platform/monitor/feature/health to check which specific service is degraded and the reason behind it. Invoke the following CLI command to restart the degraded service if necessary: `kubectl rollout restart <statefulset/deployment> <service_name> -n <namespace>` Degraded services can function correctly but performance is sub-optimal.
Release Introduced: 3.2.0

2. Service status down alarm.

Component Name: nsx_application_platform_health.service_status_down
Summary: Service status is down
Description: Service {napp_service_name} is not running.
Recommended Action: In the NSX UI, navigate to System | NSX Application Platform | Core Services to check which service is down. Invoke the NSX API GET /napp/api/v1/platform/monitor/feature/health to check which specific service is down and the reason behind it. Follow the steps at https://knowledge.broadcom.com/external/article?legacyId=96890
Release Introduced: 3.2.0

3. Manager disconnected alarm.

Component Name: nsx_application_platform_communication.manager_disconnected
Summary: The NSX Application Platform cluster is disconnected from the NSX management cluster
Description: The NSX Application Platform cluster {napp_cluster_id} is disconnected from the NSX management cluster.
Recommended Action: Check whether the manager cluster certificate, manager node certificates, kafka certificate and ingress certificate match on both NSX Manager and the NSX Application Platform cluster. Check expiration dates of the above mentioned certificates to make sure they are valid. Check the network connection between NSX Manager and NSX Application Platform cluster and resolve any network connection failures.
KB Link : https://knowledge.broadcom.com/external/article/378726
Release Introduced: 3.2.0
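
The Manager disconnected alarm asks you to check the expiration dates of the certificates involved. As a rough aid, the sketch below computes how many days remain before a certificate expires from its "notAfter" text (the format that `openssl x509 -noout -enddate` prints after `notAfter=`). The function name and sample dates are hypothetical, and how you retrieve each certificate from NSX Manager or the NSX Application Platform cluster is outside the scope of this sketch.

```python
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before a certificate expires (negative if already expired).

    not_after uses the certificate "notAfter" text format,
    for example "Jan  1 00:00:00 2030 GMT".
    """
    if now is None:
        now = time.time()
    # ssl.cert_time_to_seconds converts the notAfter text to seconds since the epoch.
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


# Hypothetical expiry date; prints the days remaining as of the current time.
print(round(days_until_expiry("Jan  1 00:00:00 2030 GMT"), 1))
```

A negative result means the certificate has already expired and would need to be replaced before the connection can recover.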

1.10. Kafka lag alarms

1. Data processing slow in kafka topic Raw Flow.

Component Name: nsx_application_platform_communication.delay_detected_in_messaging_rawflow
Summary: Slow data processing detected in messaging topic Raw Flow.
Description: The number of pending messages in the messaging topic Raw Flow is above the pending message threshold of {napp_messaging_lag_threshold}.
Recommended Action: Add nodes and then scale up the NSX Application Platform cluster. If the bottleneck can be attributed to a specific service, for example, the analytics service, then scale up the specific service when the new nodes are added. If you are unable to scale out the cluster immediately, try one of the other options in https://knowledge.broadcom.com/external/article?legacyId=91932
KB link: https://knowledge.broadcom.com/external/article?legacyId=91932
Release Introduced: 3.2.0

2. Data processing slow in kafka topic Over Flow.

Component Name: nsx_application_platform_communication.delay_detected_in_messaging_overflow
Summary: Slow data processing detected in messaging topic Over Flow.
Description: The number of pending messages in the messaging topic Over Flow is above the pending message threshold of {napp_messaging_lag_threshold}
Recommended Action: Add nodes and then scale up the NSX Application Platform cluster. If the bottleneck can be attributed to a specific service, for example, the analytics service, then scale up the specific service when the new nodes are added. If you are unable to scale out the cluster immediately, try one of the other options in https://knowledge.broadcom.com/external/article?legacyId=91932
KB link: https://knowledge.broadcom.com/external/article?legacyId=91932
Release Introduced: 3.2.0
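
The Kafka lag alarms fire when the number of pending messages in a topic is above a threshold. This article does not state how NSX Application Platform measures pending messages; the sketch below follows the common Kafka definition of consumer lag (latest offset minus committed offset, summed over partitions). The function names, offsets, and threshold are hypothetical.

```python
def pending_messages(end_offsets: dict, committed_offsets: dict) -> int:
    """Total consumer lag across partitions: latest offset minus committed offset."""
    total = 0
    for partition, end in end_offsets.items():
        # A partition with no committed offset is treated as fully pending.
        committed = committed_offsets.get(partition, 0)
        total += max(end - committed, 0)
    return total


def lag_exceeds_threshold(lag: int, threshold: int) -> bool:
    """True when the pending message count is above the threshold."""
    return lag > threshold


# Hypothetical offsets for a two-partition topic and a threshold of 400 messages.
lag = pending_messages({0: 1500, 1: 1200}, {0: 1000, 1: 1200})
print(lag, lag_exceeds_threshold(lag, 400))
```

Here partition 0 has 500 unprocessed messages and partition 1 has none, so the total of 500 is above the sample threshold of 400.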

1.11. TN alarms

1. TN flow exp disconnected

Component Name: nsx_application_platform_communication.tn_flow_exp_disconnected
Summary: A Transport node is disconnected from its NSX Messaging Broker
Description: The flow exporter on Transport node {entity_id} is disconnected from its messaging broker {messaging_broker_info}. Data collection is affected
Recommended Action: Restart the messaging service if it is not running. Resolve the network connection failure between the Transport node flow exporter and its NSX messaging broker.
KB link: https://knowledge.broadcom.com/external/article?articleNumber=330482
Release Introduced: 3.2.0

2. TN flow exp disconnected on dpu

Component Name: nsx_application_platform_communication.tn_flow_exp_disconnected_on_dpu
Summary: A Transport node is disconnected from its NSX messaging broker
Description: The flow exporter on Transport node {entity_id} DPU {dpu_id} is disconnected from its messaging broker {messaging_broker_info}. Data collection is affected
Recommended Action: Restart the messaging service if it is not running. Resolve the network connection failure between the Transport node flow exporter and its NSX messaging broker.
Release Introduced: 4.0.0
