NAPP in degraded state due to Core Analytics Service reporting an Alarm


Article ID: 368538


Updated On:

Products

VMware vDefend Firewall with Advanced Threat Prevention

VMware vDefend Firewall

Issue/Introduction

NAPP (NSX Application Platform) is in a degraded state, and the Health API check output reports the druid-historical service as DOWN:

 


        {
            "name": "druid-historical",
            "readyReplica": 0,
            "reason": "Back-off restarting failed container;Container image \"projects.registry.vmware.com/nsx_application_platform/clustering/third-party/druid@sha256:e0184c88019ff461652fa1583d68342b88aa3728f5b3992ebc8981c5f9666905\" already present on machine;Readiness probe failed: Get \"https://192.xx.xx.xx:8283/status/health\": dial tcp 192.xx.xx.xx:8283: connect: connection refused;Liveness probe failed: Get \"https://192.xx.xx.xx:8283/status/health\": dial tcp 192.xx.xx.xx:8283: connect: connection refused;Back-off restarting failed container;Container image \"projects.registry.vmware.com/nsx_application_platform/clustering/third-party/druid@sha256:e0184c88019ff461652fa1583d68342b88aa3728f5b3992ebc8981c5f9666905\" already present on machine;Liveness probe failed: Get \"https://192.xx.xx.xx:8283/status/health\": dial tcp 192.xx.xx.xx:8283: connect: connection refused;Readiness probe failed: Get \"https://192.xx.xx.xx:8283/status/health\": dial tcp 192.xx.xx.xx:8283: connect: connection refused;",
            "status": "DOWN",
            "totalReplica": 2
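The "reason" field above repeats the same few messages several times, which makes it hard to read. As a rough illustration only, the following Python sketch lists services reported as DOWN and collapses the repeated messages. It assumes the health output has already been parsed into a list of dictionaries with the fields shown in the excerpt (name, readyReplica, totalReplica, status, reason). The function names are hypothetical and are not part of any NAPP tooling.

```python
def unique_reasons(reason):
    """Split a ';'-separated reason string and drop repeats, keeping first-seen order."""
    seen = []
    for part in reason.split(";"):
        message = part.strip()
        if message and message not in seen:
            seen.append(message)
    return seen


def summarize_down(services):
    """Return a short summary for every entry whose status is "DOWN".

    `services` is assumed to be a list of dicts shaped like the excerpt above;
    the real Health API response may wrap these entries differently.
    """
    summary = []
    for svc in services:
        if svc.get("status") == "DOWN":
            summary.append({
                "name": svc.get("name"),
                "ready": f'{svc.get("readyReplica", 0)}/{svc.get("totalReplica", 0)}',
                "reasons": unique_reasons(svc.get("reason", "")),
            })
    return summary
```

Applied to the excerpt above, the summary would show druid-historical with 0/2 replicas ready and four distinct messages: the back-off restart, the image already being present, and the failed readiness and liveness probes.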

 

 

The druid-historical logs show:

 

2024-05-24T10:37:52,671 INFO [Segment-Load-Startup-0] org.apache.druid.server.coordination.SegmentLoadDropHandler - Loading segment[13610/27455][active_flow_2023-12-20T01:00:00.000Z_2023-12-20T02:00:00.000Z_2023-12-20T01:00:30.439Z]

2024-05-24T10:37:52,673 INFO [Segment-Load-Startup-0] org.apache.druid.server.coordination.SegmentLoadDropHandler - Loading segment[13611/27455][active_flow_2024-03-10T15:00:00.000Z_2024-03-10T16:00:00.000Z_2024-03-10T15:00:45.290Z_2]

2024-05-24T10:37:52,687 INFO [Segment-Load-Startup-0] org.apache.druid.server.coordination.SegmentLoadDropHandler - Loading segment[13612/27455][active_flow_2024-01-14T11:00:00.000Z_2024-01-14T12:00:00.000Z_2024-01-14T11:00:15.586Z_1]

2024-05-24T10:37:52,700 INFO [Segment-Load-Startup-0] org.apache.druid.server.coordination.SegmentLoadDropHandler - Loading segment[13613/27455][active_flow_2024-01-25T07:00:00.000Z_2024-01-25T08:00:00.000Z_2024-01-25T07:00:15.711Z_2]

2024-05-24T10:37:52,717 INFO [Segment-Load-Startup-0] org.apache.druid.server.coordination.SegmentLoadDropHandler - Loading segment[13614/27455][active_flow_2024-04-07T08:00:00.000Z_2024-04-07T09:00:00.000Z_2024-04-07T08:00:15.583Z_3]

2024-05-24T10:37:52,733 INFO [Segment-Load-Startup-0] org.apache.druid.server.coordination.SegmentLoadDropHandler - Loading segment[13615/27455][active_flow_2024-01-07T13:00:00.000Z_2024-01-07T14:00:00.000Z_2024-01-07T13:00:30.617Z_2]

 

AND

 

2024-05-24T09:43:42,365 INFO [NamespaceExtractionCacheManager-1] org.apache.druid.server.lookup.namespace.JdbcCacheGenerator - Finished loading 40 values (10298 bytes) for [namespace [JdbcExtractionNamespace{connectorConfig=DbConnectorConfig{createTables=true, connectURI='jdbc:postgresql://postgresql-ha-pgpool:5432/pace?ssl=true&usessl=true&sslmode=prefer&socketTimeout=6000&connectTimeout=6000', user='postgres', passwordProvider=org.apache.druid.metadata.DefaultPasswordProvider, dbcpProperties=null}, table='normalizedgroupconfig', keyColumn='managerid', valueColumn='metainfo', tsColumn='null', filter='null', pollPeriod=PT30S, maxHeapPercentage=10}] : org.apache.druid.server.lookup.namespace.cache.CacheScheduler$EntryImpl@73d11bc2] in 37,744,631,597 ns

java.lang.OutOfMemoryError: Java heap space

Dumping heap to /data/dump/druid/historical ...

Unable to create /data/dump/druid/historical: No such file or directory

Terminating due to java.lang.OutOfMemoryError: Java heap space
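To check a saved copy of the druid-historical log for the same symptoms, a small script along these lines may help. It is a minimal sketch, assuming the log text has already been collected into a string. It looks only for the two patterns visible in the excerpts above (the "Loading segment[n/total]" progress counter and the java.lang.OutOfMemoryError marker), and the function name is hypothetical.

```python
import re

# Matches the progress counter in lines such as "Loading segment[13615/27455][...]".
SEGMENT_RE = re.compile(r"Loading segment\[(\d+)/(\d+)\]")
OOM_MARKER = "java.lang.OutOfMemoryError"


def scan_historical_log(text):
    """Report the last segment-load progress seen and any out-of-memory lines."""
    last_progress = None
    oom_lines = []
    for line in text.splitlines():
        match = SEGMENT_RE.search(line)
        if match:
            # (segments loaded so far, total segments to load at startup)
            last_progress = (int(match.group(1)), int(match.group(2)))
        if OOM_MARKER in line:
            oom_lines.append(line.strip())
    return {"last_segment_progress": last_progress, "oom_lines": oom_lines}
```

A non-empty "oom_lines" list together with a segment counter that never reaches its total would be consistent with the pod being terminated before it finishes loading segments at startup.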

Environment

NSX 3.2 and NAPP 4.0.1

Cause

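The druid-historical pods ran out of Java heap space (java.lang.OutOfMemoryError: Java heap space). The increasing load on each pod led to a memory leak that exhausted the heap, so the containers were terminated and restarted repeatedly, leaving 0 of 2 replicas ready.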

Resolution

Please contact Broadcom Support for further assistance.