Aria Operations for Networks shows Indexer Lag that keeps increasing, up to 2 weeks

Article ID: 411232


Updated On:

Products

VCF Operations for Networks

Issue/Introduction

  1. An upgrade was attempted and failed, and all nodes were restored to snapshots.
  2. After the snapshot restore of the 3 Platform nodes and 2 Collector nodes, the Indexer Lag keeps increasing, up to 2 weeks.

    The GUI shows the following for Indexer Lag:


  3. Verify on each Platform node that the Indexer Lag is consistent across all nodes (if this is a Platform cluster).
    Run the command below on each Platform node, starting with Platform node 1:
    ubuntu@platform1:~$ rdb
    arkin> indexer_status
    cid|                      Bookmark|  Lag(sec)
    10530|  04 Sep 2025 14:21:51.306 UTC|   1206165
    
    ubuntu@platform2:~$ rdb
    arkin> indexer_status
    cid|                      Bookmark|  Lag(sec)
    10530|  04 Sep 2025 14:21:51.306 UTC|   1206225
    
    ubuntu@platform3:~$ rdb
    arkin> indexer_status
    cid|                      Bookmark|  Lag(sec)
    10530|  04 Sep 2025 14:21:51.306 UTC|   1206232
    Note: These values are in seconds and need to be converted to hours, then days/weeks, to compare against the time shown in the UI. For example: 1206165 seconds ≈ 335 hours ≈ 1.99 weeks. (A conversion sketch is included after these steps.)

  4. On Platform node 1, review the hadoop-yarn container logs and drill down to the taskmanager.log file at the location below:

    /var/log/arkin/hadoop-yarn/containers/application_XXXXXXXXXXXXX_0002/container_e64_XXXXXXXXXXXXX_0002_01_000002/taskmanager.log

    Note: If this is a cluster, the taskmanager.log file on each Platform node in the cluster needs to be checked for the error described in the next step.

  5. Once the taskmanager.log file has been located, search it for the string ERROR fdb.helpers.Chunker to verify the error entries below (a search command sketch is included after these steps):
    2025-09-18T19:21:18.266Z ERROR fdb.helpers.Chunker fdb-config-store-exec-111 stitchChunks:90 could not find chunk at index 0_ numChunks=1 keys=[] returning nullvalue
    2025-09-18T19:21:18.266Z ERROR fdb.stores.FdbKvStore fdb-config-store-exec-111 lambda__getKey_23:384 __r___fetchChunks: cid=0 key=TimedConfigIndexer:BookmarkToken:127 chunkPtr=_x00_x00_xa1_xa9Y<_x07__x00_x00_x00_x00 numChunks=1 caller_id=KvGetWithVersion
    com.vnera.storage.config.fdb.helpers.Chunker$ChunkStitchException: missing chunk piece: 0
            at com.vnera.storage.config.fdb.helpers.Chunker.stitchChunks(Chunker.java:91) ~[blob_p-b22f7056cc377095ba4eee18a467324e7c1cb01f-c4ef19012f8cae5f21abd0ae3a1184b4:?]
            at com.vnera.storage.config.fdb.stores.FdbKvStore.lambda$getKey$23(FdbKvStore.java:380) ~[blob_p-b22f7056cc377095ba4eee18a467324e7c1cb01f-c4ef19012f8cae5f21abd0ae3a1184b4:?]
            at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:642) [?:?]
            at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) [?:?]
            at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073) [?:?]
            at com.apple.foundationdb.async.AsyncUtil$LoopPartial.apply(AsyncUtil.java:350) [blob_p-b22f7056cc377095ba4eee18a467324e7c1cb01f-c4ef19012f8cae5f21abd0ae3a1184b4:?]
            at com.apple.foundationdb.async.AsyncUtil$LoopPartial.apply(AsyncUtil.java:332) [blob_p-b22f7056cc377095ba4eee18a467324e7c1cb01f-c4ef19012f8cae5f21abd0ae3a1184b4:?]
            at java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:930) [?:?]
            at java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:907) [?:?]
            at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478) [?:?]
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
            at java.lang.Thread.run(Thread.java:829) [?:?]
    2025-09-18T19:21:18.267Z INFO common.utils.Tracer Source: SDMProcessSRC -> GenSDM -> Filter -> MetStoreMap -> (Sink: RAW_METRIC_SINK, Sink: FlinkKafkaProducer, async wait operator-> Timestamps/Watermarks -> Flat Map, Filter -> Map) (15/18)_0 tee:41 FlowStoreProgram: start. num_triggers=6, collector_id=-1925436150 wall_time=1758223278267
    2025-09-18T19:21:18.267Z INFO storage.utils.ConfigStoreUtils flow-store-program-exec-29 storeDenormObject:626 STORED DENORM, key=10530:515:4588072929907821708, type=515, property _denorm
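
Converting the Indexer Lag (referenced in step 3): the lag reported by indexer_status is in seconds, so a quick shell conversion such as the sketch below can be used to compare it against the UI. This is a minimal sketch; the lag_seconds value is the example from step 3 and should be replaced with the value reported on your node.

    lag_seconds=1206165          # example value from the indexer_status output above
    echo "hours: $((lag_seconds / 3600))"
    echo "days:  $((lag_seconds / 86400))"
    awk -v s="$lag_seconds" 'BEGIN { printf "weeks: %.2f\n", s / 604800 }'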
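
Locating the error in the container logs (referenced in step 5): rather than browsing each container directory by hand, a recursive grep such as the sketch below can be run on each Platform node. The application and container directory names vary per deployment; the search string is the one given in step 5.

    # list taskmanager.log files under the YARN container logs that contain the Chunker error
    grep -rl --include=taskmanager.log "ERROR fdb.helpers.Chunker" /var/log/arkin/hadoop-yarn/containers/

    # print the matching entries (with timestamps) from a specific file
    grep "ERROR fdb.helpers.Chunker" /var/log/arkin/hadoop-yarn/containers/application_XXXXXXXXXXXXX_0002/container_e64_XXXXXXXXXXXXX_0002_01_000002/taskmanager.log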

 

 

Environment

Aria Operations for Networks 6.13.0
Aria Operations for Networks 6.14.0

Cause

When the Platform nodes were restored from snapshots, the Indexer Service was not able to get the value for the key TimedConfigIndexer:BookmarkToken:127, and as a result the Indexer Lag increases daily.


Resolution

A workaround is available for this issue if you see the symptoms mentioned in the Issue/Introduction section.

Contact Broadcom Support by opening a support case to obtain assistance with the increasing Indexer Lag. Generate and upload a support bundle for each Platform node.

 For more information, see Creating and managing Broadcom support cases.