After logging in to the VCF Operations for Networks GUI and selecting Settings --> Infrastructure and Updates, the Indexer lag under System Health shows as high.
You may see an error message such as the one below, or you may just see a value that you know to be considerably higher than is typical for your environment.
The error message text is "Indexer Service" followed by "Recent data is still being indexed. Search results might be inaccurate." followed by "Resolution: Wait for indexer to catch up and this error to clear. If this persists for more than 12 hours, contact support."
As a double-check, SSH into the Platform1 node with the support user, switch to the Ubuntu user with the ub command, and then invoke the rdb tool to check the Indexer status with the indexer_status command.
ub
rdb
arkin> indexer_status
The Indexer Status shows a high lag in seconds, which corresponds to what the GUI shows (perhaps in a different unit of measurement, such as minutes, hours, or even days).
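To compare the seconds value from indexer_status with a GUI value shown in larger units, a small helper can do the conversion. This is a minimal sketch: the function name lag_human is hypothetical and is not part of the product or the rdb tool, and you pass in the seconds value by hand.

```shell
# lag_human SECONDS
# Print a lag given in seconds as days, hours, and minutes.
# Hypothetical helper for illustration; not shipped with the appliance.
lag_human() {
  days=$(( $1 / 86400 ))
  hours=$(( ($1 % 86400) / 3600 ))
  mins=$(( ($1 % 3600) / 60 ))
  printf '%dd %dh %dm\n' "$days" "$hours" "$mins"
}

lag_human 90000   # prints: 1d 1h 0m
```

For example, a reported lag of 90000 seconds is a little over one day, which may appear in the GUI as roughly 25 hours.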
As an additional check, when you check the service statuses as the ubuntu user (after running ub) with the ./run_all.sh sudo /home/ubuntu/check-service-health.sh -p -d command, you observe:
FlinkContainer is running and healthy.
When checking the hosts file, located at /etc/hosts, you see platform-infra on one of the lines:
When checking the file deployment-set.info located at /home/ubuntu/build-target/deployment/ , there is an incorrect "platform-infra," entry prior to the other existing Platforms, such as below:
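A quick way to confirm whether either file mentions platform-infra is a grep wrapped in a small function. This is a sketch under assumptions: the function name has_platform_infra is hypothetical, and the sample line used in the demonstration is an assumed layout (a "platform-infra," entry ahead of the other Platforms), written to a scratch file rather than read from a real appliance.

```shell
# has_platform_infra FILE
# Succeeds (exit status 0) if FILE contains the string "platform-infra".
# Hypothetical helper for illustration only.
has_platform_infra() {
  grep -q 'platform-infra' "$1"
}

# Demonstration on a scratch file; the sample line is an assumed layout.
sample=$(mktemp)
printf 'platform-infra,platform1,platform2,platform3\n' > "$sample"
if has_platform_infra "$sample"; then
  echo "platform-infra entry found"
fi
rm -f "$sample"
```

On a Platform node you could point the same function at /etc/hosts or at /home/ubuntu/build-target/deployment/deployment-set.info. It only reads the file and changes nothing.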
VCF Operations for Networks
The cause of this issue is still being investigated.
Note: It is recommended to take snapshots before making any changes. Please see "Best practices to shutdown Aria Operations for Networks Clustered deployments".
STEPS:
-rw-r--r-- # root root ### ### ## #### /etc/hosts
sudo chmod o+w /etc/hosts
sudo chmod o-w /etc/hosts
cp /home/ubuntu/build-target/deployment/deployment-set.info /home/ubuntu/build-target/deployment/deployment-set.info.before.kb.edit
platform1,platform2,platform3
./run_all.sh sudo cat /etc/hosts | grep localhost
--platform1--
127.0.0.1 localhost aria-networks-platform
127.0.0.1 aria-networks-platform
127.0.0.1 localhost aria-networks-platform
###.###.###.### platform1
###.###.###.### platform2
###.###.###.### platform3
--platform2--
127.0.0.1 localhost aria-networks-platform
127.0.0.1 aria-networks-platform
127.0.0.1 localhost aria-networks-platform
###.###.###.### platform1
###.###.###.### platform2
###.###.###.### platform3
--platform3--
127.0.0.1 localhost aria-networks-platform
127.0.0.1 aria-networks-platform
127.0.0.1 localhost aria-networks-platform
###.###.###.### platform1
###.###.###.### platform2
###.###.###.### platform3
./run_all.sh sudo cat /home/ubuntu/build-target/deployment/deployment-set.info
--platform1--
platform1,platform2,platform3
--platform2--
platform1,platform2,platform3
--platform3--
platform1,platform2,platform3
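Rather than reading the per-node output by eye, you can filter it for any node that still shows a platform-infra entry. The sketch below assumes the output format shown above, where each node's section starts with a header line such as --platform1--. The function name nodes_with_platform_infra is hypothetical, and the demonstration feeds it sample text instead of live output.

```shell
# nodes_with_platform_infra
# Read run_all.sh-style output on stdin and print the name of each node
# whose section contains "platform-infra" (once per matching line).
# Hypothetical helper for illustration only.
nodes_with_platform_infra() {
  awk '
    /^--.*--$/ { node = $0; gsub(/^--|--$/, "", node); next }
    /platform-infra/ { print node }
  '
}

# Demonstration with sample text: platform1 still has the stale entry.
printf '%s\n' \
  '--platform1--' 'platform-infra,platform1,platform2,platform3' \
  '--platform2--' 'platform1,platform2,platform3' |
  nodes_with_platform_infra   # prints: platform1
```

On the appliance, the same filter could be placed at the end of the ./run_all.sh sudo cat commands above. No output means no node reported a platform-infra entry.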
4. Restart the Flink service on each Platform Node (there is no "restart"; run "stop" first, then "start"):
./run_all.sh sudo systemctl stop flinkjobs.service
./run_all.sh sudo systemctl start flinkjobs.service
5. Verify that the services on each Platform Node are running and healthy:
./run_all.sh sudo /home/ubuntu/check-service-health.sh -p -d
At this point, you should begin to see the indexer lag gradually return to more typical values.
You may have to wait hours, or even days, depending on how large the lag was when you began this procedure.