vRealize Network Insight 6.3 and above shows "Realtime Stream Processing has failed" error on Platform Nodes in GUI
Article ID: 324475
Products
VMware Aria Operations for Networks
Issue/Introduction
This article describes the steps to resolve the error "Realtime Stream Processing has failed" in the vRNI UI.
Symptoms: In the vRNI GUI, under Settings > Infrastructure and Support > Overview and Updates, you see the error "Realtime Stream Processing has failed" for one or more platform nodes.
Starting from vRNI version 6.3, some events, e.g. events coming from an NSX-T datasource, are delivered in near real time. To achieve this, vRNI uses a special mechanism to handle these events, which requires that all vRNI components remain healthy. However, some components, e.g. Foundation DB and Kafka, may become unhealthy temporarily. There is a mechanism in place to account for such temporary conditions and recover from them automatically. This recovery mechanism sometimes ends up in a deadlock, and therefore the error never disappears.
Resolution
0. If no NSX-T datasource has been added to the vRNI instance, the error can be ignored completely.
1. Make backups of all vRNI platform nodes following the vRNI documentation.
2. SSH to the vRNI platform node as the user support.
3. Execute the command ub to switch to the user ubuntu:
support@platform1:~$ ub
ubuntu@platform1:~$
4. List the files that contain the property enableFastPath. Its value has to be changed from true to false in each of them.
grep -ni "enablefastpath" /home/ubuntu/build-target/restapilayer/policy/* -C 1
In this example, there is only one file to modify:
/home/ubuntu/build-target/restapilayer/policy/default-system.configuration
The value has to be changed in two places, on lines 389 and 397. If the command lists more than one file, make the change in all of them.
5. Change the property enableFastPath from true to false with these two vi commands:
vi +389 /home/ubuntu/build-target/restapilayer/policy/default-system.configuration
vi +397 /home/ubuntu/build-target/restapilayer/policy/default-system.configuration
6. Confirm that the value was changed from true to false with:
grep -ni "enablefastpath" /home/ubuntu/build-target/restapilayer/policy/* -C 1
7. Restart the service restapilayer-service by stopping and then starting it:
ubuntu@platform1:~$ sudo service restapilayer-service stop
ubuntu@platform1:~$ sudo service restapilayer-service start
Steps 2 to 7 have to be repeated on all platform nodes. Steps 8 onward have to be performed only on the platform nodes where you see the alert "Realtime Stream Processing has failed".
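Before editing the files found in step 4, it can be useful to keep a copy of each one so that the change is easy to revert. The following is a minimal sketch, not part of the official procedure: the function name backup_fastpath_files and the backup directory are hypothetical, and the script assumes only that the affected files contain the string enableFastPath (matched case-insensitively, as in step 4).

```shell
#!/bin/sh
# Sketch (hypothetical helper, not shipped with vRNI): copy every file in a
# policy directory that mentions enableFastPath into a separate backup
# directory. Both directories are supplied by the caller.
backup_fastpath_files() {
    bf_policy_dir="$1"
    bf_backup_dir="$2"
    mkdir -p "$bf_backup_dir" || return 1
    # -i: case-insensitive match, -l: print only the names of matching files.
    grep -il -- "enablefastpath" "$bf_policy_dir"/* 2>/dev/null |
    while IFS= read -r bf_file; do
        # -p preserves mode and timestamps of the original file.
        cp -p -- "$bf_file" "$bf_backup_dir/" && printf 'backed up %s\n' "$bf_file"
    done
}
```

It could be called as backup_fastpath_files /home/ubuntu/build-target/restapilayer/policy "$HOME/fastpath-backup", where the second path is only an example. The copies are written outside the policy directory on purpose, so that the grep in step 6 does not also report the unmodified backups.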
8. Execute the following command:
java -cp .:build-target/common-utils/tools-0.001-SNAPSHOT.jar com.vnera.tools.ServiceHealth print | grep -B 10 true | grep -A 10 STREAM_PROCESSING_FAILED
It should output a node id.
9. Execute the following command to clear the alert, replacing <nodeId> with the value found in step 8:
java -cp .:build-target/common-utils/tools-0.001-SNAPSHOT.jar com.vnera.tools.ServiceHealth clear --errorCode STREAM_PROCESSING_FAILED --serviceId "" --serviceType SYSTEM --nodeId <nodeId>
At this point the alert should be cleared in the UI.
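When step 9 is repeated on several platform nodes, it is easy to paste the command with the <nodeId> placeholder still in it. The sketch below assembles the command from step 9 for a given node id and prints it for review, without executing anything. The function name build_clear_command is hypothetical; the command line itself is taken unchanged from step 9.

```shell
#!/bin/sh
# Sketch (hypothetical helper): print the step 9 command for a given node id.
# Nothing is executed; the printed line is meant to be reviewed and then run
# by hand as the ubuntu user.
build_clear_command() {
    bc_node_id="$1"
    # Refuse an empty value or the literal placeholder from the article.
    if [ -z "$bc_node_id" ] || [ "$bc_node_id" = "<nodeId>" ]; then
        echo "error: pass the node id reported in step 8" >&2
        return 1
    fi
    printf '%s\n' "java -cp .:build-target/common-utils/tools-0.001-SNAPSHOT.jar com.vnera.tools.ServiceHealth clear --errorCode STREAM_PROCESSING_FAILED --serviceId \"\" --serviceType SYSTEM --nodeId $bc_node_id"
}
```

For example, build_clear_command followed by the node id from step 8 prints the complete command with that id in place of <nodeId>.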
Additional Information
Impact/Risks:
As of vRNI versions 6.3.0, 6.4.0, 6.5.0 and 6.5.1, the special mechanism handles only events involving NSX-T datasources. As long as the error is present in the UI, those events will not be delivered to Email/SNMP targets. The events remain available in the UI, but the notifications are not sent.