UIM robot can send data to the hub, but the hub cannot access the robot's GUI configuration

Article ID: 188093

Products

DX Unified Infrastructure Management (Nimsoft / UIM)

Issue/Introduction

A network firewall sits between the hub and the destination Linux server on which the robot (agent) is installed. The firewall has been opened between the two and the robot has been installed on the destination server, but the robot suddenly disappears from the Infrastructure Manager (IM) GUI after about 10 minutes, even though its data is still being updated in UMP.

A restart of the robot service is required to make the robot visible in IM again.

firewalld, iptables, and SELinux are not enabled on the target Linux server.

In addition, the robot configuration cannot be accessed from the IM GUI, and the following error appears when opening any probe GUI on this robot:

Controller probe (Unable to reach controller: communication error) 

Environment

Release : 9.2.0

Component : UIM - ROBOT

Resolution


1) In robot.cfg, the autoremove parameter was enabled:
autoremove = yes

With this setting, a single communication error causes the robot to be removed from the hub.

Set it to "no", or remove the key, so that the robot remains visible in IM even when a communication error occurs.


2) To improve the resiliency of communication, consider tuning the timers below.

Increasing the robot_status_check_interval and robot_failover_count parameters changes the (unexpected) robot failover behavior.

As a starting point, set the following increased values in robot.cfg on the target Linux server (add the keys if they are missing), and then restart the robot.

For robot/hub versions later than 7.91:
 
robot_status_check_interval = 120
robot_failover_count = 5
reuse_async_session = 1
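
The "add these in robot.cfg if missing" step can be sketched in Python as follows. This is only an illustration, not a supported tool: it assumes the keys live in a flat `<controller> ... </controller>` section of robot.cfg, and the function name `upsert_controller_keys` and the sample `hub = examplehub` line are hypothetical. Back up robot.cfg before changing it, and restart the robot afterwards.

```python
import re

# Values taken from this article. The <controller> section layout is an assumption.
RECOMMENDED = {
    "robot_status_check_interval": "120",
    "robot_failover_count": "5",
    "reuse_async_session": "1",
}


def upsert_controller_keys(cfg_text, settings):
    """Set each key inside the <controller> section, adding it if missing."""
    out = []
    in_controller = False
    seen = set()
    for line in cfg_text.splitlines():
        stripped = line.strip()
        if stripped == "<controller>":
            in_controller = True
            out.append(line)
            continue
        if stripped == "</controller>" and in_controller:
            # Append any keys that were not already present in the section.
            for key, value in settings.items():
                if key not in seen:
                    out.append(f"   {key} = {value}")
            in_controller = False
            out.append(line)
            continue
        if in_controller:
            match = re.match(r"^(\s*)([A-Za-z0-9_]+)\s*=", line)
            if match and match.group(2) in settings:
                # Overwrite the existing value, keeping the original indentation.
                key = match.group(2)
                out.append(f"{match.group(1)}{key} = {settings[key]}")
                seen.add(key)
                continue
        out.append(line)
    return "\n".join(out)


# Hypothetical sample content, for illustration only.
sample = """<controller>
   hub = examplehub
   robot_failover_count = 2
</controller>"""

updated = upsert_controller_keys(sample, RECOMMENDED)
print(updated)
```

Running the function a second time on its own output leaves the text unchanged, so it is safe to apply repeatedly.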
 

robot_status_check_interval
Robot failover detection prevents unnecessary robot failovers when the hub misses a single response to a robot request (_status, alive, robotup, or probelist).
The robot is more tolerant of incomplete _status requests. Two properties, robot_status_check_interval and robot_failover_count, can be specified in robot.cfg.
robot_status_check_interval controls how frequently the robot polls the hub with a _status request. Previously, this was not configurable, and the polling occurred at every medium timeout, approximately every 11 seconds. Use this property to specify the approximate new polling interval.
The interval is approximate because the status check occurs on the first medium timeout that exceeds the requested interval. For example, if robot_status_check_interval is set to 30 seconds, polling occurs every 33 seconds (3 x 11-second medium timeout). The default value is 60 seconds, which equates to a status check every 66 seconds (6 x 11 seconds).
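
The rounding described above can be reproduced with a small Python calculation. It is a sketch, not the controller's actual logic: it assumes the medium timeout is exactly 11 seconds and that the requested interval is rounded up to a whole number of timeouts. How a value that is already an exact multiple of 11 is handled is not stated here, so the sketch leaves such values unchanged. The function name is hypothetical.

```python
import math

MEDIUM_TIMEOUT_SECONDS = 11  # approximate value stated in this article


def effective_status_interval(requested_seconds):
    """Round the requested interval up to a whole number of medium timeouts."""
    ticks = max(1, math.ceil(requested_seconds / MEDIUM_TIMEOUT_SECONDS))
    return ticks * MEDIUM_TIMEOUT_SECONDS


for requested in (30, 60, 120):
    print(requested, "->", effective_status_interval(requested))
```

Under these assumptions, 30 and 60 give the 33 and 66 seconds quoted above, and the suggested value of 120 works out to about 121 seconds.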

robot_failover_count
robot_failover_count improves the resiliency of the robot. Previously, the robot failover mechanism was invoked as soon as one _status request to the hub failed.
Use this property to specify the number of consecutive _status failures that must occur before the robot initiates failover to a secondary hub. The default is 2.
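
The "consecutive failures" rule can be illustrated with a short Python sketch. It models only the counting behavior described above, on the assumption that any successful _status request resets the failure streak. The function name and the sample history are hypothetical and do not reflect the robot's internal implementation.

```python
def polls_until_failover(status_results, failover_count=2):
    """Return the 1-based poll number at which failover would start, or None.

    status_results is an iterable of booleans: True for a successful _status
    request, False for a failed one.
    """
    consecutive_failures = 0
    for index, ok in enumerate(status_results, start=1):
        if ok:
            consecutive_failures = 0  # any success resets the streak
        else:
            consecutive_failures += 1
            if consecutive_failures >= failover_count:
                return index
    return None


# Hypothetical history: one isolated failure, then a sustained outage.
history = [True, False, True, False, False, False, False, False]
print(polls_until_failover(history, failover_count=2))
print(polls_until_failover(history, failover_count=5))
```

In this model the isolated failure at the second poll never triggers failover, and raising the count from 2 to 5 delays failover during the sustained outage by three further polls.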
 
reuse_async_session

This parameter is available by default in robot 7.96 and higher.

It addresses an issue in which the probe_config_get callback fails every other time. To implement the fix, add the key reuse_async_session = 1 to the controller configuration (robot.cfg).

Additional Information

Controller release notes
https://techdocs.broadcom.com/content/broadcom/techdocs/us/en/ca-enterprise-software/it-operations-management/ca-unified-infrastructure-management-probes/GA/alphabetical-probe-articles/controller/controller-release-notes.html

Controller probe (Unable to reach controller: communication error)
https://community.broadcom.com/communities/community-home/digestviewer/viewthread?MID=741206

KB: Linux robot communication error - Unable to reach controller, error message communication error
https://knowledge.broadcom.com/external/article?articleId=34342

KB: Cannot deploy probes to a robot
https://ca-broadcom.wolkenservicedesk.com/external/article?articleId=34339

KB: Robot controller communication problems after server updated, Domain updated
https://ca-broadcom.wolkenservicedesk.com/external/article?articleId=5107

KB: error upon trying to open the controller
https://ca-broadcom.wolkenservicedesk.com/external/article?articleId=34328

KB: Robot connectivity issues on Redhat Linux 7.1
https://ca-broadcom.wolkenservicedesk.com/external/article?articleId=35304

KB: How to configure a robot that is behind NAT so it can talk to its hub
https://ca-broadcom.wolkenservicedesk.com/external/article?articleId=107934