Our customer has a primary hub and a few secondary hubs. After upgrading to 23.4 CU1, the customer started updating the robots by dropping robot_update onto the hubs in IM, which updates the robots attached to that hub. For the hubs with fewer robots this works fine. For the hub with over 900 robots it works only partially: 123 robots were updated correctly, and the rest are no longer listed under the hub in IM. The controller log of such a robot shows these error messages:
Controller: failed to send alive (async) to hub HUBXXX(<ip_address>) - not found
Controller: failed to send alive to hub HUBXXX(<ip_address>) - not found
We restored a backup of the robot.sds file on the secondary hub and restarted. After this, all robots are listed under the hub in IM again, but many of them are still at robot version 9.39 or older.
Is this a known problem? I cannot reproduce it because I don't have such a large number of robots.
Workaround:
1. Enable the Groups node in IM
https://knowledge.broadcom.com/external/article/10343/infrastructure-manager-groups-node-how-t.html
2. Set up a batch of roughly 100 robots in an Infrastructure Group. The safe number within IM when using distsrv to deploy robot_update is under 100 updates at a time.
3. Deploy the robot_update
4. Repeat the process until completed.
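To plan the batches before creating the Infrastructure Groups, the robot list can be split up front. The following is only a minimal planning sketch, not part of the product: the robot names are made up, `batch_robots` is a hypothetical helper, and the batch size of 99 is one reading of "under 100". The groups themselves still have to be created and deployed to in IM as described above.

```python
from typing import Iterator, List

# Assumption: "under 100 updates at a time" is read as at most 99 per batch.
MAX_BATCH = 99


def batch_robots(robots: List[str], batch_size: int = MAX_BATCH) -> Iterator[List[str]]:
    """Yield consecutive slices of the robot list, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(robots), batch_size):
        yield robots[start:start + batch_size]


# Hypothetical robot names standing in for the robots attached to the large hub.
robots = [f"robot{n:04d}" for n in range(1, 901)]

batches = list(batch_robots(robots))
for number, batch in enumerate(batches, start=1):
    # Each batch corresponds to one Infrastructure Group / one deployment round.
    print(f"group {number}: {len(batch)} robots ({batch[0]} .. {batch[-1]})")
```

With 900 robots and a batch size of 99, this gives ten rounds, the last one with only nine robots.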
If older robots, such as version 7.80, fail to accept the update, you should see errors in the distsrv log when distsrv is set to loglevel 5 and logsize 200000.