On a three-tier network, we have configured redundant tunnels at the third tier: a third-tier tunnel client hub has tunnels back to two different tunnel servers at the second tier, for High Availability purposes.
If one of the tunnel servers goes down, the third-tier hub sometimes turns red when viewed from the primary hub in Admin Console or Infrastructure Manager, even though it is still connected to the infrastructure through the other tunnel.
It takes at least 40 minutes (sometimes longer) for the route to be updated, and for the third-tier hub to be seen as green/available again, even though it is still transmitting data up the alternate tunnel the whole time.
Is there a way to make the route update more quickly?
This has to do with the way routes are calculated in UIM. When two different tunnel servers report that they have a route to the same client, the primary hub accepts the information on a "first come, first served" basis: whichever tunnel server submits its report first is treated as the 'correct' route to reach that client.
Hubs submit their hublists to each other on a regular basis - every 10 minutes by default - but when these updates come in from hubs that are at the same "proximity" (number of hops) away from the receiving hub, they are ignored if the hubs on the list are already known to the receiving hub.
A route will not be "replaced" unless the new route comes from a hub that is closer in proximity than the hub that reported the original route. In this way UIM always tries to find the shortest route between hubs, but when two routes are equivalent, no replacement is made.
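The acceptance rule described above can be sketched in a few lines of Python. This is only an illustration of the behavior as described in this article, not UIM's actual implementation; the names `Route`, `accept_update`, and the hub names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Route:
    via: str        # hub that reported the route (hypothetical name)
    proximity: int  # number of hops from the receiving hub


def accept_update(known, target, reported):
    """Store the reported route only if the target is unknown or the
    reported route is strictly closer. Returns True if it was stored."""
    current = known.get(target)
    if current is None or reported.proximity < current.proximity:
        known[target] = reported
        return True
    return False  # equal (or worse) proximity: the update is ignored


routes = {}
# First report wins.
first = accept_update(routes, "tier3-hub", Route("tunnel-server-A", 2))
# Same proximity from the other tunnel server: ignored.
second = accept_update(routes, "tier3-hub", Route("tunnel-server-B", 2))
```

In this sketch the route via `tunnel-server-A` stays in place even if that server later goes down, because the equivalent route via `tunnel-server-B` never replaces it. The route only changes once the entry for `tier3-hub` has been removed and is then re-added as "new".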
In order for the tunnel route to be updated, two events have to occur, and these events can take time depending on the order in which they occur:
1. The hub in question has to be seen as 'inactive' for a certain amount of time - by default, 30 minutes - at which point it is removed from the hublist;
2. After removal, another hub with a valid route needs to submit a hublist update, at which time the removed hub will be seen as "new" and added back to the list with the new route. This usually happens within 10 minutes.
Additionally, hubs are checked for inactivity only every 15 minutes - meaning that it can take around 45 minutes in some cases for a hub to be recognized as inactive even though the limit is technically 30 minutes.
Due to the way these timers interact, the route update can take up to an hour in most environments. The average seems to be about 40 minutes, but it can take even longer in larger environments with a large number of competing hublist updates.
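As a rough illustration of how the three default timers add up, the sketch below models the inactivity check and the hublist update as simple periodic events with fixed offsets. This is a simplified model built only from the defaults quoted in this article (30, 15, and 10 minutes); the function name and the offset parameters are hypothetical, and real environments may behave differently, for example when hublist updates are delayed.

```python
def recovery_time(check_offset, hublist_offset,
                  inactive_limit=30, check_interval=15, hublist_interval=10):
    """Minutes from the moment the hub was last seen on its old route until
    it is re-added with the new route, under the simplified periodic model.

    check_offset   -- minutes until the next inactivity check (0..14)
    hublist_offset -- minutes until the next hublist update (0..9)
    """
    # First inactivity check that falls at or after the inactivity limit.
    checks_needed = max(0, -(-(inactive_limit - check_offset) // check_interval))
    removed_at = check_offset + checks_needed * check_interval

    # First hublist update from the alternate route at or after removal.
    updates_needed = max(0, -(-(removed_at - hublist_offset) // hublist_interval))
    return hublist_offset + updates_needed * hublist_interval


# Sweep every whole-minute combination of timer offsets.
times = [recovery_time(c, h) for c in range(15) for h in range(10)]
best, worst = min(times), max(times)
```

Under these assumptions, `best` is 30 minutes and `worst` is 53 minutes, which is consistent with the "up to an hour" figure above. Any additional delay in processing hublist updates would push the real number higher.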
This is working as designed.