This scenario was first noticed when the data_engine queue turned yellow and started building up.
No data was making it into the database, and the data_engine probe sent an error reporting that its queue was building up.
In the data_engine GUI, we see 0 msg/min (0 msg/sec) and the 'test connection' button fails.
We tested telnet from the primary server to the database server over port 1433.
This fails when the UIM service is running on the primary, but succeeds when the service is disabled.
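Where telnet is not installed, the same reachability check can be sketched in Python. This is a minimal sketch, not part of the original troubleshooting steps: the function name can_connect is hypothetical, and the host name in the usage note is a placeholder for your own database server.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP handshake to host:port completes within timeout."""
    try:
        # create_connection performs the full three-way handshake;
        # the socket is closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, can_connect("dbserver.example.com", 1433) run on the primary server, once with the UIM service running and once with it disabled, should mirror the telnet results described above.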
The netstat command (netstat -ano) shows the connection in the SYN_SENT state (a SYN was sent but never acknowledged):
TCP 192.168.0.1:50994 192.168.0.1:1433 SYN_SENT 3288
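To see how many connections are sitting in each state, the netstat output can be tallied. The following is a rough sketch that assumes the five-column TCP row layout shown above (protocol, local address, remote address, state, PID); the function names are hypothetical.

```python
import subprocess
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP connections per state in `netstat -ano` text output."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows look like: TCP <local> <remote> <state> <pid>
        # UDP rows have no state column, so they have fewer fields and are skipped.
        if len(fields) == 5 and fields[0].upper() == "TCP":
            states[fields[3]] += 1
    return states


def run_netstat():
    """Run `netstat -ano` and return its text output."""
    result = subprocess.run(["netstat", "-ano"], capture_output=True, text=True)
    return result.stdout
```

Calling count_tcp_states(run_netstat()) on the primary server gives a per-state count. A large total, or many entries in SYN_SENT or TIME_WAIT, would be consistent with the exhaustion described in this article.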
This problem turned out to be caused by TCP exhaustion: the primary server had reached a connection limit threshold.
The data_engine could not complete the TCP handshake and remained stuck at SYN_SENT.
After raising the values of the corresponding registry keys, we see data_engine connect successfully.
Set/create the following key/value pairs in the registry at HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters:
MaxUserPort = 65534
TcpTimedWaitDelay = 30
TcpNumConnections = 16777214
TcpMaxDataRetransmissions = 5
These values will vary by environment; you may need to test different values to find what works in yours.
Note that the highest possible value for MaxUserPort is 65534. Setting it to the maximum value may have unexpected results.
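The registry values listed above can also be compared against the current settings, and applied, from a script. This is a hedged sketch using Python's standard winreg module (Windows only). The helper names are hypothetical, and the values are simply the ones from this article, which you may need to adjust. Writing to the registry requires administrator rights, and it is prudent to back up the key first.

```python
TCPIP_PARAMS = r"SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"

# Values taken from the list above; tune them for your environment.
DESIRED = {
    "MaxUserPort": 65534,
    "TcpTimedWaitDelay": 30,
    "TcpNumConnections": 16777214,
    "TcpMaxDataRetransmissions": 5,
}


def pending_changes(current, desired=DESIRED):
    """Return the desired values that are missing from or differ in current."""
    return {k: v for k, v in desired.items() if current.get(k) != v}


def read_current(names=DESIRED):
    """Read the existing values; names not present are omitted. Windows only."""
    import winreg  # imported here so the module still loads on other platforms
    current = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, TCPIP_PARAMS) as key:
        for name in names:
            try:
                current[name], _type = winreg.QueryValueEx(key, name)
            except FileNotFoundError:
                pass  # value has not been created yet
    return current


def apply_changes(changes):
    """Create or update DWORD values under the Tcpip Parameters key. Windows only."""
    import winreg
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, TCPIP_PARAMS, 0,
                        winreg.KEY_SET_VALUE) as key:
        for name, value in changes.items():
            winreg.SetValueEx(key, name, 0, winreg.REG_DWORD, value)
```

A cautious way to use it is to print pending_changes(read_current()) first, review the result, and only then pass it to apply_changes. Changes to these TCP/IP parameters may not take effect until the server is restarted.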