Devices polled and managed via SPECTRUM SDM / SDC show an intermittent condition while the SDM/SDC communication hangs
The Secure Domain Manager (SDM) model and the SDConnector (SDC) models do not show events, with an exception generated, while all devices that SPECTRUM manages directly are polled successfully.
A sniffer trace taken on the SDC instance shows no ICMP or SNMP polling traffic to the network devices.
The cause is a lower-layer network configuration issue: the MTU is set too high for the network path, which results in IP fragmentation errors.
Enabling trace/debug by adding the option "debug" to $SPECROOT/SDM/sdm.config results in messages similar to the following in sdmLog.log:
Wed Sep 23 10:38:45 2009 : WARNING: SdmEtpkiConnectEndpoint run() ssock_handshake error. IP=172.16.2.102, Port=6844, Thread=99
Wed Sep 23 10:38:50 2009 : WARNING: SdmEtpkiConnectEndpoint run() ssock_handshake error. IP=172.16.2.102, Port=6844, Thread=99
Wed Sep 23 10:38:55 2009 : WARNING: SdmEtpkiConnectEndpoint run() ssock_handshake error. IP=172.16.2.102, Port=6844, Thread=99
Wed Sep 23 10:39:00 2009 : WARNING: SdmEtpkiConnectEndpoint run() ssock_handshake error. IP=172.16.2.102, Port=6844, Thread=99
Wed Oct 7 15:37:39 2009 : WARNING: Endpoint 172.16.2.102 is shutting down after keepalive timeout
Wed Oct 7 15:37:39 2009 : WARNING: SdmEtpkiEndpoint::shutdownSocket() starting. IP=172.16.2.102, Thread=211
Wed Oct 7 15:37:40 2009 : WARNING: socket closed. IP=172.16.2.102, Thread=211
Wed Oct 7 15:37:40 2009 : WARNING: socket is invalid. IP=172.16.2.102, Thread=211
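To see at a glance which SDC endpoints are affected, the warnings above can be tallied per IP address. The following is a minimal sketch, not part of the product: the function name summarize_sdm_log is hypothetical, and the two regular expressions assume the log lines look exactly like the sample messages shown above.

```python
import re
from collections import Counter

# Patterns are assumptions derived from the sample sdmLog.log lines above.
HANDSHAKE_RE = re.compile(r"ssock_handshake error\. IP=([\d.]+), Port=(\d+)")
KEEPALIVE_RE = re.compile(
    r"Endpoint ([\d.]+) is shutting down after keepalive timeout"
)


def summarize_sdm_log(lines):
    """Count handshake errors and keepalive timeouts per endpoint IP.

    Hypothetical helper: returns two Counters keyed by IP address.
    """
    handshake_errors = Counter()
    keepalive_timeouts = Counter()
    for line in lines:
        match = HANDSHAKE_RE.search(line)
        if match:
            handshake_errors[match.group(1)] += 1
            continue
        match = KEEPALIVE_RE.search(line)
        if match:
            keepalive_timeouts[match.group(1)] += 1
    return handshake_errors, keepalive_timeouts
```

The function accepts any iterable of lines, so an open file handle on sdmLog.log can be passed directly. A steadily growing handshake error count for a single IP points at the path to that SDC rather than at the SDM itself.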
On the SpectroSERVER machine, netstat -s shows clean IP statistics, with no IP fragmentation, reassembly, or lost fragments. On the remote SDC machine, however, netstat -s shows large values in the IP fragmentation counters.
Also, with a valid SDM/SDC configuration, a TCP session to the SDM/SDC service port 6844 is established successfully once the SDM is started and the SDC is available. However, a successfully established TCP session is NOT proof of fully operational TCP communication: the handshake uses only small packets, so it can complete even when larger packets fail because of fragmentation problems.
For example, netstat -s shows the fragmentation counters as follows (the output format differs between operating systems):
IPv4 ipForwarding = 2 ipDefaultTTL = 255
ipInReceives = 7236012 ipInHdrErrors = 0
ipReasmOKs = 0 ipReasmFails = 0
ipReasmDuplicates = 0 ipReasmPartDups = 0
ipFragOKs = 0 ipFragFails = 0
ipFragCreates = 0 ipRoutingDiscards = 0
Received Header Errors = 0
Received Address Errors = 11392
Received Packets Discarded = 46
Received Packets Delivered = 1375099
Routing Discards = 0
Reassembly Required = 50
Reassembly Successful = 25
Reassembly Failures = 0
Datagrams Successfully Fragmented = 14
Datagrams Failing Fragmentation = 0
Fragments Created = 28
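Comparing these counters between the SpectroSERVER and the SDC machine is easier when they are extracted programmatically. The sketch below is an illustration under stated assumptions: it only understands the "name = value" layout shown in the sample output above (other netstat variants print the value first and would need a different pattern), and the function name fragmentation_counters is hypothetical.

```python
import re

# Matches "name = value" pairs, including two pairs on one line and
# multi-word names such as "Reassembly Required".
PAIR_RE = re.compile(r"([A-Za-z][A-Za-z ]*?)\s*=\s*(\d+)")


def fragmentation_counters(netstat_output):
    """Return the fragmentation and reassembly counters as a dict.

    Hypothetical helper: keeps only counters whose name contains
    "frag" or "reas" (case-insensitive).
    """
    counters = {}
    for name, value in PAIR_RE.findall(netstat_output):
        name = name.strip()
        lowered = name.lower()
        if "frag" in lowered or "reas" in lowered:
            counters[name] = int(value)
    return counters


def nonzero(counters):
    """Keep only the counters that have actually incremented."""
    return {name: value for name, value in counters.items() if value > 0}
```

Running nonzero(fragmentation_counters(text)) on the output captured from each machine makes the asymmetry described above visible: an empty result on one side and non-zero fragmentation counters on the other.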
Spectrum 9.4.x, 10.x
In this case, lowering the MTU to 1400 bytes fixes the SDM/SDC hang condition.
Reduce the IP MTU size on both the SpectroSERVER and the SDC machines to 1400 bytes. The Ethernet default of 1500 bytes can cause trouble when VPN/IPsec tunnels add encapsulation overhead.
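The arithmetic behind this recommendation can be sketched as follows. The header sizes are the standard minimum IPv4 and TCP header lengths without options. The tunnel overhead of 60 bytes used in the usage note is purely an assumed illustrative figure, because the real VPN/IPsec overhead depends on the tunnel mode and ciphers in use. Both function names are hypothetical.

```python
IPV4_HEADER_BYTES = 20  # minimum IPv4 header, no options
TCP_HEADER_BYTES = 20   # minimum TCP header, no options


def tcp_mss(mtu):
    """Largest TCP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES


def fits_in_tunnel(host_mtu, link_mtu, tunnel_overhead):
    """True if a full-size host packet plus the tunnel encapsulation
    still fits within the MTU of the underlying link."""
    return host_mtu + tunnel_overhead <= link_mtu
```

With an assumed overhead of 60 bytes on a 1500-byte Ethernet link, fits_in_tunnel(1500, 1500, 60) is False, so full-size packets must be fragmented or are dropped, while fits_in_tunnel(1400, 1500, 60) is True. Lowering the MTU to 1400 bytes reduces the TCP payload per packet from tcp_mss(1500) = 1460 bytes to tcp_mss(1400) = 1360 bytes.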
(Legacy KB ID CNC TS33514 )