Description:
ISSUE
MTP does not collect any data and the nqMetricReader process fails with the following message:
20:58:52 FetchProbeIdFromIPAddresses: web service call failed with status 28,
url: http://99.99.99.99/SuperAgentInternal/SAConfigWS.asmx
20:58:52 hr_test failure 0xffffffff at saConfigInterface.cpp 122
20:58:52 hr_test failure 0xffffffff at saConfigInterface.cpp 490
20:58:52 hr_test failure 0xffffffff at saConfigInterface.cpp 612
20:58:52 hr_test failure 0xffffffff at SACollector.cpp 287
20:58:52 ********** nqMetricReader shutting down **********
20:58:52 nqMetricReader - collector program did not initialize (xffffffff),
will retry in 60 seconds.
DETAILS
The "web service call failed with status 28" message is a SOAP TCP error: the FetchProbeIdFromIPAddresses()
web service method on MTP cannot reach the SA Console because of what it sees as a network connectivity issue.
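To confirm that a log shows this particular failure, the status code can be pulled out of the "web service call failed" lines. The following is only a minimal sketch and not part of the product: the function name failed_ws_status is made up for illustration, and because the log file location is not given here, the log text is read from standard input.

```shell
#!/bin/bash
# failed_ws_status (hypothetical helper): read log text on stdin and print the
# numeric status from every "web service call failed with status N" line.
failed_ws_status() {
    sed -n 's/.*web service call failed with status \([0-9][0-9]*\).*/\1/p'
}

# Example using the log line quoted in this article; prints 28.
echo '20:58:52 FetchProbeIdFromIPAddresses: web service call failed with status 28,' | failed_ws_status
```

To scan a whole file, redirect it into the function, for example failed_ws_status < LOGFILE, where LOGFILE stands for wherever the nqMetricReader log is kept on your system.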
Solution:
TROUBLESHOOTING
The Windows Application Event Logs can be very helpful for seeing what is happening with the Console Web Services. The NetQoS logs rarely contain anything useful for this issue.
This type of issue can be either:
- A network connectivity issue from MTP -> SA Console
The SOAP status code "28" reported on the MTP side makes this the most likely cause.
- The IIS Web Services on the SA Console (SuperAgentInternal) are not working properly.
Again, errors of interest would likely be in the Windows Application Event Logs.
Basic Web Service connectivity can also be tested from the command line on MTP, as described below.
These tests determine whether the "SuperAgentInternal/SAConfigWS" web service is available from MTP, which is required for nqmetricd startup.
From the MTP command line, logged in as "netqos", and assuming the SA Console IP is "99.99.99.99":
- To test if the HTTP Port 80 is reachable at the SA Console:
telnet 99.99.99.99 80
- A successful TCP connect would return:
Trying 99.99.99.99...
Connected to <hostname> (99.99.99.99).
Escape character is '^]'.
Connection closed by foreign host.
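To disconnect, press CTRL-C to end the telnet session. If that does not work, press CTRL-] (the escape character shown above) and enter "quit" at the telnet> prompt.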
- If the Console IP is reachable, but Port 80 is not available:
Trying 99.99.99.99...
telnet: connect to address 99.99.99.99: Connection refused
telnet: Unable to connect to remote host: Connection refused
- If the Console IP is not reachable, it may hang for a few minutes after "Trying":
Trying 99.99.99.99...
telnet: connect to address 99.99.99.99: Connection timed out
telnet: Unable to connect to remote host: Connection timed out
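The telnet check above can also be scripted so that it never hangs for minutes. This is a rough sketch under some assumptions: it relies on bash's /dev/tcp redirection and on the coreutils timeout command, either of which may be missing on an older MTP build. The function names and the 5-second default are illustrative choices only.

```shell
#!/bin/bash
# classify_connect STATUS (hypothetical helper): map the exit status of the
# connect attempt below to one of the three outcomes described for telnet.
classify_connect() {
    case $1 in
        0)   echo connected ;;         # TCP handshake completed
        124) echo timeout ;;           # timeout(1) gave up waiting
        *)   echo refused_or_other ;;  # failed quickly, e.g. connection refused
    esac
}

# check_port HOST PORT [SECONDS] (hypothetical helper): try one TCP connect.
check_port() {
    local host=$1 port=$2 secs=${3:-5}
    # bash opens a TCP socket when a redirection targets /dev/tcp/HOST/PORT
    timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
    classify_connect $?
}

# Example (not run here): check_port 99.99.99.99 80
```

A result of "timeout" corresponds to the "Connection timed out" case, and "refused_or_other" covers "Connection refused" along with any other immediate failure.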
- To determine if the Console "SuperAgentInternal/SAConfigWS" web service is reachable, use "wget",
a command-line HTTP download utility on Linux:
sudo wget -v http://99.99.99.99/SuperAgentInternal/SAConfigWS.asmx?wsdl
- A successful response would be:
--12:17:03-- http://99.99.99.99/SuperAgentInternal/SAConfigWS.asmx?wsdl
Connecting to 99.99.99.99:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 35687 (35K) [text/xml]
Saving to: `SAConfigWS.asmx?wsdl'
100%[=======================================>] 35,687 --.-K/s in 0.03s
12:17:03 (1.10 MB/s) - `SAConfigWS.asmx?wsdl' saved [35687/35687]
- On failure, the command again hangs for a while before each attempt times out:
--12:21:03-- http://99.99.99.99/SuperAgentInternal/SAConfigWS.asmx?wsdl
Connecting to 99.99.99.99:80... failed: Connection timed out.
Retrying.
--12:24:13-- (try: 2) http://99.99.99.99/SuperAgentInternal/SAConfigWS.asmx?wsdl
wget retries a few times, but you can press CTRL-C to exit at any time.
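To avoid the long hang and the retries, the same request can be made with an explicit timeout and a single attempt. The sketch below is an illustration only: the function names are made up, the 10-second timeout is an arbitrary choice, and it reports only success or failure rather than interpreting specific wget exit codes, which differ between wget versions.

```shell
#!/bin/bash
# wsdl_url HOST (hypothetical helper): build the WSDL URL used in this article
# for the given SA Console address.
wsdl_url() {
    echo "http://$1/SuperAgentInternal/SAConfigWS.asmx?wsdl"
}

# check_wsdl HOST (hypothetical helper): fetch the WSDL once, wait at most
# 10 seconds, and discard the body instead of saving a file.
check_wsdl() {
    if wget -q -T 10 -t 1 -O /dev/null "$(wsdl_url "$1")"; then
        echo "web service reachable"
    else
        echo "web service NOT reachable (wget exit status $?)"
    fi
}

# Example (not run here): check_wsdl 99.99.99.99
```

Because the response is written to /dev/null, no "SAConfigWS.asmx?wsdl" file is left behind in the current directory.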