We are seeing 503 errors while accessing the UI and while the engine is talking to the app. Please investigate.
Release :
Component : BLAZEMETER PERFORMANCE TESTING
The bzt logs show that poll requests occasionally return a 503 - Service Unavailable error. Not every poll request fails, which points to an intermittent, network-related issue rather than a persistent outage.
For example:
[2020-12-04 10:44:42,508 DEBUG Engine.cloud-utils] Got command from server: {'cmd': None}
[2020-12-04 10:44:42,508 DEBUG Engine.cloud-utils] Long polling, request commands from (sleep 1): https://blazemeter.<domain>.
[2020-12-04 10:44:42,508 DEBUG Engine.cloud-utils] Request: POST https://blazemeter.<domain>.
"token": "
}
[2020-12-04 10:45:23,311 DEBUG urllib3.connectionpool] https://blazemeter.<domain>.
[2020-12-04 10:45:23,312 DEBUG Engine.cloud-utils] Response [200]: {
"cmd": null
}
[2020-12-04 10:45:23,312 DEBUG Engine.cloud-utils] Got command from server: {'cmd': None}
[2020-12-04 10:45:23,312 DEBUG Engine.cloud-utils] Long polling, request commands from (sleep 1): https://blazemeter.<domain>.
[2020-12-04 10:45:23,312 DEBUG Engine.cloud-utils] Request: POST https://blazemeter.<domain>.
"token": "
}
[2020-12-04 10:45:28,774 DEBUG urllib3.connectionpool] https://blazemeter.<domain>.
[2020-12-04 10:45:28,775 DEBUG Engine.cloud-utils] Response [503]: upstream connect error or disconnect/reset before headers. reset reason: connection termination
[2020-12-04 10:45:28,776 INFO Engine.cloud-utils] Failed long polling request (sleep 1): API call error https://blazemeter.<domain>.
Traceback (most recent call last):
File "/usr/local/taurus-cloud/
result = json.loads(resp) if len(resp) else {}
File "/usr/lib/python3.6/json/__
return _default_decoder.decode(s)
File "/usr/lib/python3.6/json/
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.6/json/
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/taurus-cloud/
res = self._request(self.url, data={'token': self.session_token})
File "/usr/local/taurus-cloud/
raise TaurusNetworkError("API call error %s: %s %s" % (url, response.status_code, response.reason))
bzt.TaurusNetworkError: API call error https://blazemeter.<domain>.
[2020-12-04 10:45:29,779 DEBUG Engine.cloud-utils] Long polling, request commands from (sleep 1.5): https://blazemeter.<domain>.
[2020-12-04 10:45:29,779 DEBUG Engine.cloud-utils] Request: POST https://blazemeter.<domain>.
"token": "
}
[2020-12-04 10:46:10,667 DEBUG urllib3.connectionpool] https://blazemeter.<domain>.
[2020-12-04 10:46:10,668 DEBUG Engine.cloud-utils] Response [200]: {
"cmd": null
}
The occasional 503 errors can have several causes, such as network interruptions, the backend service (which bzm-app is pinging/polling) being too busy to respond to the request, or requests timing out before the reply is received.
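To get a picture of how often the polls fail and how long each one waits for a reply, the log can be summarized with a short script. The following is a minimal sketch, not a Broadcom tool: it assumes the log lines follow the format shown in the excerpt above (timestamp, level, logger name, then a "Request: POST" or "Response [code]" message), and the function name summarize_polls is made up for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed line layout, taken from the excerpt above:
# [2020-12-04 10:45:28,775 DEBUG Engine.cloud-utils] Response [503]: ...
LINE_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \w+ ([\w.\-]+)\] (.*)$"
)
RESPONSE_RE = re.compile(r"^Response \[(\d{3})\]")


def summarize_polls(lines):
    """Return (status_counts, waits) for the poll requests found in the log.

    waits is a list of (status_code, seconds_between_request_and_response).
    """
    counts = Counter()
    waits = []
    request_time = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # traceback lines, JSON bodies and other continuation lines
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
        message = match.group(3)
        if message.startswith("Request: POST"):
            request_time = stamp
            continue
        response = RESPONSE_RE.match(message)
        if response and request_time is not None:
            status = int(response.group(1))
            counts[status] += 1
            waits.append((status, (stamp - request_time).total_seconds()))
            request_time = None
    return counts, waits
```

Calling summarize_polls(open("bzt.log")) (the file path is an assumption) returns the number of responses per status code and the wait time for each poll. In the excerpt above, the two successful polls return after about 41 seconds, while the 503 comes back after about 5.5 seconds.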
Broadcom Support recommends that you work with your Kubernetes admin and a network admin to review how your OPL/Private Cloud environment is configured. Specifically, check how the timeout settings for the bzm-app and/or the Kubernetes ingress are configured. Try increasing the timeout to see if that helps. Note that we are not able to recommend how much to increase the timeout value.
A packet capture from both ends of the communication, showing the polls going out and the replies being sent and received, will help determine exactly what is causing the 503 errors. If the requests are occasionally timing out, the capture may also help determine an appropriate timeout value.
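When reviewing a capture, it helps to know exactly when each failed poll was sent and answered. The sketch below is illustrative only: it assumes the same log line format as the excerpt above and that the clocks on the capture hosts and the engine host are synchronized. The function name failure_windows and the padding value are hypothetical choices, not part of any BlazeMeter or Taurus tooling.

```python
import re
from datetime import datetime, timedelta

# Timestamp prefix as it appears in the log excerpt, e.g. [2020-12-04 10:45:28,775 ...
STAMP_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) ")


def failure_windows(lines, status="503", padding_seconds=2.0):
    """Return (start, end) datetimes around each poll answered with `status`.

    Each window runs from the request timestamp to the response timestamp,
    widened by `padding_seconds` on both sides.
    """
    pad = timedelta(seconds=padding_seconds)
    windows = []
    request_time = None
    for line in lines:
        match = STAMP_RE.match(line.strip())
        if not match:
            continue  # continuation lines carry no timestamp
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
        if "] Request: POST" in line:
            request_time = stamp
        elif "] Response [" in line and request_time is not None:
            if "] Response [%s]" % status in line:
                windows.append((request_time - pad, stamp + pad))
            request_time = None
    return windows
```

The returned windows can be used to narrow the capture to the seconds around each 503 instead of reading through every poll.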