Users may find that the workload overview page in the TAP Developer Portal loads slowly. This issue has several possible causes.
This article describes one of them: requests from the portal to one or more Run clusters time out.
When investigating a performance issue with the TAP Developer Portal, we usually collect the following artifacts first.
TAP GUI Backstage server log. Collect it with the following command:
$ kubectl logs deployment/server -n tap-gui --all-containers
When the slowness is caused by a particular Run cluster, the Envoy proxy pod log shows 504 Gateway Timeout errors for requests to that cluster.
$ grep "/api/kubernetes/proxy/apis/carto.run/v1alpha1/deliverables" ../envoy-proxy-pod-log.txt | grep 504
"GET /api/kubernetes/proxy/apis/carto.run/v1alpha1/deliverables HTTP/2" 504 UT 0 24 14999 - ...
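Here is a breakdown of the above log message, assuming Envoy's default access log format:
GET /api/kubernetes/proxy/apis/carto.run/v1alpha1/deliverables HTTP/2: the request method, path, and protocol.
504: the HTTP response code (Gateway Timeout).
UT: the Envoy response flag for an upstream request timeout.
0 and 24: the bytes received and bytes sent. The 24 bytes sent are the "upstream request timeout" response body.
14999: the request duration in milliseconds, roughly 15 seconds.
-: the upstream service time, which is not set because no upstream response was received.
The HAR file captured from the browser shows the same 504 error, similar to the following.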
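The fields of such a log line can also be extracted with a short script, which helps when the log contains many entries. The following is a minimal sketch, assuming the proxy writes Envoy's default access log format; the function names `parse_access_log` and `is_upstream_timeout` are hypothetical and not part of any TAP tooling.

```python
import re

# Assumption: Envoy's default access log format, whose leading fields are
# "METHOD PATH PROTOCOL" CODE FLAGS BYTES_RECEIVED BYTES_SENT DURATION_MS UPSTREAM_TIME
ACCESS_LOG = re.compile(
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<flags>\S+) '
    r'(?P<bytes_received>\d+) (?P<bytes_sent>\d+) '
    r'(?P<duration_ms>\d+) (?P<upstream_time>\S+)'
)


def parse_access_log(line):
    """Return the leading access log fields as a dict, or None if the line does not match."""
    match = ACCESS_LOG.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields so they can be compared and summed.
    for key in ("status", "bytes_received", "bytes_sent", "duration_ms"):
        fields[key] = int(fields[key])
    return fields


def is_upstream_timeout(fields):
    """True for a 504 whose response flags include UT (upstream request timeout)."""
    return fields["status"] == 504 and "UT" in fields["flags"].split(",")


# The log line shown above, used as sample input.
sample = ('"GET /api/kubernetes/proxy/apis/carto.run/v1alpha1/deliverables HTTP/2" '
          '504 UT 0 24 14999 - ...')
fields = parse_access_log(sample)
print(fields["path"], fields["status"], fields["flags"], fields["duration_ms"])
print("upstream timeout:", is_upstream_timeout(fields))
```

Applying `parse_access_log` to every line of the collected log and keeping the entries for which `is_upstream_timeout` is true gives a quick view of which API paths time out and after how many milliseconds.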
{
  "_connectionId": "1713",
  "_initiator": {
    "type": "script",
    "stack": {
      "callFrames": [
        {
          "functionName": "proxy",
          "scriptId": "82",
          "url": "https://TAP-GUI-FQDN/static/module-backstage.cf5c2313.js",
          "lineNumber": 62,
          "columnNumber": 37808
        }
      ],
      "parent": {
        "description": "await",
        ...
  },
  "_priority": "High",
  "_resourceType": "fetch",
  "cache": {},
  "connection": "443",
  "pageref": "page_1",
  "request": {
    "method": "GET",
    "url": "https://TAP-GUI-FQDN/api/kubernetes/proxy/apis/carto.run/v1alpha1/workloads",
    "httpVersion": "http/2.0",
    "headers": [
      {
        "name": ":authority",
        "value": "TAP-GUI-FQDN"
      },
      {
        "name": ":method",
        "value": "GET"
      },
      {
        "name": ":path",
        "value": "/api/kubernetes/proxy/apis/carto.run/v1alpha1/workloads"
      },
      ...
      {
        "name": "backstage-kubernetes-cluster",
        "value": "PROBLEMATIC-RUN-CLUSTER"
      },
      {
        "name": "priority",
        "value": "u=1, i"
      },
      ...
  "response": {
    "status": 504,
    "statusText": "",
    "httpVersion": "http/2.0",
    "headers": [
      {
        "name": "content-length",
        "value": "24"
      },
      {
        "name": "content-type",
        "value": "text/plain"
      },
      {
        "name": "date",
        "value": "Wed, 29 Jan 2026 05:53:53 GMT"
      },
      {
        "name": "server",
        "value": "envoy"
      }
    ],
    "cookies": [],
    "content": {
      "size": 24,
      "mimeType": "text/plain",
      "text": "upstream request timeout"
      ...
PROBLEMATIC-RUN-CLUSTER, the value of the backstage-kubernetes-cluster request header, is the name of the Run cluster that is the likely cause of the issue.
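When the HAR file contains many requests, the failing cluster can be picked out with a short script rather than by reading the JSON. The following is a minimal sketch, assuming a standard HAR file with entries under `log.entries`; the function names `find_gateway_timeouts` and `load_har` are hypothetical, and the inline sample stands in for a real capture.

```python
import json

# Request header that names the target cluster, as seen in the HAR excerpt above.
CLUSTER_HEADER = "backstage-kubernetes-cluster"


def find_gateway_timeouts(har):
    """Return (cluster, url) pairs for every HAR entry whose response status is 504."""
    results = []
    for entry in har.get("log", {}).get("entries", []):
        if entry.get("response", {}).get("status") != 504:
            continue
        request = entry.get("request", {})
        # HAR stores headers as a list of {"name": ..., "value": ...} objects.
        headers = {h["name"].lower(): h["value"] for h in request.get("headers", [])}
        cluster = headers.get(CLUSTER_HEADER, "<no cluster header>")
        results.append((cluster, request.get("url", "")))
    return results


def load_har(path):
    """Read a HAR file exported from the browser's developer tools."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Small inline sample shaped like the excerpt above (values are placeholders).
sample_har = {
    "log": {
        "entries": [
            {
                "request": {
                    "url": "https://TAP-GUI-FQDN/api/kubernetes/proxy/apis/carto.run/v1alpha1/workloads",
                    "headers": [
                        {"name": "backstage-kubernetes-cluster", "value": "PROBLEMATIC-RUN-CLUSTER"}
                    ],
                },
                "response": {"status": 504},
            },
            {
                "request": {
                    "url": "https://TAP-GUI-FQDN/api/kubernetes/proxy/apis/carto.run/v1alpha1/workloads",
                    "headers": [
                        {"name": "backstage-kubernetes-cluster", "value": "HEALTHY-RUN-CLUSTER"}
                    ],
                },
                "response": {"status": 200},
            },
        ]
    }
}

for cluster, url in find_gateway_timeouts(sample_har):
    print(cluster, url)
```

For a real capture, pass the result of `load_har` with the path of the exported file (for example a hypothetical `tap-gui.har`) to `find_gateway_timeouts` instead of the inline sample.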
To give a slow Run cluster more time to respond, the following overlay sets the response timeout on the tap-gui HTTPProxy route to 60s and the idle timeout to 120s.
# Add the following secret to the tap-install namespace
---
apiVersion: v1
kind: Secret
metadata:
  name: tap-gui-timeout-overlay
  namespace: tap-install
stringData:
  tap-gui-timeout-overlay.yaml: |
    #@ load("@ytt:overlay", "overlay")
    #@overlay/match by=overlay.subset({"kind": "HTTPProxy", "metadata": {"name": "tap-gui"}}), expects="0+"
    ---
    spec:
      routes:
        #@overlay/match by=overlay.index(0)
        #@overlay/replace
        - services:
            - name: server
              port: 7000
          timeoutPolicy:
            response: "60s"
            idle: "120s"
# Add this to the TAP values file
package_overlays:
- name: tap-gui
  secrets:
  - name: tap-gui-timeout-overlay