NSX-T Load Balancer Returns "502 Bad Gateway" due to higher Header Size

Article ID: 434755

Products

VMware NSX

Issue/Introduction

  • Clients are unable to access websites or applications hosted behind an NSX-T Native Load Balancer.
  • The browser displays an HTTP 502 Bad Gateway error.
  • The issue may be persistent or intermittent during periods of high traffic.
  • SSH into the active Edge node (edge-248 in this example) as root and check the logs under /var/log/lb/<lb-uuid>/logs/

  • The error log shows repeated 502 responses:
    less error.log | grep -i " 502 Bad Gateway"

    <timestamp> [debug] 669188#0: *11066929 HTTP/1.1 502 Bad Gateway
    <timestamp> [debug] 669188#0: *11066933 HTTP/1.1 502 Bad Gateway
    <timestamp> [debug] 669188#0: *11066949 HTTP/1.1 502 Bad Gateway
    <timestamp> [debug] 669188#0: *11066929 HTTP/1.1 502 Bad Gateway
  • The full log entry in error.log shows a large HttpResponse.Size ('174265'):

    <timestamps> lb 669188 access [INFO] [21319142-####-####-####-7729867055d3][7f031050-####-####-####-e2857a550392] Operation.Category: 'LbAccessLog', Operation.Type: 'Http', Lb.UUID: '21319142-####-####-####-7729867055d3', Lb.Name: 'LBService - Edge-<name>-1ebe31ea-####-####-####-42e5483ef4ef', Vs.UUID: '7f031050-####-####-####-e2857a550392', Vs.Name: 'Edge-<name>-1ebe31ea-####-####-####-42e5483ef4ef - SIT_WWW_8021', Vs.Ip: '##.##.95.217', Vs.Port: '443', Pool.UUID: '5ffa839a-####-####-####-e4cc68793fa9', Pool.Name: 'Edge-<name>-1ebe31ea-####-####-####-42e5483ef4ef -SIT_WWW_8021', PoolMember.Ip: '##.##.95.63', PoolMember.Port: '8021', Client.Ip: '##.##.20.88', Client.Port: '62800', Snat.Ip: '##.##.95.59', Snat.Port: '36312', HttpRequest.Method: 'GET', HttpRequest.UserAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36 Edg/107.0.1418.62', HttpRequest.X-Fwd-For: '-', HttpRequest.Uri: '/olacp//assets/img/branding/NUS%20logo%20full%20colour%20[Converted]-01.png', HttpRequest.Host: '<web-page-address>', HttpResponse.Status: '200', HttpResponse.StatusCategory: '2xx', HttpResponse.Size: '174265', HttpResponse.ServerTime: '0.014', HttpResponse.TotalTime: '0.014', Error.Reason: '-'

     

  • The following command shows the HttpResponse.Size values sent to the NSX LB:
    less error.log | grep -i HttpResponse.Size | awk '{print $1,$71,$72}' | sort -rh | head -n 10

    <timestamps> HttpResponse.Size: '174265',
    <timestamps> HttpResponse.Size: '913795',
    <timestamps> HttpResponse.Size: '288102',

     

  • The error log also reports that the upstream (backend) response header is too big:
    less error.log | grep "upstream sent too big header while reading response header from upstream"

    <timestamps> [error] 669188#0: *11066929 upstream sent too big header while reading response header from upstream, client: <client-ip>, server: , request: "POST /olacp/auth HTTP/1.1", upstream: "https://<ip-address>:<port>/olacp/auth", host: "<web-page-address>", referrer: "https://<url>/"
    <timestamps> [error] 669188#0: *11066933 upstream sent too big header while reading response header from upstream, client: <client-ip>, server: , request: "POST /olacp/auth HTTP/1.1", upstream: "https://<ip-address>:<port>/olacp/auth", host: "<web-page-address>", referrer: "https://<url>/"
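
The awk command above selects fields by position ($71, $72), which shifts whenever a value such as the User-Agent contains a different number of spaces. A minimal sketch of a name-based alternative follows, assuming the access-log lines have the format shown above. The function name top_response_sizes is hypothetical and not part of NSX.

```shell
# Hypothetical helper (not an NSX tool): read LB access-log lines on stdin
# and print "<size> <uri>" for the N largest HttpResponse.Size values.
top_response_sizes() {
  count="${1:-10}"   # number of entries to print, default 10
  # Extract the URI and size by field name instead of by position,
  # then sort numerically in descending order.
  sed -n "s/.*HttpRequest\.Uri: '\([^']*\)'.*HttpResponse\.Size: '\([0-9][0-9]*\)'.*/\2 \1/p" \
    | sort -rn \
    | head -n "$count"
}
```

On the Edge node this could be run as `top_response_sizes 10 < error.log`. Lines without both fields are skipped.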

     

Environment

NSX Load Balancer
VMware NSX

Cause

The 502 Bad Gateway error in NSX-T L7 Load Balancers is typically caused by the backend server sending an HTTP response header that exceeds the default 4096-byte limit.
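
To judge how far the backend exceeds the limit, it can help to measure the size of the response headers it actually sends. The sketch below is a rough estimate only: it assumes curl is available on a host that can reach the pool member directly (bypassing the LB), and the function name response_header_bytes is hypothetical.

```shell
# Hypothetical helper: count the bytes of an HTTP response header block
# read from stdin, from the status line through the terminating blank line.
response_header_bytes() {
  # awk strips the trailing newline of each line, so add 1 per line;
  # a carriage return, if present, stays in $0 and is counted by length().
  awk '{ total += length($0) + 1 } /^\r?$/ { exit } END { print total }'
}

# Example (placeholders, not run here): dump only the response headers of
# the request that triggers the 502 and count them.
#   curl -sk -D - -o /dev/null "https://<ip-address>:<port>/olacp/auth" | response_header_bytes
```

A result above 4096 is consistent with the cause described above and gives a lower bound for the new response header size.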

Resolution

To resolve this issue, increase the LB's response header size limit to fit the HTTP response headers sent by the application:

  • If the LB entity was created using Policy, create an HTTP application profile in the Policy UI with a larger response header size. The maximum supported value is 65536.
    • Path: Networking --> Load Balancer --> Profiles --> Add Application Profile --> HTTP.
    • Apply this HTTP profile to the VIP on the "Virtual Servers" page.
  • If the LB entity was created at MP (either via the UI or by NCP), create an HTTP profile in the Policy UI and attach the newly created profile to the VIP in the MP UI.
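
The profile can in principle also be created through the Policy API instead of the UI. The sketch below is an assumption, not taken from this article: it presumes the Policy API accepts an LBHttpProfile object with a response_header_size field under /policy/api/v1/infra/lb-app-profiles/, which should be verified against the API guide for the NSX version in use. The function name and the NSX_MANAGER and PROFILE_ID variables are hypothetical placeholders.

```shell
# Hypothetical helper: build the JSON body for an HTTP application profile
# with a larger response header size. The field names are assumptions to be
# checked against the NSX Policy API reference.
build_http_profile_body() {
  size="$1"
  # Accept only a plain decimal integer without leading zeros.
  case "$size" in
    ''|0*|*[!0-9]*)
      echo "response header size must be an integer between 1 and 65536" >&2
      return 1 ;;
  esac
  # 65536 is the maximum supported value stated in this article.
  if [ "${#size}" -gt 5 ] || [ "$size" -gt 65536 ]; then
    echo "response header size must be an integer between 1 and 65536" >&2
    return 1
  fi
  printf '{"resource_type": "LBHttpProfile", "response_header_size": %s}\n' "$size"
}

# Example (placeholders, not run here):
#   curl -k -u admin -X PATCH \
#     -H 'Content-Type: application/json' \
#     -d "$(build_http_profile_body 16384)" \
#     "https://${NSX_MANAGER}/policy/api/v1/infra/lb-app-profiles/${PROFILE_ID}"
```

The profile still has to be attached to the VIP afterwards, as described in the steps above.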

Alternatively, changing the L7 LB to an L4 LB also gets the affected application working. This works because the custom Application Profiles, and with them the response header size limit, are no longer applied.

Note: This is an application-level issue caused by response header sizes changing after application upgrades. It is not limited to NCP.