Server pool in Degraded state after upgrading to NSX 4.2.x (SSL or TCP Handshake Timeout error as per Edge logs)



Article ID: 395184


Updated On:

Products

VMware NSX

Issue/Introduction

After upgrading NSX to 4.2.1.3, the NSX native load balancer shows the following symptoms:

- Virtual server in Degraded state

- Server pool in Degraded state

- The pool members are up and running, but TCP connections for requests sent to the pool members fail

- The Edge logs show SSL or TCP handshake timeout errors:

/var/log/syslog
28850:2025-04-13T03:20:14.409Z NSX 26784 LOAD-BALANCER [nsx@6876 comp="nsx-edge" subcomp="lb" s2comp="lb" level="WARN"] [######-######-#######] Operation.Category: 'LbEvent', Operation.Type: 'StatusChange', Obj.Type: 'PoolMember', Obj.Ip: '#.#.#.#', Obj.Port: '443', Pool.UUID: '######-######-#######', Pool.Name: 'test-server-pool', Lb.UUID: '######-######-#######', Lb.Name: 'Test-nsxt-lb', Vs.UUID: '######-######-#######', Vs.Name: 'test-https', Status.NewStatus: 'Down', Status.Msg: 'SSL Handshake Timeout'

1939:2025-04-13T07:05:15.703Z NSX 154168 LOAD-BALANCER [nsx@6876 comp="nsx-edge" subcomp="lb" s2comp="lb" level="WARN"] [######-######-#######] Operation.Category: 'LbEvent', Operation.Type: 'StatusChange', Obj.Type: 'PoolMember', Obj.Ip: '#.#.#.#', Obj.Port: '443', Pool.UUID: '######-######-#######', Pool.Name: 'test-server-pool', Lb.UUID: '######-######-#######', Lb.Name: 'Test-nsx-lb', Vs.UUID: '######-######-#######', Vs.Name: 'test-https', Status.NewStatus: 'Down', Status.Msg: 'TCP Handshake Timeout'
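
To list the affected pool members quickly, the matching events can be filtered out of the Edge syslog. The following is a minimal sketch that assumes the LbEvent line format shown in the excerpts above. The function name `lb_handshake_timeouts` is hypothetical and not an NSX tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of NSX): print "<ip>:<port> <pool> <message>"
# for every pool member that went Down with an SSL or TCP handshake timeout.
# Assumes the LbEvent line format shown in the log excerpts above.
lb_handshake_timeouts() {
    local logfile="${1:-/var/log/syslog}"
    # Keep only the handshake timeout events, then extract the fields of interest.
    grep -E "Status\.Msg: '(SSL|TCP) Handshake Timeout'" "$logfile" |
        sed -E "s/.*Obj\.Ip: '([^']*)', Obj\.Port: '([^']*)'.*Pool\.Name: '([^']*)'.*Status\.Msg: '([^']*)'.*/\1:\2 \3 \4/"
}
```

For example, `lb_handshake_timeouts /var/log/syslog | sort | uniq -c` gives a count per pool member and message.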

Environment

VMware NSX

Cause

An SSL handshake timeout on the NSX load balancer typically means that a secure TLS/SSL connection could not be established between a client and the load balancer. When TLS is involved, a handshake timeout is often caused by a TLS version mismatch: the client and the server do not support a common TLS protocol version. This happens when the client attempts to use an older TLS version that the NSX load balancer does not support, or vice versa. In this case the environment negotiates TLS 1.1, but TLS 1.1 is disabled by default in NSX 4.2, so the handshake fails with timeouts.
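
One way to check which TLS versions an endpoint accepts is to attempt a handshake with a single forced version. The following is a rough sketch using `openssl s_client`; the function name `probe_tls` is hypothetical. It assumes the local OpenSSL build still permits the requested version, since some builds disable TLS 1.1 themselves. In that case a failure says nothing about the remote endpoint.

```shell
#!/usr/bin/env bash
# Hypothetical helper: attempt a handshake with exactly one TLS version.
# Usage: probe_tls <host> <port> <tls1_1|tls1_2|tls1_3>
probe_tls() {
    if [ "$#" -ne 3 ]; then
        echo "usage: probe_tls <host> <port> <tls1_1|tls1_2|tls1_3>" >&2
        return 2
    fi
    local host="$1" port="$2" version="$3"
    # </dev/null makes s_client exit right after the handshake attempt.
    if openssl s_client -connect "${host}:${port}" "-${version}" </dev/null >/dev/null 2>&1; then
        echo "${version}: handshake OK"
    else
        echo "${version}: handshake failed"
        return 1
    fi
}
```

Running `probe_tls <pool-member-ip> 443 tls1_1` and then the same command with `tls1_2` shows whether the endpoint accepts only the older version.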

Resolution

Method 1: Using Postman

TLS v1.1 can be enabled with an API call.

1. Run the following GET API to read the configuration of the NSX API service:
   GET https://<NSX-Manager-IP>/api/v1/cluster/api-service
   The API response contains the list of cipher suites and TLS protocols.

2. Enable the TLS 1.1 protocol.
    In the response body, set enabled to true for TLSv1.1.
    Run the following PUT API, sending the complete body from step 1 with this change applied, to update the NSX API service:
    PUT https://<NSX-Manager-IP>/api/v1/cluster/api-service

 
Method 2: Using curl

Alternatively, issue the same API calls from a command line with curl.
1. Save the body of the GET call to a file:
curl -k -X GET -u admin https://[NSX-Manager-IP]/api/v1/cluster/api-service > filename.json

2. Edit the file to enable TLSv1.1, then send it with a PUT:
curl -k -u admin -X PUT -H "Content-Type: application/json" -H "X-Allow-Overwrite: true" -d "@path/to/body/file" https://[NSX-Manager-IP]/api/v1/cluster/api-service

3. The response should show the body with the changes you made.
Confirm the change by running the GET API call from step 1 again.
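
The edit in step 2 can also be scripted instead of made by hand. This is a minimal sketch, assuming `jq` is installed and that the GET response lists the protocols in a `protocol_versions` array of objects with `name` and `enabled` fields. Check your own GET response to confirm that layout before relying on it. The function name `enable_tls_protocol` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a copy of the saved api-service body with one
# TLS protocol enabled. Assumes a "protocol_versions" array of objects with
# "name" and "enabled" fields; all other fields are passed through unchanged.
enable_tls_protocol() {
    local infile="$1" protocol="${2:-TLSv1.1}"
    jq --arg p "$protocol" \
       '(.protocol_versions[] | select(.name == $p) | .enabled) = true' \
       "$infile"
}
```

For example, `enable_tls_protocol filename.json > body.json` writes the modified body to a new file, which can then be passed to the PUT command with `-d "@body.json"`.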

After TLS v1.1 is enabled, the timeouts no longer occur and the server pool and virtual server return to the Success state.