NSX-T Search stops functioning after upgrading from 3.2.2 to 4.2.1.1

Article ID: 384779

Products

VMware NSX

Issue/Introduction

NSX-T Data Center 3.2.2.* has been upgraded to NSX 4.0.x, 4.1.x, or 4.2.x.

The upgrade completes successfully, but search in the UI does not work because nothing is indexed.
The following error is logged in /var/log/search/opensearch.log:

2024-12-21T18:13:34.004Z ERROR SearchHealthCheck SearchServiceWatchDog 77041 - [nsx@6876 comp="nsx-manager" errorCode="MP60524" level="ERROR" subcomp="manager"] [Search: WatchDog] Could not connect to OpenSearch java.net.ConnectException: Connection refused
at org.opensearch.client.RestClient.extractAndWrapCause(RestClient.java:954) ~[?:?]
at org.opensearch.client.RestClient.performRequest(RestClient.java:333) ~[?:?]
at org.opensearch.client.RestClient.performRequest(RestClient.java:321) ~[?:?]
at org.opensearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1918) ~[?:?]
at org.opensearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1901) ~[?:?]
at org.opensearch.client.RestHighLevelClient.ping(RestHighLevelClient.java:688) ~[?:?]
at com.vmware.nsx.management.search.manager.SearchServiceWatchDog.isOpenSearchHealthy(SearchServiceWatchDog.java:185) ~[?:?]
at com.vmware.nsx.management.search.manager.SearchServiceWatchDog.run(SearchServiceWatchDog.java:85) ~[?:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[?:?]
at java.util.concurrent.FutureTask.runAndReset(Unknown Source) ~[?:?]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
at com.vmware.nsx.util.concurrent.Executors$MeteredRunnable.run(Executors.java:353) [nsx-util.jar:?]
at java.lang.Thread.run(Unknown Source) [?:?]
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source) ~[?:?]
at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvent(DefaultConnectingIOReactor.java:171) ~[?:?]
at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:145) ~[?:?]
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:351) ~[?:?]
at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:221) ~[?:?]
at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64) ~[?:?]
... 1 more
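
To confirm that a manager is hitting this symptom, the log can be searched for the watchdog message shown in the excerpt above. The following is a minimal sketch, not an official diagnostic: the function name is hypothetical, and it assumes the message text appears in the log exactly as quoted in this article.

```shell
#!/bin/sh
# Hypothetical helper: count log lines in which the search watchdog reports
# that it could not reach OpenSearch. The message text is copied from the
# log excerpt in this article.
count_opensearch_connect_errors() {
    log_file="$1"   # e.g. /var/log/search/opensearch.log
    # grep -c prints the number of matching lines (0 when there are none).
    grep -c 'Could not connect to OpenSearch' "$log_file"
}

# Example (run on an NSX Manager):
#   count_opensearch_connect_errors /var/log/search/opensearch.log
```

A non-zero count after the upgrade suggests the watchdog cannot connect to OpenSearch on that node; the count alone does not identify the cause.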

Environment

VMware NSX-T Data Center
VMware NSX

Cause

During the upgrade, the Elasticsearch JVM does not stop in a timely manner and is terminated only after the upgrade process has already tried to start OpenSearch. As a result, ownership of the /nonconfig/search folder is not changed from the elasticsearch user to the nsx-search user, and OpenSearch fails to start.
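
Whether a manager is affected can be checked by looking for entries under /nonconfig/search that are not owned by nsx-search. The sketch below is one way to do that; the function name is hypothetical, and it assumes GNU find (for -printf) is available on the appliance.

```shell
#!/bin/sh
# Hypothetical helper: list entries below a directory whose owner differs
# from the expected user. Relies on GNU find's -printf, which prints the
# owner name (%u) followed by the path (%p) for each entry.
list_wrong_owner() {
    dir="$1"        # e.g. /nonconfig/search
    expected="$2"   # e.g. nsx-search
    find "$dir" -mindepth 1 -printf '%u %p\n' | awk -v u="$expected" '$1 != u'
}

# Example (run as root on an NSX Manager):
#   list_wrong_owner /nonconfig/search nsx-search
```

Empty output means everything below the directory already belongs to the expected user; any line that starts with elasticsearch matches the condition described above.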

Resolution

Workaround:
This workaround needs to be applied on all NSX Managers in the impacted cluster:

  1. SSH to each NSX Manager as root.
  2. Change directory to /nonconfig/search:

    cd /nonconfig/search

  3. List content of the directory:

    ls -l

    Note: The output will be similar to:

    # ls -l
    drwxr-x--- 3 elasticsearch elasticsearch 4096 Apr 25  2023 nodes
    drwxr-xr-x 2 nsx-search    nsx-search    4096 Jan  4 13:13 tmp

  4. Change the ownership of the content in /nonconfig/search to owner and group "nsx-search" on all NSX Managers in the cluster:

    chown -R nsx-search:nsx-search nodes

  5. Start the search service:

    systemctl start search

  6. Switch to the admin user:

    su admin

  7. Run the commands below (the first one only if you are on version 4.2.0 or newer):

    start search resync recovery
    start search resync all

  8. The NSX Managers re-index for a few minutes, depending on the environment size, after which you can search via the UI again.
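
Steps 2 to 5 above could be combined into a small script that is run as root on each manager. This is only a sketch under stated assumptions: the function name and the SEARCH_DIR, SEARCH_USER, and SEARCH_GROUP variables are hypothetical, with defaults taken from the paths and names in this article. The resync commands in step 7 are NSX CLI commands and are left to be run manually as admin.

```shell
#!/bin/sh
# Sketch of steps 2-5 for a single NSX Manager; run as root.
# Variable names are illustrative; defaults come from this article.
SEARCH_DIR="${SEARCH_DIR:-/nonconfig/search}"
SEARCH_USER="${SEARCH_USER:-nsx-search}"
SEARCH_GROUP="${SEARCH_GROUP:-nsx-search}"

fix_search_ownership() {
    dir="$1"; user="$2"; group="$3"
    # Show ownership before the change (step 3).
    ls -l "$dir" || return 1
    # Give the nodes directory and its contents to the search user (step 4).
    chown -R "$user:$group" "$dir/nodes" || return 1
    # Show ownership again so the change can be confirmed.
    ls -l "$dir"
}

# Uncomment on the NSX Manager to apply the fix and start the service (step 5):
# fix_search_ownership "$SEARCH_DIR" "$SEARCH_USER" "$SEARCH_GROUP" \
#     && systemctl start search
```

The service start is chained with && so that it only runs if the ownership change succeeded.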