Collecting the support bundle of one Manager node fails when you collect support bundles of all 3 Manager nodes at once

Article ID: 373621

Products

VMware NSX

Issue/Introduction

  • You select all 3 Manager nodes when collecting support bundles.
  • The support bundle of one Manager node is not collected, and the error "An error occurred while uploading the support bundle" is reported.
  • The support bundles of the other 2 Manager nodes are collected successfully.
  • The support bundle was generated successfully on the failed Manager node, but its upload failed.
    Logs similar to the following appear on the failed node.
    /var/log/support_bundle.log
    <Timestamp> 12000 napi.root.administration.support_bundles.common_node INFO Compressing bundle archive /image/nsx_bundle_<UUID>/nsx_manager_<node UUID>_YYYYMMDD_HHMMSS.tgz
    <Timestamp> 12000 root INFO Done creating local support bundle archive (size: <File size in byte>)
    <Timestamp> 12000 root INFO Successfully unlocked /tmp/.logrotate.lock
    <Timestamp> 12000 root INFO Support bundle saved to: /image/nsx_bundle_<UUID>/nsx_manager_<node UUID>_YYYYMMDD_HHMMSS.tgz

    /var/log/syslog
    <Timestamp> <Hostname> NSX 30628 - [nsx@6876 comp="nsx-manager" subcomp="sbundle-upload" username="nsx-opsagent" level="INFO"] Successfully connected to manager with IP <NSX Manager IP> and verified thumbprint.
    <Timestamp> <Hostname> NSX 30628 - [nsx@6876 comp="nsx-manager" subcomp="sbundle-upload" username="nsx-opsagent" level="INFO"] Attempting to upload the support-bundle
    <Timestamp> <Hostname> NSX 30628 - [nsx@6876 comp="nsx-manager" subcomp="sbundle-upload" username="nsx-opsagent" level="WARNING"] Upload to manager failed: Traceback (most recent call last):#012  File "/opt/vmware/nsx-opsagent/libexec/sbundle-upload.py", line 194, in _upload_to_manager#012    conn.request("POST", path, fo, headers)#012  File "/usr/lib/python3.6/http/client.py", line 1264, in request#012    self._send_request(method, url, body, headers, encode_chunked)#012  File "/usr/lib/python3.6/http/client.py", line 1310, in _send_request#012    self.endheaders(body, encode_chunked=encode_chunked)#012  File "/usr/lib/python3.6/http/client.py", line 1259, in endheaders#012    self._send_output(message_body, encode_chunked=encode_chunked)#012  File "/usr/lib/python3.6/http/client.py", line 1077, in _send_output#012    self.send(chunk)#012  File "/usr/lib/python3.6/http/client.py", line 998, in send#012    self.sock.sendall(data)#012  File "/usr/lib/python3.6/ssl.py", line 975, in sendall#012    v = self.send(byte_view[count:])#012  File "/usr/lib/python3.6/ssl.py", line 944, in send#012    return self._sslobj.write(data)#012  File "/usr/lib/python3.6/ssl.py", line 642, in write#012    return self._sslobj.write(data)#012socket.timeout: The write operation timed out#012
  • On the NSX Manager node where you started the collection, "invalid status: UPLOAD_FAILED" appears in /var/log/nvpapi/api_server.log
    <Timestamp> napi.root.administration.support_bundles.__self__ INFO Support bundle import for generate id: <ID> failed
    <Timestamp> napi.root.administration.support_bundles.__self__ INFO Remote node requesting support bundle import for generate id: <ID>
    <Timestamp> napi.root.administration.support_bundles.__self__ ERROR Generate id <ID> has invalid status: UPLOAD_FAILED
    <Timestamp> napi.root.administration.support_bundles.__self__ INFO Remote node <Node UUID> reported upload failed status
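
The signatures above can be checked with a short script. The following is a minimal sketch, not an NSX tool: the function names are hypothetical, and the only strings it looks for are the ones quoted in the log excerpts above. The `#012` sequences in the syslog line are escaped newline characters (octal 012), so the sketch also restores them to make the traceback readable.

```python
# Signature strings copied from the log excerpts in this article.
UPLOAD_SIGNATURES = (
    "Upload to manager failed",
    "socket.timeout: The write operation timed out",
)
IMPORT_SIGNATURE = "invalid status: UPLOAD_FAILED"


def decode_syslog_newlines(line):
    """Turn the '#012' escapes that syslog writes for newlines back into line breaks."""
    return line.replace("#012", "\n")


def matches_upload_failure(syslog_text):
    """True if text from the failed node's /var/log/syslog contains both upload signatures."""
    return all(signature in syslog_text for signature in UPLOAD_SIGNATURES)


def matches_import_failure(api_log_text):
    """True if text from api_server.log on the collecting node reports UPLOAD_FAILED."""
    return IMPORT_SIGNATURE in api_log_text
```

Run `matches_upload_failure` against the syslog of the node whose bundle is missing, and `matches_import_failure` against api_server.log on the node where the collection was started. If both return True, the symptoms are consistent with this article.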

Environment

VMware NSX-T 3.x

VMware NSX 4.0

Cause

This is a known issue in VMware NSX-T and NSX.

When you collect support bundles of all 3 NSX Manager nodes, you start the collection on one of the Manager nodes.
The support bundles of the other 2 Manager nodes are generated on those nodes and uploaded to the Manager node where you started the collection.
If one Manager node starts uploading its support bundle while the upload from the other node is still in progress, the upload that starts later fails and is not retried.
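
The timing condition can be illustrated with a toy model. This is only a sketch of the behavior described above, not NSX code: the function name, the tuple layout, and the time units are assumptions made for illustration.

```python
def failed_uploads(uploads):
    """Return the nodes whose upload would fail under the behavior described above.

    uploads: list of (node, start, end) tuples, where start and end are
    times in any consistent unit (hypothetical input format).
    """
    failed = []
    busy_until = None  # end time of the upload currently being received
    for node, start, end in sorted(uploads, key=lambda upload: upload[1]):
        if busy_until is not None and start < busy_until:
            # Started while another upload was still in progress: fails, no retry.
            failed.append(node)
        else:
            busy_until = end
    return failed
```

For example, with uploads `("node-2", 0, 10)` and `("node-3", 5, 12)`, the model reports `node-3` as failed because its upload starts while the `node-2` upload is still running. If the second upload starts at 10 or later, neither fails, which is why the issue appears only intermittently.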

Resolution

The issue is resolved in NSX-T 3.2.2 and NSX 4.0.1.1.

To work around the issue, collect the support bundle of the failed node again, with no other node selected.
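
If bundle collection is scripted, the workaround can be expressed as a retry that targets only the failed node. The sketch below assumes a hypothetical `collect` callable that takes a list of node IDs and returns the IDs whose bundle was not collected. It stands in for whatever mechanism you use to start a collection and is not an NSX API.

```python
def collect_with_workaround(node_ids, collect):
    """Collect bundles for all nodes, then retry each failed node on its own.

    collect: hypothetical callable; collect(list_of_node_ids) returns an
    iterable of node IDs whose support bundle was not collected.
    Returns the node IDs that still failed after the single-node retry.
    """
    failed = set(collect(list(node_ids)))
    still_failed = []
    for node_id in node_ids:
        if node_id in failed:
            # Retry with only this node selected, so no other upload can overlap.
            if set(collect([node_id])):
                still_failed.append(node_id)
    return still_failed
```

An empty return value means every bundle was eventually collected. A node that still fails when collected alone points to a different problem from the one described in this article.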