VMware Cloud Foundation for Service Providers 2.3 SOS utility crashes while collecting logs for ESXi hosts

Article ID: 330360

Products

VMware Cloud Foundation

Issue/Introduction

Symptoms:
  • The SOS log collection tool does not complete successfully.
  • The SOS process is terminated with an error message similar to:

     ./sos: line 13: 31845 Killed $path $CURRENT_PATH/sos.pyc $@


Cause


This issue occurs when the SOS tool cannot connect to one of the VMware Cloud Foundation for Service Providers components (PSC, NSX, vCenter, ESXi, and so on), either because of a network issue or because the SDDC Manager cannot SSH to the component or run API calls against it.

The SOS tool keeps trying to connect to the unreachable components and does not time out. Eventually the Linux kernel kills the SOS process to prevent the SDDC Manager VM from running out of memory.

When a Linux machine runs low on memory, the kernel begins killing processes to free up RAM. This mechanism is called the OOM Killer (OOM stands for "out of memory").
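
To confirm that the OOM Killer terminated the SOS process, the PID from the "Killed" message (31845 in the example above) can be searched for in the kernel log. The following is a minimal sketch, not part of the SOS tool: the function name oom_kill_for_pid is hypothetical, and it assumes the kernel log uses the usual "Kill process <pid>" / "Killed process <pid>" wording.

```shell
# Hypothetical helper: filter kernel log text on stdin for OOM Killer
# entries that mention a given PID.
# $1: PID taken from the "./sos: line 13: <PID> Killed ..." message
oom_kill_for_pid() {
    grep -E "(Kill|Killed) process $1 "
}

# Example usage on the SDDC Manager VM (reading the kernel ring buffer
# may require root):
#   dmesg | oom_kill_for_pid 31845
```

If the command prints nothing, the kernel log may have rotated since the crash, or the process may have been killed for another reason.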

Example:
  • Several ESXi hosts show as unreachable when running ./sos --health-check
+-----+--------------------+---------------------+-------------+
| SL# | Area               | Title               | State       |
+-----+--------------------+---------------------+-------------+
| 1   | ESXI : 10.63.17.18 | Connectivity status | UNREACHABLE |
| 2   | ESXI : 10.63.8.109 | Connectivity status | UNREACHABLE |
| 3   | ESXI : 10.63.8.99  | Connectivity status | UNREACHABLE |
+-----+--------------------+---------------------+-------------+
 
  • SSH to the hosts fails for one of the following reasons:
    • The SSH service is disabled on the hosts.
    • The network is down.
    • The ESXi host root password was changed outside of the SDDC Manager.
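
When many hosts are listed, it can help to extract only the unreachable addresses from the health-check table before investigating each one. The sketch below assumes the table has been saved as text and that its columns match the example above (SL#, Area, Title, State). The function name unreachable_esxi_hosts and the file name health.txt are hypothetical.

```shell
# Hypothetical helper: print the address of every ESXi row whose State
# column is UNREACHABLE. Reads the health-check table text on stdin.
unreachable_esxi_hosts() {
    awk -F'|' '
        $3 ~ /ESXI/ && $5 ~ /UNREACHABLE/ {
            split($3, area, ":")        # "ESXI : 10.63.17.18" -> address in area[2]
            gsub(/[ \t]/, "", area[2])  # strip the column padding
            print area[2]
        }'
}

# Example usage, assuming the table was saved to health.txt:
#   unreachable_esxi_hosts < health.txt
```

Each address printed can then be checked by hand for the causes listed above, for example by attempting an SSH login as root. The sketch splits the Area column on ":" and therefore handles IPv4 addresses and host names only.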

Resolution

This is a known issue affecting VMware Cloud Foundation for Service Providers 2.3. It is resolved in VMware Cloud Foundation for Service Providers 2.4.





Workaround:
****Please Note: This workaround is only valid for VMware Cloud Foundation for Service Providers version 2.3****

This workaround is for customers who are on VMware Cloud Foundation for Service Providers 2.3 and are unable to update to 2.4.

To work around this issue, recreate the progressreporter.pyc file located on the SDDC Manager VM (path: /opt/vmware/sddc-support/utils/progressreporter.pyc) using the progressreporter.py file attached to this KB.

The new progressreporter.pyc file contains the compiled code that prevents the kernel from killing the SOS utility. With the fix, the SOS process times out and bypasses the unreachable component instead of being terminated by the Linux kernel.
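
The general idea of "time out and bypass" can be illustrated with a small shell sketch. This is not the code in the attached progressreporter.py, whose implementation is not described here; it only shows the pattern of bounding each attempt and moving on. The function name probe_component is hypothetical, and the sketch assumes the GNU coreutils timeout command, which exits with status 124 when the time limit is reached.

```shell
# Hypothetical helper: run a command with a time limit and report the
# result instead of waiting indefinitely.
# $1: time limit in seconds, $2: label for the component, rest: command
probe_component() {
    limit=$1
    label=$2
    shift 2
    if timeout "$limit" "$@" >/dev/null 2>&1; then
        echo "OK $label"
    else
        status=$?   # exit status of the timeout command above
        if [ "$status" -eq 124 ]; then
            echo "SKIP $label (timed out after ${limit}s)"
        else
            echo "SKIP $label (exit status $status)"
        fi
    fi
}

# Example usage: bound an SSH attempt to one ESXi host at 10 seconds.
#   probe_component 10 esxi-10.63.17.18 \
#       ssh -o BatchMode=yes -o ConnectTimeout=5 root@10.63.17.18 true
```

Because every attempt ends within the limit, a caller looping over many components cannot accumulate stalled connection attempts, which is the behaviour that leads to the out-of-memory kill described in the Cause section.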


Steps to recreate the progressreporter.pyc file:
 
  1. Run cp -rp /opt/vmware/sddc-support /tmp to back up sddc-support to /tmp.
  2. Run cd /tmp/sddc-support/utils to go to the utils directory of the copy, then:
     a. Run rm progressreporter.pyc to remove the existing progressreporter.pyc file.
     b. Run vi progressreporter.py to open a new, empty file.
     c. Press i to enter insert mode.
     d. Open the progressreporter.py file attached to this KB and copy its content into the new progressreporter.py file.
     e. To save the change, press ESC, then type :wq and press Enter.
  3. Run /tmp/sddc-support/sos from the tmp directory to test the SOS log collection process.
  4. After the first SOS run, the new progressreporter.pyc is created from the latest code.
  5. Run rm /tmp/sddc-support/utils/progressreporter.py to remove the source file.
  6. If the SOS log collection completed successfully, run mv /opt/vmware/sddc-support /opt/vmware/sddc-support-golden-copy to preserve the copy of the code that came with the build.
  7. Run mv /tmp/sddc-support /opt/vmware to move sddc-support from the tmp directory to its original location.
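
For repeated use, the file operations in the steps above can be wrapped in two small functions. This is a sketch under stated assumptions, not a supported script: the function names stage_patched_copy and promote_staged_copy are hypothetical, and it assumes the attached progressreporter.py has already been downloaded to the SDDC Manager VM, so a plain copy replaces the vi editing step. The test run of SOS from the staged copy is still done by hand between the two calls.

```shell
# Hypothetical helper: copy sddc-support to a staging directory, remove
# the compiled file, and drop in the downloaded source file.
# $1: original directory, $2: staging parent directory, $3: downloaded file
stage_patched_copy() {
    src=$1
    staged="$2/$(basename "$src")"
    [ -e "$staged" ] && { echo "$staged already exists" >&2; return 1; }
    cp -rp "$src" "$2" || return 1
    rm -f "$staged/utils/progressreporter.pyc" || return 1
    cp "$3" "$staged/utils/progressreporter.py"
}

# Hypothetical helper: after a successful SOS run from the staged copy,
# remove the source file, keep the shipped code as a golden copy, and
# move the staged copy into place.
# $1: original directory, $2: staging parent directory
promote_staged_copy() {
    src=$1
    staged="$2/$(basename "$src")"
    if [ ! -f "$staged/utils/progressreporter.pyc" ]; then
        echo "no compiled file found; run SOS from $staged first" >&2
        return 1
    fi
    rm -f "$staged/utils/progressreporter.py"
    mv "$src" "$src-golden-copy" || return 1
    mv "$staged" "$(dirname "$src")"
}

# Example usage (the download location /root/progressreporter.py is an
# assumption):
#   stage_patched_copy /opt/vmware/sddc-support /tmp /root/progressreporter.py
#   /tmp/sddc-support/sos        # test run; creates progressreporter.pyc
#   promote_staged_copy /opt/vmware/sddc-support /tmp
```

promote_staged_copy refuses to proceed until the compiled file exists, which guards against moving an untested copy into /opt/vmware.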


Attachments

progressreporter.py