Patching 8.0u3 24022515 to 8.0u3e fails with: "Exception occurred in postInstallHook"

Article ID: 405183

Products

VMware vCenter Server
VMware vCenter Server 8.0

Issue/Introduction

  • While patching vCenter Server from version 8.0u3 build 24022515 to 8.0u3e, the patch process fails at 80% with the following error:

    Exception occurred in postInstallHook for B2B-pathing. Please check the logs for more details. Take corrective action and then resume

  • Log analysis from /var/log/vmware/applmgmt/Patchrunner.log shows:
    YYYY-MM-DDThh:mm:ssZ  ERROR vmware_b2b.patching.phases.patcher Patch hook Patch got unhandled exception.
    Traceback (most recent call last):
      File "/storage/seat/software-update_5fygr8x/stage/scripts/patches/py/vmware_b2b/patching/phases/patcher.py", line 208, in patch
        _patchComponents(ctx, userData, statusAggregator.reportingQueue)
      File "/storage/seat/software-update_5fygr8x/stage/scripts/patches/py/vmware_b2b/patching/phases/patcher.py", line 89, in _patchComponents
        _startDependentServices(c)
      File "/storage/seat/software-update_5fygr8x/stage/scripts/patches/py/vmware_b2b/patching/phases/patcher.py", line 56, in _startDependentServices
        serviceManager.start(depService)
      File "/storage/seat/software-update_5fygr8x/stage/scripts/patches/libs/sdk/service_manager.py", line 909, in wrapper
        return getattr(controller, attr)(*args, **kwargs)
      File "/storage/seat/software-update_5fygr8x/stage/scripts/patches/libs/sdk/service_manager.py", line 799, in start
        super(VMwareServiceController, self).start(serviceName)
      File "/storage/seat/software-update_5fygr8x/stage/scripts/patches/libs/sdk/service_manager.py", line 665, in start
        raise IllegalServiceOperation(errorText)
    service_manager.IllegalServiceOperation: Service cannot be started. Error: Error executing start on service vsan-health. Details {
        "detail": [
            {
                "id": "install.ciscommon.service.failstart",
                "translatable": "An error occurred while starting service '%(0)s'",
                "args": [
                    "vsan-health"
                ],
                "localized": "An error occurred while starting service 'vsan-health'"
            }
        ],
        "componentKey": null,
        "problemId": null,
        "resolution": null
    }
    Service-control failed. Error: {
        "detail": [
            {
                "id": "install.ciscommon.service.failstart",
                "translatable": "An error occurred while starting service '%(0)s'",
                "args": [
                    "vsan-health"
                ],
                "localized": "An error occurred while starting service 'vsan-health'"
            }
        ],
        "componentKey": null,
        "problemId": null,
        "resolution": null
    }

    YYYY-MM-DDThh:mm:ssZ  WARNING root stopping status aggregation...
    YYYY-MM-DDThh:mm:ssZ  ERROR __main__ Patch vCSA failed
  • Corresponding /var/log/vmware/vsan-health/vmware-vsan-health-service.log also shows:

    Traceback (most recent call last):
      File "bora/vsan/common/VsanScheduler.py", line 111, in Run
      File "bora/vsan/clustermgmt/vpxd/VsanClusterPrototypeImpl.py", line 5017, in ReconcileDatastoreName
      File "bora/vsan/vsanvp/vpxd/pyMoVsan/VsanVpUtil.py", line 24, in GetClusterFromContainerId
    ImportError: cannot import name 'GetClusterMoId' from '_VsanMgmtServer' (unknown location)
    YYYY-MM-DDThh:mm:ssZ INFO vsan-mgmt[260913] [VsanScheduler::_ThreadMain opID=vsan-6######8bde-W8] Job done
    YYYY-MM-DDThh:mm:ssZ INFO vsan-mgmt[261002] [VsanScheduler::ScheduleWorkItem opID=vsan-PC-63933f73c8bde] Work entities length: 5
    YYYY-MM-DDThh:mm:ssZ INFO vsan-mgmt[260712] [VsanVcModuleImporter::startImport opID=noOpId] Importing VSAN extension VsanVcStretchedCluster__ext_init__
    YYYY-MM-DDThh:mm:ssZ INFO vsan-mgmt[260916] [VsanScheduler::_ThreadMain opID=vsan-PC-6######8bde-W8] Executing itemListHead: datastore-###-ReconcileDatastoreName: func: ReconcileDatastoreName, {'conn': <VsanManagementVcConnection.VsanManagementVcConnection object at 0x7fa73e1f2110>, 'db': <VsanClusterPrototypeImpl.PersistenceHelper object at #######0>, 'datastore': 'vim.Datastore:datastore-###'}, {}
    YYYY-MM-DDThh:mm:ssZ ERROR vsan-mgmt[260914] [VsanScheduler::_ThreadMain opID=vsan-PC-63933f73c8bde-W6] Workitem 6 failed
    Traceback (most recent call last):
      File "bora/vsan/common/VsanScheduler.py", line 357, in _ThreadMain
      File "bora/vsan/common/VsanScheduler.py", line 111, in Run
      File "bora/vsan/clustermgmt/vpxd/VsanClusterPrototypeImpl.py", line 5017, in ReconcileDatastoreName
      File "bora/vsan/vsanvp/vpxd/pyMoVsan/VsanVpUtil.py", line 24, in GetClusterFromContainerId
    ImportError: cannot import name 'GetClusterMoId' from '_VsanMgmtServer' (unknown location)

Environment

VMware vCenter Server 8.x

 

Cause

The issue is caused by invalid entries in the cns.vpx_storage_volume_update table in the vCenter Server database (VCDB). These entries have volume_id values prefixed with file:, which are not expected in this table.

During startup, CNS attempts to populate its in-memory cache from the database contents. It encounters a null pointer error because it expects block volume-specific data but receives file volume information instead.

The cns.vpx_storage_volume_update table is not intended to store file volumes. While these entries do not cause immediate issues, problems can arise the next time the vsan-health service restarts, and potentially during an upgrade.

Example of problematic entries:

select * from cns.vpx_storage_volume_update;

                 volume_id                 |                         datastore                          | vclock | modified | deleted | corrupted 
-------------------------------------------+------------------------------------------------------------+--------+----------+---------+-----------
 file:79de7b37-####-4ae7-####-69a####0938f | ds:///vmfs/volumes/vsan:52c#####b3b4#3-c85####4bbd###/ |    723 | f        | t       | f
 file:0f66d117-####-4061-####-5fab##30bd1# | ds:///vmfs/volumes/vsan:52c#####b3b4#3-c85####4bbd###/ |    723 | f        | t       | f
 file:9480f2da-####-48a2-####-e8df22c97873 | ds:///vmfs/volumes/vsan:52c#####b3b4#3-c85####4bbd###/ |    723 | f        | t       | f
 013bdbba-####-424c-####-5b####573c29      | ds:///vmfs/volumes/vsan:52c#####b3b4#3-c85####4bbd###/ |    723 | t        | f       | t
 2306eea1-####-4bbe-####-adb2####3f36      | ds:///vmfs/volumes/vsan:52c#####b3b4#3-c85####4bbd###/ |    723 | t        | f       | t
 6228ee2a-####-4151-####-e9f8d###60ae      | ds:///vmfs/volumes/vsan:52c#####b3b4#3-c85####4bbd###/ |    723 | t        | f       | t
 b4e48852-####-46e8-####-dcf08e0fea14      | ds:///vmfs/volumes/vsan:52c#####b3b4#3-c85####4bbd###/ |    723 | t        | f       | t
 file:0a8afb23-####-499a-####-69a####0938f | ds:///vmfs/volumes/vsan:5289####d528b##-659####fbe4####/ | 140669 | f        | t       | f
 file:40e20911-####-40b9-####-c1#####72e13 | ds:///vmfs/volumes/vsan:5289####d528b##-659####fbe4####/ | 140669 | f        | t       | f
 file:1258fb5e-####-4c57-####-72b####70208 | ds:///vmfs/volumes/vsan:5289####d528b##-659####fbe4####/ | 140669 | f        | t       | f
 file:3334cc86-####-4a1d-####-e01#####f807 | ds:///vmfs/volumes/vsan:5289####d528b##-659####fbe4####/ | 140669 | f        | t       | f
 file:b6a6f4ac-####-4504-####-7bdf####0454 | ds:///vmfs/volumes/vsan:5289####d528b##-659####fbe4####/ | 140669 | f        | t       | f
 file:175d751a-####-429b-####-0ad####facda | ds:///vmfs/volumes/vsan:5289####d528b##-659####fbe4####/ | 140669 | f        | t       | f
 file:bf669c95-####-4c0a-####-e4####d7f0e5 | ds:///vmfs/volumes/vsan:5289####d528b##-659####fbe4####/ | 140669 | f        | t       | f
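
As a quick check, the command below (a minimal sketch that reuses the psql client path shown in the Resolution section) counts only the unexpected file:-prefixed rows per datastore; any non-zero result indicates the condition described above.

    # Count file:-prefixed volume entries per datastore (run on the vCenter Server appliance)
    /opt/vmware/vpostgres/current/bin/psql -d VCDB -U postgres -c \
      "SELECT datastore, count(*) AS file_volume_entries
         FROM cns.vpx_storage_volume_update
        WHERE volume_id LIKE 'file:%'
        GROUP BY datastore;"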

Resolution

NOTE: Take a snapshot of the vCenter Server before making any changes to the database. If the vCenter Server is in Enhanced Linked Mode (ELM), power off all vCenter Servers in the ELM configuration and then take the snapshots.

  • Stop the vpxd and content-library services:

    service-control --stop vpxd
    service-control --stop content-library

  • Connect to the vCenter database:

    /opt/vmware/vpostgres/current/bin/psql -d VCDB -U postgres

  • Review problematic entries:

    select * from cns.vpx_storage_volume_update;
  • If the table contains entries with a volume_id prefixed with file:, delete those entries (see the verification example after these steps):

    DELETE FROM cns.vpx_storage_volume_update WHERE volume_id LIKE 'file:%';
  • Start the services again:

    service-control --start vpxd
    service-control --start content-library

  • Retry the patching process.
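
Verification example (referenced above; a minimal sketch combining commands already used in this article): before retrying the patch, the query should return 0 and both services should report as running.

    # Confirm no file:-prefixed entries remain (expected count: 0)
    /opt/vmware/vpostgres/current/bin/psql -d VCDB -U postgres -c \
      "SELECT count(*) FROM cns.vpx_storage_volume_update WHERE volume_id LIKE 'file:%';"

    # Confirm the services are running again
    service-control --status vpxd
    service-control --status content-library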