Unable to Enable TSDM Configuration on VM: vCenter shows Tasks under Monitor Virtual Machine properties for Enable LWD at 10%
search cancel

Unable to Enable TSDM Configuration on VM: vCenter shows Tasks under Monitor Virtual Machine properties for Enable LWD at 10%

book

Article ID: 386535

calendar_today

Updated On:

Products

VMware vCenter Server 7.0 VMware vCenter Server 8.0 VMware vSphere ESXi 7.0 VMware vSphere ESXi 8.0

Issue/Introduction

Customer TSDM product : 

  • PowerProtect Data Manager version (PPDM) 19.9 and later

Vcenter logs :

LOG: /var/log/vmware/vsan-health/vsanvcmgmtd.log, we see the below:

Exception: Fault cause: vmodl.fault.InvalidArgument 
Type: N5Vmomi5Fault15InvalidArgument9ExceptionE

date> info vsanvcmgmtd[10838] [vSAN@6876 sub=AdapterServer opId=0ace37b7] Invoking 'setProtection' on 'data-protection-service' session '5237XXXX-XXXX-xxxx-xxxx-xxxxxxxxxxxx' active <date> info vsanvcmgmtd[10838] [vSAN@6876 sub=SessionOrientedStub[4] opId=0ace37b7] Recovered session, sid: 23, recoverRequestOnly:false
<date> info vsanvcmgmtd[10838] [vSAN@6876 sub=dps opId=0ace37b7] ProtectionServiceImpl::SetProtection, spec: (dp.vpx.ProtectOpSpec) {
-->    vendorId = "Dell EMC PowerProtect",
-->    state = true,
-->    pe = (dp.vpx.VirtualMachineProtectedEntity) {
-->       vmRef = 'vim.VirtualMachine:vm-51067',
-->       includedDiskKeys = (int) [
-->          2002,
-->          2000,
-->          2001
-->       ]
-->    }
--> }
<date> info vsanvcmgmtd[10838] [vSAN@6876 sub=dps opId=0ace37b7] DPS 'com.vmware.dp.tasks.protect' task 'task-811900' created
<date> info vsanvcmgmtd[10838] [vSAN@6876 sub=VsanTaskSvc opId=0ace37b7] ADD public task 'task-811900', total: <date> warning vsanvcmgmtd[09956] [vSAN@6876 sub=vmomi.soapStub[14] opId=0ace37b7] SOAP request returned HTTP failure; <SSL(<io_obj p:0x0000558fb2306d68, h:96, <TCP '127.0.0.1 : 38294'>, <TCP '127.0.0.1 : 80'>>), /sdk>, method: QueryEvent; code: 500(Internal Server Error)
<date> panic vsanvcmgmtd[09956] [vSAN@6876 sub=ThreadPool opId=0ace37b7] Worker: Unhandled exception: N5Vmomi5Fault15InvalidArgument9ExceptionE(Fault cause: vmodl.fault.InvalidArgument
--> )
--> [context]zKq7AVECAQAAAE65RwEVdnNhbnZjbWdtdGQAABDZN2xpYnZtYWNvcmUuc28AAKd4LAAlbC0ALekyAQ1LGmxpYnZtb21pLnNvAAFDIw4B7s0WAd7XFgHubRYC+0cEbGliUHlDcHBWbW9taS5zbwABGW4Sg1ukLwFsaWJ2aW0tdHlwZXMuc28ABKZvA2xpYkRhdGFQcm90ZWN0aW9uU3ZjLnNvAAQtEAQEoRIEBKVOBgD3SSMAT58jAM5kNwWHfwBsaWJwdGhyZWFkLnNvLjAABi82D2xpYmMuc28uNgA=[/context]
YYYY-MM-DDThh:mm:ss.775Z error vsanvcmgmtd[09956] [vSAN@6876 sub=Default opId=0ace37b7]
-->
--> Exception: Fault cause: vmodl.fault.InvalidArgument
-->
--> Type: N5Vmomi5Fault15InvalidArgument9ExceptionE

 

LOG:  /var/log/vmware/vpxd/vpxd.log


Error:  eventChainId -1285927609 (negative value), eventChainId has a negative value in the log


<date> warning vpxd[06963] [Originator@6876 sub=Vmomi opID=13500ae3] VMOMI activation LRO failed; <<528XXXX-XXXX-xxxx-xxxx-xxxxxxxxxxxx, <TCP '127.0.0.1 : 8089'>, <TCP '127.0.0.1 : 37114'>>, EventManager, vim.event.EventManager.QueryEvent>, N5Vmomi5Fault15InvalidArgument9ExceptionE(Fault cause: vmodl.fault.InvalidArgument
--> )
--> [context]zKq7AVECAQAAAEq5RwETdnB4ZAAAENk3bGlidm1hY29yZS5zbwAAp3gsACVsLQAt6TIBzWKHdnB4ZAABe0xmASPE0YJJHzIBbGlidmltLXR5cGVzLnNvAIHC1mEBgb3ZYAGB/tpgAYEs6mABgY8XYAGBWsFgAQCnSSMAT58jAM5kNwOHfwBsaWJwdGhyZWFkLnNvLjAABC82D2xpYmMuc28uNgA=[/context]
<date> info vpxd[06963] [Originator@6876 sub=vpxLro opID=13500ae3] [VpxLRO] -- FINISH lro-86645339
<date> info vpxd[06963] [Originator@6876 sub=Default opID=13500ae3] [VpxLRO] -- ERROR lro-86645339 -- EventManager -- vim.event.EventManager.QueryEvent: vmodl.fault.InvalidArgument:
--> Result:
--> (vmodl.fault.InvalidArgument) {
-->    faultCause = (vmodl.MethodFault) null,
-->    faultMessage = <unset>,
-->    invalidProperty = "filter"
-->    msg = ""
--> }
--> Args:
-->
--> Arg filter:
--> (vim.event.EventFilterSpec) {
-->    entity = (vim.event.EventFilterSpec.ByEntity) {
-->       entity = 'vim.VirtualMachine:vm-51067',
-->       recursion = "self"
-->    },
-->    time = (vim.event.EventFilterSpec.ByTime) null,
-->    userName = (vim.event.EventFilterSpec.ByUsername) null,
-->    eventChainId = -1285927609,
-->    alarm = <unset>,
-->    scheduledTask = <unset>,
-->    disableFullMessage = <unset>,
-->    category = <unset>,
-->    type = <unset>,
-->    tag = <unset>,
-->    eventTypeId = (string) [
-->       "vim.event.TaskEvent"
-->    ],
-->    maxCount = <unset>,
-->    order = <unset>
--> }


If you do not see a negative value, you will see this in vpxd logs 

Error:  InvalidArgument

<date> warning vpxd[07771] [Originator@6876 sub=Vmomi opID=42da609b] VMOMI activation LRO failed; <<52ccXXXX-XXXX-xxxx-xxxx-xxxxxxxxxxxx, <TCP '127.0.0.1 : 8089'>, <TCP '127.0.0.1 : 55026'>>, EventManager, vim.event.EventManager.QueryEvent>, N5Vmomi5Fault15InvalidArgument9ExceptionE(Fault cause: vmodl.fault.InvalidArgument
--> )
--> [context]zKq7AVECAQAAAEq5RwETdnB4ZAAAENk3bGlidm1hY29yZS5zbwAAp3gsACVsLQAt6TIBzWKHdnB4ZAABe0xmASPE0YJJHzIBbGlidmltLXR5cGVzLnNvAIHC1mEBgb3ZYAGB/tpgAYEs6mABgY8XYAGBWsFgAQCnSSMAT58jAM5kNwOHfwBsaWJwdGhyZWFkLnNvLjAABC82D2xpYmMuc28uNgA=[/context]

Environment

VMware vcenter 8.x

VMware vcenter 7.x

VMware ESXi 8.x

VMware ESXi 7.x

Cause

This occurs due to the sequence ID overflow issue and stale "com.vmware.dp" initiator task.

Verification  steps :

  • SSH login to vCenter as root.
  • Login to vpostgres: /opt/vmware/vpostgres/current/bin/psql -d VCDB -U postgres
  • Run this query to see stale tasks: select event_id,chain_id,event_type,create_time from vpxv_event_all where event_type like 'com.vmware.dp.events.protect' and chain_id < 0;
  • To verify negative eventChainId via MOB: https://[VC_FQDN]/mob/?moid=EventManager&doPath=latestEvent (login with an administrative account), in MOB output you will "key" value also in negative.

Resolution

Note: Take vCenter backup/snapshots before proceeding, linked vCenter requires offline snapshots, and vCenter downtime is required

  1. Take VCDB backup: https://knowledge.broadcom.com/external/article/316471/back-up-and-restore-vcenter-server-appli.html
  2. Stop vpxd : service-control --stop vpxd
  3. From this article https://knowledge.broadcom.com/external/article?legacyId=2110031, download the file "2110031_Postgres_task_event_stat_reset_event_sequence.sql"
  4. Move the file to the vCenter under /var/core/ via WinSCP
  5. SSH login to vCenter as root.
  6. Run this command to clear all stale task
    • For 7.0 and later:
      • /opt/vmware/vpostgres/current/bin/psql -U postgres -v TaskMaxAgeInDays=0 -v EventMaxAgeInDays=0 -v StatMaxAgeInDays=0 -d VCDB -t -q -f /var/core/2110031_Postgres_task_event_stat_reset_event_sequence.sql
  7. Restart services:
    • service-control --stop --all && service-control --start --all
  8. To verify negative eventChainId is cleared via MOB: https://[VC_FQDN]/mob/?moid=EventManager&doPath=latestEvent (login with an administrative account),

Additional Information

PowerProtect Data Manager(PPDM) uses VMware APIs exposed through vCenter to enable TSDM protection. As part of enabling TSDM protection, PowerProtect Data Manager calls to dpSetProtection() API in vCenter which in turn creates the "Enable LWD-based data protection task" in vCenter. Due to the outlined sequence ID overflow issue in vCenter, this specific API fails with "Invalid argument" error in vCenter. The "Invalid argument" error is due to eventChainId having a negative value due to the vCenter sequence ID overflow issue.