Unable to attach PV with error "ServerFaultCode: NotAuthenticated" and pods are stuck in ContainerCreating

Article ID: 368743


Products

vSphere with Tanzu

Issue/Introduction

  • Unable to attach a PersistentVolume (PV); the attach operation fails with the error "ServerFaultCode: NotAuthenticated"

  • Pods that mount the affected PV are stuck in the "ContainerCreating" state (diagnostic commands to confirm this are shown after the log excerpts below)

  • The vpxd log on the vCenter Server (/var/log/vmware/vpxd/vpxd.log) shows errors similar to the following:

info vpxd[08547] [Originator@6876 sub=VmCustomizer opID=vmoperator-xx-cluster-workers-<opID>] hostVersion = 7.0.3, Tools version = 11360
warning vpxd[08547] [Originator@6876 sub=vmomi.soapStub[108135] opID=vmoperator-xx-cluster-workers-<opID>] SOAP request returned HTTP failure; <SSL(<io_obj p:2a78, h:145, <TCP '<IP_address> : 48106'>, <TCP '<IP_address> : 443'>>), /sdk>, method: systemManagement; code: 500(Internal Server Error)
info vpxd[08547] [Originator@6876 sub=Vmomi opID=vmoperator-xx-cluster-workers-<opID>] Retry SOAP call after exception; <<last binding: <<TCP '<IP_address>: 55394'>, <TCP
'<IP_address>: 443'>>>, /sdk>, vim.NfcService.systemManagement, N3Vim5Fault16NotAuthenticated9ExceptionE(Fault cause: vim.fault.NotAuthenticated
--> )
--> [context]------[/context]
info vpxd[08547] [Originator@6876 sub=Vmomi opID=vmoperator-xx-cluster-workers-<opID>] Stale SOAP session to host <ESXi_FQDN>; reinitializing
info vpxd[08547] [Originator@6876 sub=Vmomi opID=vmoperator-xx-cluster-workers-<opID>] Creating SOAP stub adapter for /sdk on <ESXi_FQDN>:443

  • The CSI syncer logs on the Supervisor Cluster (/var/log/pods/vmware-system-csi_vsphere-csi-controller-<ID>/vsphere-syncer/1.log) show errors similar to the following:

stderr F {"level":"info","time":"<Date/Time>","caller":"syncer/metadatasyncer.go:427","msg":"CSI full sync failed with error: ServerFaultCode: NotAuthenticated","TraceId":"<UUID>"}
stderr F I0412 05:33:18.486950       1 request.go:668] Waited for 1.001530599s due to client-side throttling, not priority and fairness, request: GET:https://127.0.0.1:6443/api/v1/namespaces/devk8s/persistentvolumeclaims/<UUID>
stderr F {"level":"info","time":"<Date/Time>","caller":"cnsnodevmattachment/cnsnodevmattachment_controller.go:209","msg":"Reconciling CnsNodeVmAttachment with Request.Name: \"<guest_cluster_worker_node_name>-containerd\" instance \"<guest_cluster_worker_node_name>-containerd\" timeout \"1s\" seconds","TraceId":"<UUID>"}
stderr F {"level":"info","time":"<Date/Time>","caller":"cnsnodevmattachment/cnsnodevmattachment_controller.go:335","msg":"vSphere CSI driver is attaching volume: \"<UUID>\" to nodevm: VirtualMachine:vm-ID [VirtualCenterHost: <ESXi_FQDN>, UUID: <UUID>, Datacenter: Datacenter [Datacenter: Datacenter:datacenter-<ID>, VirtualCenterHost: <ESXi_FQDN>]] for CnsNodeVmAttachment request with name: \"<guest_cluster_worker_node_name>-containerd\" on namespace: \"stgk8s\"","TraceId":"<UUID>"}
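
To confirm these symptoms from the Supervisor Cluster, the commands below can be used. This is a minimal sketch: the namespace devk8s, the pod name, and the CnsNodeVmAttachment name are placeholders taken from the log excerpts above; substitute the values from your environment.

# List pods stuck in ContainerCreating in the affected namespace
kubectl get pods -n devk8s | grep ContainerCreating

# Review pod events for the "ServerFaultCode: NotAuthenticated" attach failure
kubectl describe pod <pod_name> -n devk8s

# Inspect the CnsNodeVmAttachment objects the CSI driver is reconciling
kubectl get cnsnodevmattachments -n devk8s
kubectl describe cnsnodevmattachment <guest_cluster_worker_node_name>-containerd -n devk8s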

Environment

vSphere with Tanzu 8.x
vSphere with Tanzu 7.x

Resolution

  • This issue is resolved in vSphere CSI driver v2.7.2, which is included in vCenter Server 8.0 U2 and later. After upgrading vCenter Server to 8.0 U2, upgrade the Supervisor Cluster to pick up the fixed driver.

  • For vCenter Server 7.x there is no fix; use the following workaround:

    Workaround:

    • Restart the CSI controller pods by restarting the deployment with the command below:

kubectl rollout restart deployment vsphere-csi-controller -n vmware-system-csi

Note: This step recovers the CSI controller only temporarily; you will need to restart the CSI controller deployment each time the issue recurs. Verification commands are shown below.
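
After restarting the deployment, recovery can be verified with the commands below. This is a sketch under the default resource names shown above; the label app=vsphere-csi-controller and the devk8s namespace are assumptions, so adjust them for your environment.

# Wait until the restarted CSI controller pods are rolled out and Ready
kubectl rollout status deployment vsphere-csi-controller -n vmware-system-csi

# Confirm the controller pods are Running (label selector is an assumption; match your deployment's labels)
kubectl get pods -n vmware-system-csi -l app=vsphere-csi-controller

# Optionally check the CSI driver image version (the fix is in v2.7.2 and later)
kubectl -n vmware-system-csi get deployment vsphere-csi-controller -o jsonpath='{.spec.template.spec.containers[*].image}'

# Previously stuck pods should progress past ContainerCreating once volumes attach
kubectl get pods -n devk8s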