When listing files on the mounted share, some entries show question marks in place of their permissions, owner, size, and timestamp, similar to:
box:/$ test ls -lart
total 4904
drwxr-xr-x 5 root root 123 Mar 20 16:33 .
drwxr-xr-x 3 root root 16 Mar 20 17:15 ..
?????????? ? ? ? ? ? XXXXXXXX
drwxr-xr-x 3 root root 57 Mar 20 16:58 XXXXXXXXXXX
?????????? ? ? ? ? ? XXXXXXXXXXXXXXXXXXXXXXXXX
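`ls -l` prints question marks for an entry when the directory listing returns its name but the follow-up metadata lookup (`lstat`) on that entry fails. The following is a minimal sketch, not part of the original report, that lists such entries in a directory so the symptom can be checked for from a script. The function name `find_unstatable_entries` is hypothetical.

```python
import os


def find_unstatable_entries(directory):
    """Return names in `directory` whose metadata cannot be read.

    These are the entries that `ls -l` would show with question marks:
    the name is returned by the directory listing, but lstat() on it fails.
    """
    broken = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        try:
            # lstat() does not follow symlinks, matching what `ls -l` inspects.
            os.lstat(path)
        except OSError:
            broken.append(name)
    return broken
```

Calling `find_unstatable_entries("/path/to/mount")` on a healthy directory should return an empty list. Because the issue is intermittent, a single clean run does not rule it out.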
dmesg on the nodes shows messages similar to the following:
[6861390.799056] CIFS: VFS: Autodisabling the use of server inode numbers on \\XX.XXX.XX.XXX\File_location
[6861390.799062] CIFS: VFS: The server doesn't seem to support them properly or the files might be on different servers (DFS)
[6861390.799063] CIFS: VFS: Hardlinks will not be recognized on this mount. Consider mounting with the "noserverino" option to silence this message.
[6862008.903020] CIFS: VFS: \\XX.XXX.XX.XXX No task to wake, unknown frame received! NumMids 3
[6870588.588497] CIFS: VFS: \\XX.XXX.XX.XXX No task to wake, unknown frame received! NumMids 2
This happens intermittently.
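Because the symptom is intermittent, it can help to count how often the two CIFS messages quoted above appear in a saved dmesg capture. The sketch below is an illustration only. The labels `autodisable_serverino` and `unknown_frame` and the function name are hypothetical, and the matching assumes the message wording shown in the log excerpt.

```python
import re
from collections import Counter

# Matches dmesg lines of the form "[<timestamp>] CIFS: VFS: <message>".
CIFS_LINE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s+CIFS: VFS: (?P<msg>.*)$")

# Hypothetical labels mapped to substrings taken from the log excerpt.
PATTERNS = {
    "autodisable_serverino": "Autodisabling the use of server inode numbers",
    "unknown_frame": "No task to wake, unknown frame received",
}


def summarize_cifs_messages(lines):
    """Count occurrences of the known CIFS messages in dmesg output lines."""
    counts = Counter()
    for line in lines:
        match = CIFS_LINE.match(line.strip())
        if not match:
            continue
        for label, needle in PATTERNS.items():
            if needle in match.group("msg"):
                counts[label] += 1
    return counts
```

For example, `summarize_cifs_messages(open("dmesg.log"))` would summarize a saved capture, assuming the output was saved to a file with that name.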
An SMB server hosted on Windows is used with vSphere Kubernetes Service guest clusters.
This is an SMB metadata consistency issue caused by how the Linux CIFS client handles leasing and unstable server inode numbers. It is exposed by newer Linux kernels and the CSI SMB driver.
For all affected clusters, disable leasing and oplocks by running the following PowerShell commands on the SMB server:
Set-SmbServerConfiguration -EnableLeasing $false
Set-SmbServerConfiguration -EnableOplocks $false
Contact Microsoft for further assistance.