No space left on device
In the output of kubectl logs <fluent-bit-pod> -n pks-system, you see entries similar to:
[2020/03/04 20:16:17] [error] [in_tail] could not register file into fs_events
[2020/03/04 20:16:17] [error] [plugins/in_tail/tail_fs.c:219 errno=28] No space left on device
[2020/03/04 20:16:17] [error] [in_tail] could not register file into fs_events
[2020/03/04 20:16:17] [error] [plugins/in_tail/tail_fs.c:219 errno=28] No space left on device
[2020/03/04 20:16:17] [error] [in_tail] could not register file into fs_events
ENOSPC The user limit on the total number of inotify watches was reached
or the kernel failed to allocate a needed resource.
Impact:
If this situation occurs, the underlying log files are not lost or deleted; they remain on disk.
However, Fluent-bit no longer monitors them once the limit is reached.
The error occurs because the kernel has reached the per-user limit on inotify watches, not because the filesystem has run out of storage space or inodes.
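To confirm that the error comes from the inotify limit rather than from a full disk, it can help to compare the configured limit with the number of watches currently in use. The following is a minimal sketch, assuming a Linux node where /proc is readable (run it as root to see all processes). The function name count_inotify_watches and its optional proc-root argument are hypothetical names used only for this illustration.

```shell
#!/bin/sh
# Count inotify watches in use by summing the "inotify" lines that the
# kernel exposes in each process's fdinfo entries.
# The optional first argument (proc root) is a hypothetical knob for this sketch.
count_inotify_watches() {
    proc_root="${1:-/proc}"
    cat "$proc_root"/[0-9]*/fdinfo/* 2>/dev/null | grep -c '^inotify'
}

# Configured per-user limit on inotify watches.
echo "limit:  $(cat /proc/sys/fs/inotify/max_user_watches 2>/dev/null || echo unknown)"
# Watches in use across all processes visible to the current user.
echo "in use: $(count_inotify_watches)"
# Storage space, for comparison; a full disk would show up here instead.
df -h / 2>/dev/null
```

If the in-use count is close to the limit while df still reports free space, the inotify limit is the more likely cause.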
VMware TKGI
This is expected behavior with Fluent-bit if there are not enough file descriptors and the kernel parameter fs.inotify.max_user_watches is not currently sufficient for the capacity of the cluster workloads.
These resources have to be managed by the cluster Administrator, and increased appropriately, and may be dependent upon existing cluster resources (number of nodes, etc) and the dynamic nature of workloads running within the cluster.
As a workaround, you can increase the sysctl parameter fs.inotify.max_user_watches to 16384 as a starting point and check whether this resolves the issue.
You have two (2) options for modifying the sysctl parameter:
Option 1: Change the sysctl parameter fs.inotify.max_user_watches on ALL current worker node VMs:
IMPORTANT: This workaround will not persist across TKGI upgrades or node recreation.
For more information, see https://github.com/fluent/fluent-bit/issues/1018
sysctl -a | grep fs.inotify.max_user_watches
sysctl -w fs.inotify.max_user_watches=16384
sysctl -p
sysctl -a | grep fs.inotify.max_user_watches
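Some kernels already ship with a default higher than 16384, so writing the value unconditionally could lower the limit. The sketch below guards against that by applying the larger of the current and target values. The helper name pick_watch_limit is hypothetical, and the sysctl call is left commented out so that nothing changes until you enable it.

```shell
#!/bin/sh
# Print the value to apply: the larger of the current and target limits,
# so an existing higher limit is never reduced.
pick_watch_limit() {
    current="$1"
    target="$2"
    if [ "$current" -lt "$target" ]; then
        echo "$target"
    else
        echo "$current"
    fi
}

# Usage (run as root on each worker node; uncomment to apply):
# current=$(cat /proc/sys/fs/inotify/max_user_watches)
# sysctl -w fs.inotify.max_user_watches="$(pick_watch_limit "$current" 16384)"
```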
Option 2: Edit the /etc/sysctl.conf file:
Open /etc/sysctl.conf as the root user.
Set the fs.inotify.max_user_watches parameter to 16384.
Apply the settings from the /etc/sysctl.conf file: sysctl -p
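Editing the file by hand can leave duplicate entries if the parameter is already present. The sketch below shows one way to make the change repeatable: it removes any existing assignment of the key and appends the new one. The function name set_sysctl_entry is hypothetical, and the call against the real file is commented out so the block has no side effects as written.

```shell
#!/bin/sh
# Set "key = value" in a sysctl-style file, replacing any existing
# assignment of the same key so repeated runs leave exactly one entry.
# Commented-out lines (starting with #) are left untouched.
# Note: dots in the key act as regex wildcards in the grep pattern, which
# is acceptable for this sketch.
set_sysctl_entry() {
    conf_file="$1"
    key="$2"
    value="$3"
    tmp_file="$(mktemp)"
    # Copy every line except existing assignments of the key.
    grep -v "^[[:space:]]*${key}[[:space:]]*=" "$conf_file" > "$tmp_file" 2>/dev/null
    # Append the new assignment.
    printf '%s = %s\n' "$key" "$value" >> "$tmp_file"
    # Write back in place so the original file's ownership and mode are kept.
    cat "$tmp_file" > "$conf_file"
    rm -f "$tmp_file"
}

# Usage (as root; uncomment to apply):
# set_sysctl_entry /etc/sysctl.conf fs.inotify.max_user_watches 16384 && sysctl -p
```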
For more information on the Fluent-bit open source issue, refer to the GitHub issue linked above.