No space left on device
When you run
kubectl logs <fluent-bit-pod> -n pks-system
you see entries similar to:
[2020/03/04 20:16:17] [error] [in_tail] could not register file into fs_events
[2020/03/04 20:16:17] [error] [plugins/in_tail/tail_fs.c:219 errno=28] No space left on device
[2020/03/04 20:16:17] [error] [in_tail] could not register file into fs_events
[2020/03/04 20:16:17] [error] [plugins/in_tail/tail_fs.c:219 errno=28] No space left on device
[2020/03/04 20:16:17] [error] [in_tail] could not register file into fs_events
errno=28 corresponds to ENOSPC, which the inotify documentation describes as follows:
ENOSPC: The user limit on the total number of inotify watches was reached or the kernel failed to allocate a needed resource.
Impact:
If this situation occurs, the underlying log files are not lost or deleted. They are still on disk. However, Fluent-bit no longer monitors them once the limit is reached.
The error occurs because the kernel has, at that time, reached its limit on inotify watches. It is not a limit on storage space.
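To confirm that inotify watches, rather than disk space, are the constraint, the per-user limit can be compared with the number of watches a process currently holds. The following is a minimal sketch, assuming a Linux node where /proc is readable (typically as root). The function name count_inotify_watches and its optional second argument are hypothetical helpers for illustration and are not part of Fluent-bit or TKGI.

```shell
#!/bin/sh
# Hypothetical helper: count the inotify watches held by one process.
# Each watch appears as a line starting with "inotify wd:" in
# /proc/<pid>/fdinfo/<fd>. The second argument (an alternative proc root)
# is optional and exists only to make the function easy to test.
count_inotify_watches() {
    pid="$1"
    proc_root="${2:-/proc}"
    # grep -c prints 0 and exits non-zero when nothing matches,
    # so "|| true" keeps the function usable under "set -e".
    cat "$proc_root/$pid"/fdinfo/* 2>/dev/null | grep -c '^inotify wd:' || true
}

# Example on a worker node (the process name is an assumption):
#   cat /proc/sys/fs/inotify/max_user_watches      # current per-user limit
#   count_inotify_watches "$(pidof fluent-bit)"    # watches held by Fluent-bit
```

The limit applies per user rather than per process, so a count close to the limit for a single process is a strong hint, while a lower count does not rule out other processes of the same user consuming the remainder.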
VMware TKGI
This is expected behavior in Fluent-bit when there are not enough file descriptors and the kernel parameter fs.inotify.max_user_watches is too low for the workloads running in the cluster.
The cluster administrator must manage these resources and increase them as appropriate. Suitable values may depend on existing cluster resources (number of nodes, etc.) and on the dynamic nature of the workloads running within the cluster.
As a workaround, increase the sysctl parameter fs.inotify.max_user_watches to 16384 as a starting point and check whether this resolves the issue.
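Because some nodes may already have a higher limit, it can be useful to avoid lowering the value by accident when applying the workaround. The following is a minimal sketch under that assumption; target_watches is a hypothetical helper name, and 16384 is the starting value suggested above.

```shell
#!/bin/sh
# Hypothetical helper: print the value to apply, never lowering a limit
# that is already at or above the desired one.
target_watches() {
    current="$1"
    desired="$2"
    if [ "$current" -lt "$desired" ]; then
        echo "$desired"
    else
        echo "$current"
    fi
}

# Example (as root on a worker node):
#   current=$(sysctl -n fs.inotify.max_user_watches)
#   sysctl -w fs.inotify.max_user_watches="$(target_watches "$current" 16384)"
```

If the error persists at 16384, the same approach can be repeated with a larger desired value.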
You have two options for modifying the sysctl parameter:
Option 1: Set the sysctl parameter fs.inotify.max_user_watches on ALL current worker node VMs, using the commands below.
IMPORTANT: This workaround will not persist across TKGI upgrades or node recreation.
For more information, see https://github.com/fluent/fluent-bit/issues/1018
sysctl -a | grep fs.inotify.max_user_watches
sysctl -w fs.inotify.max_user_watches=16384
sysctl -p
sysctl -a | grep fs.inotify.max_user_watches
Option 2: Edit the /etc/sysctl file:
Open the /etc/sysctl file as the root user.
Set the fs.inotify.max_user_watches parameter to 16384.
Save the /etc/sysctl file, then run: sysctl -p
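The file edit can be made repeatable so that running it twice does not leave duplicate entries. The following is a minimal sketch, not a supported TKGI procedure. The function name set_sysctl_entry and the SYSCTL_FILE variable are hypothetical. Note that when sysctl -p is run without a file argument it loads /etc/sysctl.conf, so the sketch passes the file explicitly.

```shell
#!/bin/sh
# Hypothetical helper: set "key = value" in a sysctl-style file,
# replacing an existing entry for the key or appending a new one.
# The key is used as a regular expression, so "." matches any character;
# that is acceptable for a sketch but loose for general use.
set_sysctl_entry() {
    file="$1"
    key="$2"
    value="$3"
    if grep -q "^[[:space:]]*$key[[:space:]]*=" "$file" 2>/dev/null; then
        # Rewrite through a temporary file, then copy the content back so
        # the original file keeps its ownership and permissions.
        sed "s|^[[:space:]]*$key[[:space:]]*=.*|$key = $value|" "$file" > "$file.tmp"
        cat "$file.tmp" > "$file"
        rm -f "$file.tmp"
    else
        echo "$key = $value" >> "$file"
    fi
}

# Example (as root; SYSCTL_FILE is whichever file the node actually uses):
#   SYSCTL_FILE=/etc/sysctl
#   set_sysctl_entry "$SYSCTL_FILE" fs.inotify.max_user_watches 16384
#   sysctl -p "$SYSCTL_FILE"
```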
For more information on the Fluent-bit open source issue, refer to the GitHub issue linked above.