The Fluent Bit pod in the TKGI namespace pks-system is in CrashLoopBackOff state on one or more worker nodes, with log errors similar to:
[error] [/tmp/fluent-bit-xyz/plugins/in_tail/tail_fs_inotify.c:360 errno=24] Too many open files
[error] failed initialize input tail.0
[error] [engine] input initialization failed
[error] [lib] backend failed
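To confirm which worker nodes are affected, the crashing pods can be listed together with the node each one runs on. The following is a minimal sketch, assuming the pod names contain "fluent-bit" and that the first container in each pod is the Fluent Bit container; the function name is hypothetical.

```shell
# List Fluent Bit pods in pks-system that are waiting in CrashLoopBackOff,
# together with the worker node each one is scheduled on.
# Assumptions: pod names contain "fluent-bit"; only the first container's
# waiting reason is inspected.
list_crashing_fluent_bit() {
  kubectl -n pks-system get pods --no-headers \
    -o 'custom-columns=NAME:.metadata.name,NODE:.spec.nodeName,REASON:.status.containerStatuses[0].state.waiting.reason' \
    | awk '$1 ~ /fluent-bit/ && $3 == "CrashLoopBackOff" { print $1, $2 }'
}
```

Calling list_crashing_fluent_bit prints one "pod node" pair per line. The log of the last crashed container can then be read with kubectl -n pks-system logs <pod> --previous.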
Tanzu Kubernetes Grid Integrated Edition
The Fluent Bit pod crashes because it exceeds the worker node's inotify/open-file limits while tailing the cluster's log files.
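The current limits and the number of inotify instances in use can be inspected on an affected worker node before changing anything. This is a rough sketch, not an official diagnostic: both function names are hypothetical, and the count covers only processes visible to the user running it (run as root to see all of them).

```shell
# Print the current kernel limits relevant to Fluent Bit's tail input.
# Reads /proc/sys directly, so it works without the sysctl binary.
show_fs_limits() {
  for key in fs.inotify.max_user_watches fs.inotify.max_user_instances \
             fs.inotify.max_queued_events fs.file-max; do
    # Convert the dotted sysctl key to its /proc/sys path (dots become slashes).
    path="/proc/sys/$(printf '%s' "$key" | tr . /)"
    if [ -r "$path" ]; then
      printf '%s=%s\n' "$key" "$(cat "$path")"
    else
      printf '%s=unavailable\n' "$key"
    fi
  done
}

# Count inotify instances currently open across all visible processes.
# Each inotify instance appears as a file descriptor linking to
# "anon_inode:inotify"; unreadable entries are skipped.
count_inotify_instances() {
  count=0
  for fd in /proc/[0-9]*/fd/*; do
    if [ "$(readlink "$fd" 2>/dev/null)" = "anon_inode:inotify" ]; then
      count=$((count + 1))
    fi
  done
  printf '%s\n' "$count"
}
```

If the count is close to the value reported for fs.inotify.max_user_instances, the node is likely hitting the limit described above.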
Raise the inotify and file-handle sysctl limits on the OS of the worker nodes where the Fluent Bit pod runs.
To make the change permanent, create a runtime config that BOSH applies to the cluster VMs.
vim runtime-config.yaml
-----
releases:
- name: "os-conf"
  version: "23.0.0"
addons:
- name: fluent-bit-os-max-config
  jobs:
  - name: sysctl
    release: os-conf
    properties:
      sysctl:
      - fs.inotify.max_user_watches=524288
      - fs.inotify.max_user_instances=16383
      - fs.inotify.max_queued_events=524288
      - fs.file-max=2097152
  include:
    deployments: [service-instance_xxx-yyy] # Optional: the deployments (TKGI clusters) this runtime config applies to.
    instance_groups: [<master and/or worker, as defined in the deployment manifest>] # Optional: the instance groups (cluster nodes, i.e. masters/workers) this runtime config applies to.
  exclude:
    deployments: [<service-instance_XXXXXXXXXX>] # Optional: the deployments (TKGI clusters) this runtime config does not apply to.
    instance_groups: [<master and/or worker, as defined in the deployment manifest>] # Optional: the instance groups (cluster nodes, i.e. masters/workers) this runtime config does not apply to.
# Upload the runtime config to BOSH:
bosh update-config --type=runtime --name fluent-bit-os-max runtime-config.yaml
# Get the manifest of the service instance where the Fluent Bit pod was failing, then redeploy it:
bosh -d service-instance_xxx-yyy-zzz manifest > service-instance_xxx-yyy-zzz.yaml
bosh -d service-instance_xxx-yyy-zzz deploy service-instance_xxx-yyy-zzz.yaml
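After the deploy completes, the new values can be checked on a worker node, reached for example with bosh -d service-instance_xxx-yyy-zzz ssh worker/0 (the instance group name "worker" is an assumption; use the name from the deployment manifest). The sketch below compares the live kernel settings with the values from runtime-config.yaml. The function name and the SYSCTL_ROOT variable are hypothetical; the variable exists only so the check can be pointed at a different directory for testing.

```shell
# Compare live kernel settings against expected key=value pairs.
# Arguments: one or more key=value pairs, e.g. fs.file-max=2097152.
# SYSCTL_ROOT (hypothetical, optional) overrides the default /proc/sys.
# Prints one line per key and returns non-zero if any value differs.
verify_sysctls() {
  root="${SYSCTL_ROOT:-/proc/sys}"
  status=0
  for pair in "$@"; do
    key="${pair%%=*}"    # text before the first "="
    want="${pair#*=}"    # text after the first "="
    path="$root/$(printf '%s' "$key" | tr . /)"
    have="$(cat "$path" 2>/dev/null || printf 'missing')"
    if [ "$have" = "$want" ]; then
      printf 'OK   %s=%s\n' "$key" "$have"
    else
      printf 'DIFF %s: expected %s, found %s\n' "$key" "$want" "$have"
      status=1
    fi
  done
  return "$status"
}

# Example call on the worker node, using the values from runtime-config.yaml:
#   verify_sysctls fs.inotify.max_user_watches=524288 \
#                  fs.inotify.max_user_instances=16383 \
#                  fs.inotify.max_queued_events=524288 \
#                  fs.file-max=2097152
```

Once every line reports OK, the Fluent Bit pod on that node should be able to initialize its tail input again. If it stays in CrashLoopBackOff, check its log again for a different error.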