TKGi has a bug, fixed in v1.16.5, in which a cluster occasionally fails to send the cluster_name tag to its logging destination after being upgraded.
After a cluster upgrade, the record_modifier filter (the "Name record_modifier" entry) is occasionally missing from the cluster’s fluent-bit ConfigMap, so cluster_name is not included in log entries. This happens when the sink-controller process configures the cluster before observability-manager starts; observability-manager then overwrites the desired configuration.
The issue is documented in the release notes, but the customer wants a detailed explanation and a workaround before upgrading the product.
The bug is caused by observability-manager and sink-controller restarting out of order during a Kubernetes cluster update that updates the compute-profile. Observability-manager applies the fluent-bit ConfigMap without the record_modifier filter, and sink-controller adds the filter once it starts, so sink-controller should start after observability-manager. However, sink-controller may restart and modify the fluent-bit ConfigMap before observability-manager restarts. When observability-manager then starts, it overwrites the ConfigMap and the record_modifier filter is lost.
The same problem occurs when observability-manager is restarted manually, and it may also occur when the worker count is reduced.
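To confirm whether a cluster is affected, the ConfigMap can be inspected for the filter entry. The following is a minimal sketch, not an official procedure. It assumes the ConfigMap is named "fluent-bit" and lives in the pks-system namespace; the function names has_record_modifier and check_cluster are hypothetical helpers introduced only for illustration.

```shell
#!/bin/sh
# Returns 0 if the fluent-bit configuration read from stdin contains a
# "Name record_modifier" filter entry, non-zero otherwise.
has_record_modifier() {
  grep -Eq 'Name[[:space:]]+record_modifier'
}

# Dumps the ConfigMap and reports whether the filter is present.
# Assumption: the ConfigMap is called "fluent-bit" in pks-system; pass a
# different name as the first argument if your cluster differs.
# Note: if kubectl itself fails, the result is also reported as "missing".
check_cluster() {
  if kubectl get configmap "${1:-fluent-bit}" -n pks-system -o yaml | has_record_modifier; then
    echo "record_modifier present"
  else
    echo "record_modifier missing"
    return 1
  fi
}
```

Running check_cluster against the affected cluster's context prints "record_modifier missing" and returns a non-zero status when the filter has been overwritten.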
The workaround is to restart sink-controller so that it patches the record_modifier filter back into the ConfigMap. Run "kubectl delete pod <sink-controller-pod-name> -n pks-system"; a new sink-controller pod is created automatically shortly afterwards.
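The workaround can be scripted so that the pod name does not have to be looked up by hand. This is a rough sketch under the assumption that the pod name contains the string "sink-controller"; the function names sink_controller_pods and restart_sink_controller are hypothetical and not part of TKGi.

```shell
#!/bin/sh
# Prints the sink-controller pods in pks-system, one per line, in the
# "pod/<name>" form produced by "-o name".
# Assumption: the pod name contains "sink-controller".
sink_controller_pods() {
  kubectl get pods -n pks-system -o name | grep sink-controller
}

# Deletes each sink-controller pod so that a replacement is created
# and re-applies the record_modifier filter.
restart_sink_controller() {
  pods=$(sink_controller_pods) || {
    echo "no sink-controller pod found in pks-system" >&2
    return 1
  }
  for pod in $pods; do
    kubectl delete "$pod" -n pks-system
  done
}
```

After calling restart_sink_controller, "kubectl get pods -n pks-system" should show a new sink-controller pod, and the ConfigMap can be checked again for the "Name record_modifier" entry.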