After upgrading to the latest version of Tanzu Kubernetes Grid Integrated Edition (TKGI), LogSink stops sending logs for specific pods.
When reviewing your environment, you see that the settings on the pods are correct. The LogSink is configured to collect logs for the correct namespace; however, the logs for particular pods are not forwarded.
To confirm the exact output of the fluent-bit daemon, perform the following steps. Note that these steps stop log forwarding while the change is in place.
1. Make a backup of the existing config map:
kubectl get cm -n pks-system fluent-bit -o yaml > fluent-bit-cm.yaml
2. Edit the fluent-bit config map:
kubectl edit cm -n pks-system fluent-bit
Under the outputs.conf section, remove all defined outputs and leave only the following output:
outputs.conf: |2
  [OUTPUT]
      Name file
      Match *logg*
      File fluentbit_output.log
      Path /tmp
Here, *logg* should correspond to the name of your pod.
3. Perform a rollout restart of fluent-bit:
kubectl rollout restart daemonset -n pks-system fluent-bit
4. Confirm the pod in question and the fluent-bit pod running on the same worker node with this command:
kubectl get pod -A -owide
If there are too many pods, you might have to query specific namespaces instead of using -A.
NAMESPACE    NAME               READY   STATUS    RESTARTS   AGE   IP           NODE                                   NOMINATED NODE   READINESS GATES
test         logger-this-is-a   1/1     Running   0          21h   172.37.6.2   3f818472-7599-4630-96e1-5027f9d0bfd1   <none>           <none>
pks-system   fluent-bit-5dlcs   2/2     Running   0          77m   172.37.5.8   3f818472-7599-4630-96e1-5027f9d0bfd1   <none>           <none>
pks-system   fluent-bit-m65w8   2/2     Running   0          77m   172.37.5.7   937f645a-a890-4d87-9469-7dd558af6e0d   <none>           <none>
pks-system   fluent-bit-ndxkz   2/2     Running   0          77m   172.37.5.9   d98ec4c2-8216-4ba7-a654-fddcb88785c5   <none>           <none>
Then open a shell in the fluent-bit pod that runs on the same node as the pod in question (fluent-bit-5dlcs in this example) and follow the output file:
kubectl exec -it fluent-bit-5dlcs -n pks-system -- bash
root@fluent-bit-5dlcs:/# tail -f tmp/fluentbit_output.log
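The node matching in this step can also be scripted. The following is a minimal sketch, not part of the product tooling: the function name fluent_bit_on_same_node is hypothetical, and it assumes the default -owide column layout (NAME in column 2, NODE in column 8), a plain integer in the RESTARTS column, and a pod name that is unique across the listed namespaces.

```shell
# Print the fluent-bit pods scheduled on the same node as a given pod.
# Reads `kubectl get pod -A -owide` output on stdin; $1 is the pod name.
# Assumption: NAME is column 2 and NODE is column 8 (default -owide layout).
fluent_bit_on_same_node() {
  awk -v pod="$1" '
    $1 == "NAMESPACE" { next }            # skip the header row
    { node[$2] = $8 }                     # remember the node of every pod
    $2 ~ /^fluent-bit-/ { fb[$2] = $8 }   # collect the fluent-bit pods
    END {
      if (!(pod in node)) exit 1          # target pod is not in the listing
      for (name in fb)
        if (fb[name] == node[pod]) print name
    }
  '
}
```

Possible usage, with the pod name from the example output:

```shell
kubectl get pod -A -owide | fluent_bit_on_same_node logger-this-is-a
```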
5. Verify that the metadata is visible in the output. The metadata looks similar to this:
"kubernetes":{"pod_name":"logger-this-is-a","namespace_name":"test","pod_id
Here, logger-this-is-a and test are the names of the pod and the namespace, respectively.
If this data is missing, you must reduce the size of the metadata for fluent-bit to behave correctly.
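To make the metadata check less dependent on reading the tail output by eye, the records can be counted. This is a hedged sketch only: the function name count_pod_metadata is hypothetical, and it assumes each record occupies one line and that the metadata begins with "kubernetes":{"pod_name":"<pod>" exactly as in the sample above.

```shell
# Count records on stdin with and without Kubernetes metadata for pod $1.
# Assumption: one record per line, metadata formatted as in the sample above.
count_pod_metadata() {
  awk -v needle="\"kubernetes\":{\"pod_name\":\"$1\"" '
    index($0, needle) { with++ }          # literal substring match, no regex
    END { printf "with=%d without=%d\n", with, NR - with }
  '
}
```

Possible usage against the file written by the output configured in step 2:

```shell
kubectl exec fluent-bit-5dlcs -n pks-system -- cat /tmp/fluentbit_output.log | count_pod_metadata logger-this-is-a
```

A non-zero "without" count for records that belong to the pod would suggest that the metadata is being dropped.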