TKGi - ClusterLogSink sending to Splunk HEC endpoint with TLS - TLS error in fluentbit pod logs

Article ID: 383838

Products

VMware Tanzu Kubernetes Grid Integrated Edition

Issue/Introduction

A ClusterLogSink has been configured to send logs to a Splunk HEC endpoint over TLS, but the fluentbit pod logs show TLS errors such as the following:

[2024/xx/xx xx:xx:xx] [error] [tls] error: unexpected EOF

Checking the fluent_bit_ca_cert showed that its md5sum did not match the md5sum of the original certificate.

This article describes how to check the md5sum of the fluent_bit_ca_cert to confirm that the correct certificate is being used in the ClusterLogSink.

Environment

TKGi 1.19.2
Fluentbit 2.2.3
Splunk 9.2.2

Cause

As noted above, the fluentbit logs contain multiple entries of the TLS error: unexpected EOF.

The md5sum of the certificate in the ClusterLogSink differs from the md5sum of the original certificate, even though the two certificates look identical.

spec:
  type: splunk
  fluent_bit_ca_cert: |
    -----BEGIN CERTIFICATE-----
    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    …
    ...
    XXXXXXXXXXXXXXXXX
    -----END CERTIFICATE-----

  output_properties:
    Host: test.broadcom.com
    Port: XXXX
    Splunk_Token: XXXXXXXXXX
    tls: "On"
    tls.verify: "true"
    tls.ca_file: /etc/fluent-bit-ca/<file-name>
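Two certificates can look identical and still hash differently when they differ only in invisible characters. The sketch below illustrates this with carriage returns, which is an assumption based on the tr -d '\r' step used in the Resolution; other invisible differences, such as trailing spaces, would have the same effect. The file names, the file contents, and the count_cr_lines helper are hypothetical and are not real certificates.

```shell
#!/bin/sh
# Illustration only: the files below hold made-up text, not real certificates.

# Hypothetical helper: print how many lines of a file end in a carriage return.
count_cr_lines() {
    cr=$(printf '\r')
    # grep -c exits non-zero when the count is 0, so "|| true" keeps the status clean.
    grep -c "${cr}\$" "$1" || true
}

workdir=$(mktemp -d)
printf 'AAAA\nBBBB\n' > "$workdir/unix.pem"      # LF line endings
printf 'AAAA\r\nBBBB\r\n' > "$workdir/dos.pem"   # CRLF line endings

# Both files look the same in most editors, but the checksums differ.
md5sum < "$workdir/unix.pem"
md5sum < "$workdir/dos.pem"

# Reveal the hidden carriage returns.
count_cr_lines "$workdir/unix.pem"   # prints 0
count_cr_lines "$workdir/dos.pem"    # prints 2

# Stripping carriage returns makes the checksum match the LF file again.
tr -d '\r' < "$workdir/dos.pem" | md5sum

rm -rf "$workdir"
```

If the helper reports a non-zero count for a certificate file, the file was likely saved with Windows line endings somewhere along the way.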

Resolution

Check the md5sum of the fluent_bit_ca_cert in the Splunk ClusterLogSink with the following command:

kubectl get clusterlogsink splunk -o jsonpath='{.spec.fluent_bit_ca_cert}' | md5sum


Next, check the md5sum of the original CA certificate PEM file:

cat filename.pem | tr -d '\r' | md5sum


If the md5sum of the fluent_bit_ca_cert in the ClusterLogSink differs from the md5sum of the original PEM file, the ClusterLogSink must be updated.
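The two checks can be combined into one comparison. The following is a minimal sketch, assuming the certificate from the ClusterLogSink has first been saved to a local file. The sink_matches_pem helper and the /tmp/sink-cert.pem path are hypothetical names introduced for illustration. As in the commands above, carriage returns are stripped only from the original PEM file.

```shell
#!/bin/sh
# Hypothetical helper: compare the cert saved from the ClusterLogSink ($1)
# with the original PEM file ($2). Prints "match" or "differ".
sink_matches_pem() {
    # Hash the sink cert exactly as stored.
    sink_sum=$(md5sum < "$1" | cut -d' ' -f1)
    # Strip carriage returns from the original PEM before hashing.
    pem_sum=$(tr -d '\r' < "$2" | md5sum | cut -d' ' -f1)
    if [ "$sink_sum" = "$pem_sum" ]; then
        echo "match"
    else
        echo "differ"
    fi
}

# Example usage (left commented out because it needs cluster access):
#   kubectl get clusterlogsink splunk -o jsonpath='{.spec.fluent_bit_ca_cert}' > /tmp/sink-cert.pem
#   sink_matches_pem /tmp/sink-cert.pem filename.pem
```

A result of "differ" means the ClusterLogSink needs to be updated as described in the next step.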


Edit the ClusterLogSink with the following command:

kubectl edit clusterlogsink splunk


Then replace the fluent_bit_ca_cert value with the correct certificate, as shown below:

spec:
  type: splunk
  fluent_bit_ca_cert: |
    -----BEGIN CERTIFICATE-----
    YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY
    …
    ...
    YYYYYYYYYYYYYYYYY
    -----END CERTIFICATE-----

  output_properties:
    Host: test.broadcom.com
    Port: XXXX
    Splunk_Token: XXXXXXXXXX
    tls: "On"
    tls.verify: "true"
    tls.ca_file: /etc/fluent-bit-ca/<file-name>

Note: only the fluent_bit_ca_cert section has changed.

After this change, check the fluentbit pod logs to confirm that the TLS errors have cleared, then check Splunk to confirm that logs from the ClusterLogSink are arriving.
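One way to confirm that the errors have cleared is to count the TLS error lines in the pod logs. The following is a minimal sketch: the count_tls_errors helper is a hypothetical name, and the pod name and namespace in the usage comment are placeholders that depend on the environment.

```shell
#!/bin/sh
# Hypothetical helper: read log lines on stdin and print how many
# contain the fluentbit TLS error marker.
count_tls_errors() {
    # -F matches the text literally; -c prints the number of matching lines.
    # grep exits non-zero when the count is 0, so "|| true" keeps the status clean.
    grep -cF '[tls] error' || true
}

# Example usage (pod name and namespace are placeholders):
#   kubectl logs <fluent-bit-pod> -n <namespace> | count_tls_errors

# Local demonstration using the error line from this article.
printf '%s\n' '[2024/xx/xx xx:xx:xx] [error] [tls] error: unexpected EOF' | count_tls_errors
```

A count of 0 on logs written after the update suggests the TLS handshake is now succeeding.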