Error: "Connection reset by peer" and "mem buf overlimit" in Fluent Bit when sending logs to Aria Operations for Logs

Article ID: 432579


Products

VMware vSphere Kubernetes Service
VMware Tanzu for Kubernetes Operations
Tanzu Kubernetes Runtime

Issue/Introduction

Fluent Bit is unable to forward logs to VMware Aria Operations for Logs (formerly vRealize Log Insight) using the syslog output plugin.

The Fluent Bit pod logs display continuous errors indicating connection resets, timeouts, and memory buffer limits:

[warn] [input] tail.0 paused (mem buf overlimit)
[error] [/tmp/src/plugins/out_syslog/syslog.c:870 errno=104] Connection reset by peer
[error] [engine] failed to flush chunk '1-123456.123456.flb', retry in 11 seconds: task_id=4, input=tail.0 > output=syslog.1 (out_id=1)
[error] [engine] chunk '1-1123456.123456.flb' cannot be retried: task_id=10, input=tail.0 > output=syslog.1
[warn] [input] tail.0 resume (mem buf overlimit)
[info] [input] pausing tail.0
[error] [/tmp/src/plugins/out_syslog/syslog.c:870 errno=110] Connection timed out

Testing network connectivity using nc from the Fluent Bit pod to the configured Aria Operations for Logs port succeeds, confirming the basic Layer 4 network path is open.

Environment

VMware Tanzu Kubernetes Grid (TKG) / vSphere with Tanzu
Fluent Bit

Cause

The mem buf overlimit condition occurs when Fluent Bit reads logs from containers faster than it can transmit them over the network. Although basic TCP connectivity succeeds, ingestion backpressure, concurrent connection limits, or load balancer timeouts on the destination node cause active connection resets (errno=104) and timeouts (errno=110) during payload transmission. The unsent logs are held in memory until the configured Mem_Buf_Limit is reached, forcing Fluent Bit to pause log ingestion to prevent an Out-of-Memory (OOM) crash.
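The limit that triggers the pause is set per input plugin. As a point of reference, a tail input section with an explicit memory buffer limit typically looks like the following; the path and the 5MB value are illustrative placeholders, not values taken from this environment:

  [INPUT]
      Name           tail
      Path           /var/log/containers/*.log
      Mem_Buf_Limit  5MB

When the unsent data held for this input reaches Mem_Buf_Limit, Fluent Bit emits the "paused (mem buf overlimit)" warning shown above and stops reading new records until the buffer drains.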

Resolution

  1. Inspect the Fluent Bit ConfigMap to identify the configured destination Host (Aria Operations for Logs VIP or Node IP) and Port.

kubectl get configmap <fluent-bit-config-name> -n <namespace> -o yaml

Example Output Block:

  outputs: |
    [OUTPUT]
      Name          <connection type>
      Match         *
      Host          <Destination_IP>
      Port          <Destination_Port>
      URI           api/v2/events
      Format        json
      tls           <on/off>
      tls.debug     4
      tls.verify    off
      json_date_key timestamp

Note: The Host parameter must point to the destination Aria Operations for Logs IP, not the local Fluent Bit IP.
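For comparison, an output section using the syslog plugin (the plugin named in the errors above) typically looks like the following; the placeholders and key names shown are illustrative, not values from this environment:

  [OUTPUT]
      Name                syslog
      Match               *
      Host                <Destination_IP>
      Port                <Destination_Port>
      Mode                tcp
      Syslog_Format       rfc5424
      Syslog_Message_Key  log
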

  2. Test Layer 4 network connectivity from the Fluent Bit pod to the destination IP and port identified in the ConfigMap.

kubectl exec -it <fluent-bit-pod-name> -n <namespace> -- nc -vz <Destination_IP> <Destination_Port>

Note: If the nc test succeeds (reports "open" or "connected"), the network path is clear, confirming the connection resets are occurring at the destination application layer.

  3. Identify the Fluent Bit daemonset and namespace.

kubectl get daemonset -A | grep fluent

  4. Perform a rollout restart of the Fluent Bit daemonset. This clears the stuck memory buffers, drops the backlogged unsent chunks, and re-establishes fresh TCP connections to the destination endpoint.

kubectl rollout restart daemonset <fluent-bit-daemonset-name> -n <namespace>

  5. Verify the pods have restarted successfully and monitor the logs to confirm the memory buffer and connection errors have ceased.

kubectl logs -l app=fluent-bit -n <namespace> -f

Additional Information

A rollout restart is a temporary mitigation. To prevent recurrence during high load or network blips, consider transitioning the Fluent Bit outputs configuration from the TCP syslog plugin to the http plugin (VMware ingestion API/CFAPI), which handles connection retries and backpressure more robustly. Additionally, configuring storage.type filesystem enables disk buffering, so bursts of unsent logs spill to disk instead of exhausting the memory buffer limit.
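As a sketch of that hardened configuration, assuming the same destination identified in the ConfigMap (the storage path, limits, and URI below are illustrative placeholders, not values mandated by the product):

  [SERVICE]
      storage.path               /var/log/flb-storage/
      storage.sync               normal
      storage.backlog.mem_limit  5M

  [INPUT]
      Name          tail
      Path          /var/log/containers/*.log
      storage.type  filesystem

  [OUTPUT]
      Name                      http
      Match                     *
      Host                      <Destination_IP>
      Port                      <Destination_Port>
      URI                       api/v2/events
      Format                    json
      json_date_key             timestamp
      storage.total_limit_size  1G

With storage.type filesystem on the input and a storage.total_limit_size on the output, chunks that cannot be flushed are queued on disk up to the configured limit rather than pausing ingestion.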