Do not forward Envoy MetricsMgr and GET API success response (status 2xx) logs to the remote syslog server

Products

VMware vDefend Firewall
VMware vDefend Firewall with Advanced Threat Prevention

Issue/Introduction

A large number of syslog events are sent from SSP to the remote syslog server: more than 10,000 Envoy MetricsMgr log entries within 5 minutes.

Environment

SSP 5.1.0

Cause

Envoy logs contain a large volume of MetricsMgr entries, all of which are forwarded to the remote syslog server. All GET API audit/Envoy logs are forwarded to the remote syslog server as well.

Resolution

Workaround:

===========

1. Update ConfigMaps

Log in to the SSPI CLI using sysadmin credentials.
From SSPI, add or update the following filters and tags in the fluent-bit and fluentd ConfigMaps:

- Back up the current fluent-bit ConfigMap by saving it to a file with the following command:

k -n nsxi-platform get cm nsxi-platform-fluent-bit -o yaml > nsxi-platform-fluent-bit_original.yaml

- Edit the fluent-bit ConfigMap with the following command and add the rules marked in the excerpt below:

k -n nsxi-platform edit cm nsxi-platform-fluent-bit

apiVersion: v1
data:
  fluent-bit.conf: |
...
...
    [FILTER]
        Name             kubernetes
        Match            kube.*
        Kube_Tag_Prefix  kube.var.log.containers.
        Buffer_Size      15MB
        Merge_Log        On
        Merge_Log_Key    log_processed
        Annotations      Off
        Labels           Off

# ------------------------------------------ Add rules to the following rewrite_tag filter ---------------------------------------------------

    [FILTER]
        Name             rewrite_tag
        Match            kube.*
        Rule             $log ^(?=.*audit="true")(?=.*MetricsMgr)(?=.*status=2\d{2}).*$ kube_skip_syslog false
        Rule             $log ^(?=.*audit="true")(?=.*method=GET)(?=.*status=2\d{2}).*$ kube_skip_syslog false
        Rule             $log ^(?=.*file\s\/admin\/ok\sdoes\snot\sexist).*$ kube_skip_syslog false
        Rule             $log ^(?=.*audit="true").*$ audit_logs true

    [OUTPUT]
        Name forward
        Match *
        Host fluentd-aggregator
        Port 24224
        Retry_Limit False
        tls On
        tls.verify Off
        tls.ca_file /fluent-bit/ssl/fluent-bit-ca.crt
        tls.crt_file /fluent-bit/ssl/fluent-bit-tls.crt
        tls.key_file /fluent-bit/ssl/fluent-bit-tls.key
  parsers.conf: |
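
The rewrite_tag rules above can be sanity-checked against sample lines before the ConfigMap is edited. The following is a minimal sketch, not part of the official workaround. The classify function and the sample log lines are hypothetical, and it uses grep -P (PCRE) instead of the regex engine inside fluent-bit, on the assumption that both treat these lookahead patterns alike. It also assumes that rules are evaluated top to bottom and that the first match decides the new tag. The trailing false or true in each rule is the filter's KEEP flag: with false the original record is dropped once it has been re-tagged, with true it is kept alongside the re-tagged copy.

```shell
#!/usr/bin/env bash
# Illustrative only: mimics the rule order of the rewrite_tag filter with grep -P.

classify() {
  local line="$1" rule
  # The three kube_skip_syslog patterns, copied from the filter above.
  for rule in \
    '^(?=.*audit="true")(?=.*MetricsMgr)(?=.*status=2\d{2}).*$' \
    '^(?=.*audit="true")(?=.*method=GET)(?=.*status=2\d{2}).*$' \
    '^(?=.*file\s\/admin\/ok\sdoes\snot\sexist).*$'
  do
    if printf '%s\n' "$line" | grep -qP -- "$rule"; then
      echo kube_skip_syslog
      return
    fi
  done
  # The last rule tags any remaining audit line as audit_logs.
  if printf '%s\n' "$line" | grep -qP -- '^(?=.*audit="true").*$'; then
    echo audit_logs
  else
    echo unchanged
  fi
}

# Hypothetical sample lines; real Envoy log lines will look different.
classify 'audit="true" comp=MetricsMgr method=POST status=200'
classify 'audit="true" method=GET uri=/example status=204'
classify 'audit="true" method=GET uri=/example status=404'
classify 'level=info msg="plain application log"'
```

With these samples the function prints kube_skip_syslog twice, then audit_logs, then unchanged: only successful MetricsMgr and GET audit lines are diverted away from syslog, while a failed GET is still treated as a normal audit log.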

- Back up the current fluentd ConfigMap by saving it to a file with the following command:

k -n nsxi-platform get cm fluentd-aggregator-cm -o yaml > fluentd-aggregator-cm_original.yaml

- Edit the fluentd ConfigMap with the following command and add the match block marked in the excerpt below:

k -n nsxi-platform edit cm fluentd-aggregator-cm

apiVersion: v1
data:
  fluentd-inputs.conf: |
...
...
  fluentd-output.conf: |
...
...
    # Throw the recommendation-clean-up-cronjob logs
    <match **recommendation-clean-up-cronjob**>
      @type null
    </match>

# ------------------------------------------ Add the following match block ---------------------------------------------------

    # MetricsMgr and GET envoy logs with status=2xx - present in pod log only, not in audit_log.log or syslog server
    <match kube_skip_syslog>
      @type file
      path /opt/bitnami/fluentd/logs/buffers/${$.kubernetes.namespace_name}/${$.kubernetes.host}/${$.kubernetes.pod_name}
      append true
      <format>
        @type single_value
        message_key log
      </format>
      <buffer $.kubernetes.host,$.kubernetes.pod_name,$.kubernetes.namespace_name>
        @type file
        path /opt/bitnami/fluentd/logs/buffers/buffers-temp/general_buffer_skip_syslog
        flush_thread_count 8
        flush_mode interval
        flush_interval 5s
        timekey 1d
        timekey_wait 10m
        chunk_limit_size 16MB
        total_limit_size 512MB
        overflow_action drop_oldest_chunk
      </buffer>
    </match>
...
...
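
Before restarting the pods, it can help to confirm that both edits were actually saved. The following is a minimal sketch under a few assumptions: k is the same kubectl alias used in the commands above (replace it with kubectl if the alias is not available where the function is defined), exactly three kube_skip_syslog rules and one match block are expected, as in the excerpts, and check_configmaps is a hypothetical helper name.

```shell
check_configmaps() {
  local rules blocks
  # Count occurrences rather than lines, in case the YAML output
  # renders the embedded config as a single quoted string.
  rules=$(k -n nsxi-platform get cm nsxi-platform-fluent-bit -o yaml \
    | grep -o 'kube_skip_syslog false' | wc -l)
  blocks=$(k -n nsxi-platform get cm fluentd-aggregator-cm -o yaml \
    | grep -o '<match kube_skip_syslog>' | wc -l)
  echo "fluent-bit kube_skip_syslog rules: $((rules)) (expected 3)"
  echo "fluentd kube_skip_syslog match blocks: $((blocks)) (expected 1)"
  # Succeed only when both ConfigMaps contain what the excerpts show.
  [ "$rules" -eq 3 ] && [ "$blocks" -eq 1 ]
}
```

Running check_configmaps after both edits prints the two counts and returns a non-zero exit status if either differs from the expected value.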

2. Restart Fluentd: k -n nsxi-platform rollout restart statefulset fluentd

Verify the pod status by checking that fluentd-0 is up and running after the restart:
k -n nsxi-platform get pods | grep fluentd-0

3. Restart Fluent-bit: k -n nsxi-platform rollout restart daemonset nsxi-platform-fluent-bit

Verify the pod status by checking that the nsxi-platform-fluent-bit pods are up and running after the restart:
k -n nsxi-platform get pods | grep nsxi-platform-fluent-bit
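
As an alternative to polling the pod list by hand, kubectl can block until each restart has finished. This is a minimal sketch, assuming k is the kubectl alias used above. The helper name wait_for_logging_rollouts and the default timeout of 300s are arbitrary choices for illustration, not values from the product documentation.

```shell
wait_for_logging_rollouts() {
  # Optional first argument: how long to wait per workload.
  local timeout="${1:-300s}"
  # rollout status returns non-zero if the rollout does not complete in time.
  k -n nsxi-platform rollout status statefulset fluentd --timeout="$timeout" &&
    k -n nsxi-platform rollout status daemonset nsxi-platform-fluent-bit --timeout="$timeout"
}
```

Calling wait_for_logging_rollouts (or, for example, wait_for_logging_rollouts 120s) after both restart commands returns success only once the fluentd StatefulSet and the fluent-bit DaemonSet report a completed rollout.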

4. Validate Logs on Remote Syslog Server

Once both fluentd and fluent-bit pods are healthy, check the logs on the remote syslog server.
Confirm that logs are being forwarded correctly.
Specifically, verify that Envoy MetricsMgr logs and GET success (2xx) response logs are no longer forwarded.
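
One way to run this check on the syslog server is to count the lines that the new rules are meant to suppress. The following is a minimal sketch: count_filtered_lines is a hypothetical helper, the log file path depends entirely on how the syslog server is configured, and the pattern assumes the server stores the Envoy message text unmodified. It requires a grep build with -P (PCRE) support.

```shell
count_filtered_lines() {
  # $1: a log file on the syslog server (the path is site-specific).
  # Matches audit lines that are MetricsMgr or GET requests with a 2xx status,
  # which is the union of the first two rewrite_tag rules.
  grep -cP '^(?=.*audit="true")(?=.*(MetricsMgr|method=GET))(?=.*status=2\d{2}).*$' "$1" || true
}
```

Because entries received before the change remain in the file, the count is not expected to be zero. Run the function twice a few minutes apart against the same file: after the workaround is applied, the number should stop increasing.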

Additional Information

This issue will be fixed in future releases.