A large number of syslog events are sent from SSP to the remote syslog server: more than 10,000 MetricsMgr envoy log entries in 5 minutes.
SSP 5.1.0
The envoy logs contain a large volume of MetricsMgr entries, all of which are forwarded to the remote syslog server. All GET API audit/envoy logs are forwarded to the remote syslog server as well.
Workaround:
===========
1. Log in to the SSPI CLI using sysadmin credentials.
From SSPI, add or update the following filters and tags in the fluent-bit and fluentd ConfigMaps. For each ConfigMap, the first command saves a backup of the original and the second opens it for editing:
k -n nsxi-platform get cm nsxi-platform-fluent-bit -o yaml > nsxi-platform-fluent-bit_original.yaml
k -n nsxi-platform edit cm nsxi-platform-fluent-bit
apiVersion: v1
data:
  fluent-bit.conf: |
    ...
    ...
    [FILTER]
        Name kubernetes
        Match kube.*
        Kube_Tag_Prefix kube.var.log.containers.
        Buffer_Size 15MB
        Merge_Log On
        Merge_Log_Key log_processed
        Annotations Off
        Labels Off
    # ------------------------------------------ Add rules in following rewrite_tag filter ---------------------------------------------------
    [FILTER]
        Name rewrite_tag
        Match kube.*
        Rule $log ^(?=.*audit="true")(?=.*MetricsMgr)(?=.*status=2\d{2}).*$ kube_skip_syslog false
        Rule $log ^(?=.*audit="true")(?=.*method=GET)(?=.*status=2\d{2}).*$ kube_skip_syslog false
        Rule $log ^(?=.*file\s\/admin\/ok\sdoes\snot\sexist).*$ kube_skip_syslog false
        Rule $log ^(?=.*audit="true").*$ audit_logs true
    [OUTPUT]
        Name forward
        Match *
        Host fluentd-aggregator
        Port 24224
        Retry_Limit False
        tls On
        tls.verify Off
        tls.ca_file /fluent-bit/ssl/fluent-bit-ca.crt
        tls.crt_file /fluent-bit/ssl/fluent-bit-tls.crt
        tls.key_file /fluent-bit/ssl/fluent-bit-tls.key
  parsers.conf: |
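To illustrate how the rewrite_tag rules above classify a record, the following is a minimal Python sketch that applies the same four regular expressions in order and stops at the first match, which is the rule-evaluation behavior assumed here. Python's re module only stands in for the regex engine used by fluent-bit, and the sample log lines are hypothetical: they merely contain the fields the rules look for and are not real SSP output.

```python
import re

# (regex, new tag, keep flag) copied from the Rule lines in the rewrite_tag filter.
RULES = [
    (r'^(?=.*audit="true")(?=.*MetricsMgr)(?=.*status=2\d{2}).*$', "kube_skip_syslog", False),
    (r'^(?=.*audit="true")(?=.*method=GET)(?=.*status=2\d{2}).*$', "kube_skip_syslog", False),
    (r'^(?=.*file\s\/admin\/ok\sdoes\snot\sexist).*$', "kube_skip_syslog", False),
    (r'^(?=.*audit="true").*$', "audit_logs", True),
]
COMPILED = [(re.compile(regex), tag, keep) for regex, tag, keep in RULES]


def classify(log_line):
    """Return (new_tag, keep) for the first rule that matches, or None.

    Assumption: rules are tried top to bottom and the first match wins.
    """
    for pattern, tag, keep in COMPILED:
        if pattern.search(log_line):
            return tag, keep
    return None


# Hypothetical sample lines, for illustration only.
SAMPLES = [
    'audit="true" comp="envoy" method=POST path=/MetricsMgr/report status=200',
    'audit="true" comp="envoy" method=GET path=/api/v1/sites status=200',
    'audit="true" comp="envoy" method=GET path=/api/v1/sites status=404',
    'file /admin/ok does not exist',
    'plain application message',
]

if __name__ == "__main__":
    for line in SAMPLES:
        print(classify(line), "<-", line)
```

In this sketch, successful MetricsMgr and GET entries are retagged kube_skip_syslog, a failed GET request falls through to the last rule and is retagged audit_logs, and a line that matches no rule keeps its original tag.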
k -n nsxi-platform get cm fluentd-aggregator-cm -o yaml > fluentd-aggregator-cm_original.yaml
k -n nsxi-platform edit cm fluentd-aggregator-cm
apiVersion: v1
data:
  fluentd-inputs.conf: |
    ...
    ...
  fluentd-output.conf: |
    ...
    ...
    # Throw the recommendation-clean-up-cronjob logs
    <match **recommendation-clean-up-cronjob**>
      @type null
    </match>
    # ------------------------------------------ Add following match block ---------------------------------------------------
    # MetricsMgr and GET envoy logs with status=2xx - present in pod log only, not in audit_log.log or syslog server
    <match kube_skip_syslog>
      @type file
      path /opt/bitnami/fluentd/logs/buffers/${$.kubernetes.namespace_name}/${$.kubernetes.host}/${$.kubernetes.pod_name}
      append true
      <format>
        @type single_value
        message_key log
      </format>
      <buffer $.kubernetes.host,$.kubernetes.pod_name,$.kubernetes.namespace_name>
        @type file
        path /opt/bitnami/fluentd/logs/buffers/buffers-temp/general_buffer_skip_syslog
        flush_thread_count 8
        flush_mode interval
        flush_interval 5s
        timekey 1d
        timekey_wait 10m
        chunk_limit_size 16MB
        total_limit_size 512MB
        overflow_action drop_oldest_chunk
      </buffer>
    </match>
    ...
    ...
2. Restart Fluentd: k -n nsxi-platform rollout restart statefulset fluentd
Verify the pod status by checking that fluentd-0 is up and running after the restart:
k -n nsxi-platform get pods | grep fluentd-0
3. Restart Fluent-bit: k -n nsxi-platform rollout restart daemonset nsxi-platform-fluent-bit
Verify the pod status by checking that all nsxi-platform-fluent-bit pods are up and running after the restart:
k -n nsxi-platform get pods | grep nsxi-platform-fluent-bit
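The pod status checks in steps 2 and 3 can also be scripted. The following is a minimal Python sketch that scans the text output of the get pods command and lists pods that are not yet fully ready. It assumes the default column layout (NAME, READY, STATUS, ...); the sample output and pod name suffixes are hypothetical.

```python
def pods_not_ready(get_pods_output, name_prefix):
    """Return names of pods starting with name_prefix that are not fully ready.

    A pod counts as ready when STATUS is Running and the READY column
    shows all containers ready (for example 1/1).
    """
    pending = []
    for line in get_pods_output.splitlines():
        fields = line.split()
        # Skip the header, blank lines, and pods of other workloads.
        if len(fields) < 3 or not fields[0].startswith(name_prefix):
            continue
        name, ready, status = fields[:3]
        ready_count, _, total = ready.partition("/")
        if status != "Running" or ready_count != total:
            pending.append(name)
    return pending


# Hypothetical output, for illustration only.
SAMPLE_OUTPUT = """\
NAME                             READY   STATUS              RESTARTS   AGE
fluentd-0                        1/1     Running             0          2m
nsxi-platform-fluent-bit-abcde   1/1     Running             0          1m
nsxi-platform-fluent-bit-fghij   0/1     ContainerCreating   0          10s
"""

if __name__ == "__main__":
    print(pods_not_ready(SAMPLE_OUTPUT, "fluentd-0"))
    print(pods_not_ready(SAMPLE_OUTPUT, "nsxi-platform-fluent-bit"))
```

An empty list for both prefixes means the restarts have completed and validation can proceed.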
4. Validate Logs on Remote Syslog Server
Once both fluentd and fluent-bit pods are healthy, check the logs on the remote syslog server.
Confirm that logs are being forwarded correctly.
Specifically, verify that envoy MetricsMgr logs and GET logs with a 2xx (success) status are no longer forwarded.
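One way to perform this check is to scan a capture of the messages received by the syslog server after the restart for entries that should now be filtered out. The following is a minimal Python sketch under that assumption. It reuses the three kube_skip_syslog expressions from the rewrite_tag filter; the sample lines and the file path mentioned in the comment are hypothetical.

```python
import re

# The three kube_skip_syslog expressions from the rewrite_tag filter.
SKIP_PATTERNS = [
    re.compile(r'^(?=.*audit="true")(?=.*MetricsMgr)(?=.*status=2\d{2}).*$'),
    re.compile(r'^(?=.*audit="true")(?=.*method=GET)(?=.*status=2\d{2}).*$'),
    re.compile(r'^(?=.*file\s\/admin\/ok\sdoes\snot\sexist).*$'),
]


def find_leaked_lines(lines):
    """Return the lines that match a skip rule and so should not have arrived."""
    leaked = []
    for raw in lines:
        line = raw.rstrip("\n")
        if any(pattern.search(line) for pattern in SKIP_PATTERNS):
            leaked.append(line)
    return leaked


# Hypothetical capture taken after the restart, for illustration only.
SAMPLE_CAPTURE = [
    'host1 envoy audit="true" method=POST path=/api/v1/sites status=201\n',
    'host1 envoy audit="true" method=GET path=/api/v1/sites status=500\n',
    'host1 envoy audit="true" method=GET path=/MetricsMgr/stats status=200\n',
]

if __name__ == "__main__":
    # With a real capture, pass an open file instead, for example:
    #   with open("/path/to/syslog-capture.log") as fh:  # hypothetical path
    #       leaked = find_leaked_lines(fh)
    for line in find_leaked_lines(SAMPLE_CAPTURE):
        print("unexpected:", line)
```

An empty result for messages received after the restart indicates the filters are working. Failed GET requests and non-GET audit entries are still expected on the syslog server.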
This issue will be fixed in a future release.