Log Analytics, logs-opensearch-master, fails to start and shows a CrashLoopBackOff error (exit code 78)

Article ID: 379075

Updated On:

Products

DX Operational Intelligence

DX Application Performance Management

Issue/Introduction

Log Analytics fails to start because the logs-opensearch-master pod keeps crashing and ends up in the CrashLoopBackOff state.

Describing the pod (kubectl describe pod <pod> -n dxi) shows that the container is restarted continuously until a back-off occurs:

Containers:
  logs-opensearch-master:
    Container ID:   containerd://[REDACTED]
    Image:          dxiregistry.[REDACTED]:5000/dxi/doi-loganalytics-opensearch:24.4.1.1
    Image ID:       dxiregistry.[REDACTED]:5000/dxi/doi-loganalytics-opensearch@sha[REDACTED]
    Ports:          9200/TCP, 9300/TCP
    Host Ports:     0/TCP, 0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    78
      Started:      Mon, 30 Sep 2024 16:45:07 +0200
      Finished:     Mon, 30 Sep 2024 16:45:20 +0200

...
Events:
  Type     Reason     Age                 From               Message
  ----     ------     ----                ----               -------
  Normal   Scheduled  16m                 default-scheduler  Successfully assigned dxi/logs-opensearch-master-0 to [REDACTED]
  Normal   Pulled     14m (x5 over 16m)   kubelet            Container image "dxiregistry.[REDACTED]:5000/dxi/doi-loganalytics-opensearch:24.4.1.1" already present on machine
  Normal   Created    14m (x5 over 16m)   kubelet            Created container logs-opensearch-master
  Normal   Started    14m (x5 over 16m)   kubelet            Started container logs-opensearch-master
  Warning  BackOff    69s (x63 over 15m)  kubelet            Back-off restarting failed container logs-opensearch-master in pod logs-opensearch-master-0_dxi([REDACTED])
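To pull only the last exit code out of the describe output, a small filter such as the one below can help. This is a minimal sketch: the function name last_exit_code is hypothetical, and it assumes the output has the same "Exit Code:" layout as shown above.

```shell
# Read `kubectl describe pod` output on stdin and print the first
# "Exit Code:" value found (the last terminated state of the container).
last_exit_code() {
  awk '/Exit Code:/ { print $3; exit }'
}

# Example (pod name and namespace taken from the output above):
#   kubectl describe pod logs-opensearch-master-0 -n dxi | last_exit_code
```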

Environment

DX Platform 24.1 onPrem

Nodes on RHEL 8.10

Cause

The container logs (kubectl logs <pod-name> -c <container-name> -n dxi) show that a bootstrap check fails, so the pod cannot start:

[2024-10-01T06:10:00,403][INFO ][o.o.n.Node               ] [logs-opensearch-master-0] initialized
[2024-10-01T06:10:00,404][INFO ][o.o.n.Node               ] [logs-opensearch-master-0] starting ...
[2024-10-01T06:10:00,522][INFO ][o.o.t.TransportService   ] [logs-opensearch-master-0] publish_address {192.168.xxx.xxx:9300}, bound_addresses {0.0.0.0:9300}
[2024-10-01T06:10:00,525][INFO ][o.o.t.TransportService   ] [logs-opensearch-master-0] Remote clusters initialized successfully.
[2024-10-01T06:10:01,023][INFO ][o.o.b.BootstrapChecks    ] [logs-opensearch-master-0] bound or publishing to a non-loopback address, enforcing bootstrap checks
ERROR: [1] bootstrap checks failed
[1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
ERROR: OpenSearch did not exit normally - check the logs at /opt/opensearch/logs/logs-opensearch-master-0/loganalytics1_es.log
[2024-10-01T06:10:01,038][INFO ][o.o.n.Node               ] [logs-opensearch-master-0] stopping ...
[2024-10-01T06:10:01,039][INFO ][o.o.s.a.r.AuditMessageRouter] [logs-opensearch-master-0] Closing AuditMessageRouter
[2024-10-01T06:10:01,039][INFO ][o.o.s.a.s.SinkProvider   ] [logs-opensearch-master-0] Closing DebugSink
[2024-10-01T06:10:01,052][INFO ][o.o.n.Node               ] [logs-opensearch-master-0] stopped
[2024-10-01T06:10:01,053][INFO ][o.o.n.Node               ] [logs-opensearch-master-0] closing ...
[2024-10-01T06:10:01,062][INFO ][o.o.s.a.i.AuditLogImpl   ] [logs-opensearch-master-0] Closing AuditLogImpl
[2024-10-01T06:10:01,075][INFO ][o.o.n.Node               ] [logs-opensearch-master-0] closed
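
The failing check can be easy to miss among the INFO lines. As a rough sketch, the filter below keeps only the numbered bootstrap-check failures; the function name bootstrap_failures is hypothetical, and the pattern assumes the failures are printed as "[n]: message" as in the log above.

```shell
# Read OpenSearch log text on stdin and print only the numbered
# bootstrap-check failure lines, e.g. "[1]: max virtual memory areas ...".
# Timestamped lines such as "[2024-10-01T...]" do not match the pattern.
bootstrap_failures() {
  grep -E '^\[[0-9]+\]: '
}

# Example (pod, container and namespace names taken from this article):
#   kubectl logs logs-opensearch-master-0 -c logs-opensearch-master -n dxi | bootstrap_failures
```

If the container has already been restarted, adding --previous to kubectl logs returns the log of the previous, failed instance.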

Resolution


Increase the maximum number of virtual memory areas a process may have (vm.max_map_count) to at least 262144 on the node by following these steps:


- Check the current value on the node:

   cat /proc/sys/vm/max_map_count
 
- If the value is lower than 262144, add the following line to /etc/sysctl.conf so that the setting persists across reboots:

   vm.max_map_count=262144
 
- Run the following command to apply the changes without restarting the node:

   sysctl -q -w vm.max_map_count=262144
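
The three steps above could be combined into a single script along the following lines. This is only a sketch under stated assumptions: the function names (is_too_low, persist_setting, ensure_max_map_count) are hypothetical, the script must run as root on the node, and it relies on GNU sed (sed -i), which is what RHEL provides.

```shell
#!/bin/sh
REQUIRED=262144   # minimum requested by the OpenSearch bootstrap check

# Succeeds (exit status 0) when the current value ($1) is below the required one ($2).
is_too_low() {
  [ "$1" -lt "$2" ]
}

# Make the setting persistent in the given sysctl file ($1) with value ($2):
# replace an existing vm.max_map_count line, or append one if none exists.
persist_setting() {
  if grep -q '^vm\.max_map_count' "$1"; then
    sed -i "s/^vm\.max_map_count.*/vm.max_map_count=$2/" "$1"
  else
    echo "vm.max_map_count=$2" >> "$1"
  fi
}

# Check the running value and, only if it is too low, persist and apply the change.
ensure_max_map_count() {
  current=$(cat /proc/sys/vm/max_map_count)
  if is_too_low "$current" "$REQUIRED"; then
    persist_setting /etc/sysctl.conf "$REQUIRED"
    sysctl -q -w vm.max_map_count="$REQUIRED"
  fi
  cat /proc/sys/vm/max_map_count   # print the value now in effect
}

# To run it on a node (as root):
#   ensure_max_map_count
```

The script leaves a node untouched when its value is already 262144 or higher, so it is safe to run more than once.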




 

Additional Information

Repeat the same procedure on every node where the Log Analytics pods may be scheduled.
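
To find out which nodes still need the change, the values can be collected and filtered as sketched below. The function name nodes_below_minimum, the node names in the example, and the availability of SSH access to the nodes are all assumptions made for illustration.

```shell
# Read "node value" pairs on stdin (one per line) and print the names of
# the nodes whose vm.max_map_count is below the minimum given as $1.
nodes_below_minimum() {
  awk -v min="$1" '$2 + 0 < min + 0 { print $1 }'
}

# Example (hypothetical node names, assumes SSH access to each node):
#   for n in node1 node2 node3; do
#     printf '%s %s\n' "$n" "$(ssh "$n" cat /proc/sys/vm/max_map_count)"
#   done | nodes_below_minimum 262144
```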

More details about the max_map_count:

https://access.redhat.com/solutions/99913