This Knowledge Base (KB) article describes a known memory leak in the nats version 2 (v2) service running on the "nats" instance group in Tanzu Application Service for VMs (TAS).
Starting in nats release v46, the underlying nats-server package was updated from v1 to v2. This change was introduced in the following TAS versions:
For more information on how to confirm that your nats service is running on v2, please see this KB article.
The following patterns can be observed when this nats memory leak is present:
BOSH metrics
One node shows higher memory consumption than the others and never releases the memory.
$ bosh -d cf-76cf11c200725fcf4b1e vms --vitals --column={Instance,"Memory Usage"} | grep nats
nats/2e0f7951-9990-418a-8520-bed481b3e10f    22% (444 MB)
nats/565fb9e7-52de-49f4-958e-a8fa3ba6594e    36% (728 MB)
nats/869453b5-fe28-4c65-b7c6-510925e3569b    21% (416 MB)
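When comparing nodes repeatedly over time, it can help to pick out the heaviest nats instance automatically. The following is a minimal sketch, not part of the product tooling: the function name highest_nats_memory is hypothetical, and it assumes each row looks like the output above, with the memory figure reported as "(NNN MB)". Rows reported in other units would be skipped.

```shell
#!/usr/bin/env bash
# highest_nats_memory (hypothetical helper): reads rows such as
#   nats/<guid>    36% (728 MB)
# on stdin and prints the nats instance with the largest MB figure.
# Assumption: memory is reported in MB, as in the sample output.
highest_nats_memory() {
  awk '
    $1 ~ /^nats\// {
      # Extract the number from "(728 MB)": skip "(" and drop " MB)".
      if (match($0, /\([0-9]+ MB\)/)) {
        mb = substr($0, RSTART + 1, RLENGTH - 5) + 0
        if (mb > max) { max = mb; name = $1 }
      }
    }
    END { if (name != "") printf "%s %d MB\n", name, max }
  '
}
```

It could be used by piping the bosh command shown above into it, for example: bosh -d cf-76cf11c200725fcf4b1e vms --vitals --column={Instance,"Memory Usage"} | highest_nats_memory. Running it at intervals and noting whether the same instance keeps growing is one way to observe the "never releases the memory" pattern.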
OS metrics
OS-level commands such as htop show that the memory is consumed by the nats-wrapper jobs.
Monit summary
Monit reports the nats-wrapper jobs as failing when their memory usage exceeds the healthy threshold.
nats/565fb9e7-52de-49f4-958e-a8fa3ba6594e:~$ sudo monit summary
The Monit daemon 5.2.5 uptime: 6d 4h 21m
Process 'nats-wrapper'                  Resource limit matched
Process 'nats-tls-wrapper'              Resource limit matched
Process 'nats-tls-healthcheck'          running
Process 'loggregator_agent'             running
Process 'loggr-syslog-agent'            running
Process 'metrics-discovery-registrar'   running
Process 'metrics-agent'                 running
Process 'loggr-forwarder-agent'         running
Process 'prom_scraper'                  running
Process 'bosh-dns'                      running
Process 'bosh-dns-resolvconf'           running
Process 'bosh-dns-healthcheck'          running
Process 'system-metrics-agent'          running
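To spot the affected jobs quickly in a long summary, the output can be filtered down to the processes that are not in the "running" state. This is a rough sketch under the assumption that each process line has the form Process 'name' status, as in the output above; the function name monit_unhealthy is hypothetical.

```shell
#!/usr/bin/env bash
# monit_unhealthy (hypothetical helper): reads `monit summary` output on
# stdin and prints "name: status" for every process not reported as running.
monit_unhealthy() {
  awk -F"'" '
    /^Process / {
      status = $3
      # Trim the padding around the status column.
      gsub(/^[ \t]+|[ \t]+$/, "", status)
      if (status != "running") print $2 ": " status
    }
  '
}
```

For example, sudo monit summary | monit_unhealthy on the node above would be expected to list nats-wrapper and nats-tls-wrapper, each with the status "Resource limit matched".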