The memory utilization of the controller can increase over time.
From the CLI, you can check what is consuming the memory with: top -o %MEM
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1577705 root 20 0 17.0g 14.2g 23632 S 0.0 30.1 27303:53 remote_task_man
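To confirm that the resident memory of remote_task_man keeps growing, you can sample it periodically. A minimal sketch (replace 1577705 with the PID reported by top on your controller; the output file path is only an example):
while true; do echo "$(date '+%F %T') $(ps -o rss= -p 1577705)"; sleep 3600; done >> /tmp/rtm_mem_growth.log
Each line of /tmp/rtm_mem_growth.log contains a timestamp and the RSS of the process in KB; a steadily increasing value over hours or days confirms the growth pattern.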
If remote_task_man (the remote task manager) is taking most of the memory, you can cross-check its logs and count how many entries they contain:
zgrep . /var/lib/avi/log/remote_task_manager* | wc -l
Typically, one node of the cluster will have most of the queries.
This can happen if clients are repeatedly making API calls to the controller for the following endpoint:
/api/version/se
You can check this under /var/log/nginx/portal.access.log* with the following command:
zgrep "/api/version/se" portal.access.log* | wc -l
This shows how many such queries have been made.
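To get a feel for the rate over time, you can also count the matches in each rotated log file separately (a minimal sketch, assuming each rotated portal.access.log file covers a separate time window):
for f in /var/log/nginx/portal.access.log*; do printf '%s: ' "$f"; zgrep -c "/api/version/se" "$f"; done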
To see which clients are actually making those queries, you can use:
zgrep "/api/version/se" portal.access.log* | more
You will see entries like this:
portal.access.log:<Client IP goes here> [cache:-] 127.0.0.1:6000 [-] - - [03/Oct/2024:05:36:30 +0000] [-] [-] "GET /api/version/se HTTP/1.1"
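To summarize which client IPs are responsible for most of the calls, you can aggregate the matches (a minimal sketch, assuming the client IP is the first space-separated field of each access-log line, as in the sample above, and that your zgrep accepts the usual grep -h option to suppress file-name prefixes):
zgrep -h "/api/version/se" /var/log/nginx/portal.access.log* | awk '{print $1}' | sort | uniq -c | sort -rn | head
This prints the busiest client IPs first, together with the number of /api/version/se requests each of them made.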
A GET on /api/version/se translates into three remote tasks sent through RTM (the remote task manager) to all SEs to fetch their version, patch_version, and fips_mode. A goroutine leak was found in the RTM server implementation, so as remote tasks are executed over time, the number of leaked goroutines keeps increasing.
Workaround: Stop the client from making those API calls to the Avi Controller, and then reboot the controller to release the consumed memory.
A proper fix will be delivered in 22.1.8 and 30.2.2 under this ID: AV-219698