vDefend SSP 5.0
The primary cause of this issue is that the volume of flows exceeds the system's processing and flow storage capacity. The SSP assumes that analyzed traffic flows repeat to some degree, which allows them to be aggregated. However, when the volume of flows is very high and many of them are unique, for example because of an unusually large number of unique IP addresses or ports, the system cannot compact the data efficiently. As a result, disk usage grows at an accelerated rate.
The alarm "Analytics and Data Storage Disk Usage Growing Faster Than Expected" is triggered when SSP estimates that the disk will be unable to store traffic flow data for the required 30-day retention period.
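The condition behind this alarm can be illustrated with a simple linear projection. The sketch below is only an assumption about the general idea (capacity divided by daily growth, compared against the 30-day retention period); it is not SSP's actual estimation logic, and the function names and sample numbers are hypothetical.

```python
RETENTION_DAYS = 30  # retention period stated in this article


def projected_retention_days(capacity_gb: float, daily_growth_gb: float) -> float:
    """Days of flow data the disk can hold at a constant daily growth rate."""
    if daily_growth_gb <= 0:
        return float("inf")  # no growth: the disk never fills
    return capacity_gb / daily_growth_gb


def would_trigger_alarm(capacity_gb: float, daily_growth_gb: float,
                        retention_days: int = RETENTION_DAYS) -> bool:
    """True if the disk cannot hold `retention_days` worth of flow data."""
    return projected_retention_days(capacity_gb, daily_growth_gb) < retention_days


if __name__ == "__main__":
    # Hypothetical numbers: 1000 GB usable, growing by 50 GB per day.
    print(projected_retention_days(1000, 50))
    print(would_trigger_alarm(1000, 50))
```

With these hypothetical numbers the disk holds about 20 days of data, which is below the 30-day requirement, so the check reports that the alarm condition is met.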
Recommended Workaround: Scale out Analytics and Data Storage services
Traffic flows are stored across both the Analytics and Data Storage services. The Analytics service requires a minimum of five nodes to scale out, whereas the Data Storage service requires a minimum of eight nodes.
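As a quick illustration of these minimums, the sketch below lists which of the two services could be scaled out for a given worker node count. It assumes the minimums refer to the total number of worker nodes in the cluster; the helper name is hypothetical and is not part of any SSP tool or API.

```python
from typing import List

# Minimum node counts stated in this article.
MIN_NODES_TO_SCALE_OUT = {
    "Analytics": 5,
    "Data Storage": 8,
}


def services_eligible_for_scale_out(worker_nodes: int) -> List[str]:
    """Return the services whose minimum node count is met."""
    return [
        name
        for name, minimum in MIN_NODES_TO_SCALE_OUT.items()
        if worker_nodes >= minimum
    ]


if __name__ == "__main__":
    for nodes in (4, 5, 8):
        print(nodes, services_eligible_for_scale_out(nodes))
```

Under this reading, a four-node cluster can scale out neither service, and eight nodes are needed before both can be scaled out.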
To determine the recommended number of worker nodes for the current traffic flow volume, use the SSP Sizing Tool. For detailed instructions on using this tool, refer to the relevant KB article: https://knowledge.broadcom.com/external/article/373793/security-intelligence-sizing-tool.html
Prerequisites:
Procedure
Note: The Scale Out action is supported only if you deployed the SSP using the Advanced form factor. The action is not supported for an Evaluation form factor deployment.
If all of the services are already scaled out, the Scale Out button is disabled in the pop-up dialog. This indicates that your cluster has reached the maximum number of nodes allocated. The Advanced form factor is initially deployed with four nodes. Ask your infrastructure administrator to add four more nodes to your current cluster before you continue with the next steps. To scale out all of the services, you must have a total of eight worker nodes in your cluster.
Unless specifically advised by the Broadcom support team, ensure that all of the core services are selected so that the system can decide which of them must be scaled out. Scaling out a single core service arbitrarily can consume more resources without improving system performance. Before proceeding with a single-category service scale out, consult the Broadcom support team or confirm that you clearly understand the consequences.
The UI displays the progress of the scale out operation.
For reference, review the "Scale Out" section (WIP) in the following guide: https://techdocs.broadcom.com/us/en/vmware-security-load-balancing/vdefend/security-services-platform/5-0.html
Other options:
Note: Try the primary workaround (scale out) before trying the following options.
If the recommended number of worker nodes exceeds the maximum supported limit, or if scaling out to the recommended size is not currently feasible, consider implementing the following options in the order presented.
Option 1: Change to Dynamic Flow Data Retention
These are the strategies to manage flow data retention if the flow storage has reached its limits:
Procedures:
You can view the data retention period and the number of existing flows in the SSP UI.
To view the data retention period, select System > Platform & Services > Metrics and scroll to Druid Average Retention Days.
To view the number of existing flows, select System > Platform & Services > Metrics and scroll to Total Flows and Unique Flows.
Option 2: Configure Data Collection in SSP
If you can identify the ESXi hosts and vSphere clusters that carry mostly East-West (EW) traffic, for example where over 90% of traffic is EW and under 10% is North-South (NS), you can enable data collection for those EW-heavy hosts and clusters first and then gradually enable it for the NS-heavy ones. North-South traffic tends to have more unique IP addresses, which makes it more likely to adversely affect data compaction.
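One way to find the East-West-heavy hosts is to compute the share of EW flows per host from an exported flow list. The sketch below is a minimal illustration under stated assumptions: a flow counts as East-West when both endpoints fall inside the RFC 1918 private ranges, the input is a hypothetical list of (host, source IP, destination IP) tuples, and the host names are made up. Your actual private IP range configuration and flow export format may differ.

```python
import ipaddress
from collections import defaultdict

# Assumption: RFC 1918 ranges stand in for the configured private IP ranges.
PRIVATE_NETS = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_internal(ip: str) -> bool:
    """True if the address falls inside one of the private ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in PRIVATE_NETS)


def east_west_share(flows):
    """Map each host to the fraction of its flows that are East-West."""
    totals = defaultdict(int)
    east_west = defaultdict(int)
    for host, src, dst in flows:
        totals[host] += 1
        if is_internal(src) and is_internal(dst):
            east_west[host] += 1
    return {host: east_west[host] / totals[host] for host in totals}


def hosts_to_enable_first(flows, threshold=0.9):
    """Hosts whose East-West share is above the threshold (over 90%)."""
    shares = east_west_share(flows)
    return sorted(h for h, share in shares.items() if share > threshold)


if __name__ == "__main__":
    sample = [("esx-01", "10.0.0.5", "10.0.1.9")] * 10
    sample += [("esx-02", "10.0.0.7", "10.0.0.8"),
               ("esx-02", "10.0.0.7", "8.8.8.8")]
    print(hosts_to_enable_first(sample))
```

In the sample, the hypothetical host esx-01 has only internal-to-internal flows and is selected, while esx-02 is split evenly between EW and NS and is left for later.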
Procedure:
By default, SSP collects network traffic data on all standalone hosts and clusters of hosts. If necessary, you can stop data collection from a standalone host or cluster of hosts.
For reference, review the "Configure SSP Settings" section (WIP) in the following guide: https://techdocs.broadcom.com/us/en/vmware-security-load-balancing/vdefend/security-services-platform/5-0.html
Option 3: Filter out broadcast and/or multicast flows
Note: This option can be used where broadcast and/or multicast flows are not required for security policy or similar guidance. If broadcast and/or multicast flows are important to you, do not enable this option.
You can prevent broadcast and/or multicast flows from being stored in SSP to reduce disk usage.
This affects only new flows that have not yet been processed by SSP. Existing broadcast/multicast flows remain visible until the 30-day retention period is reached.
To enable this option, contact Broadcom Support for assistance.
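To estimate how much this option might save before contacting support, you can count how many exported flows have a broadcast or multicast destination. The sketch below is illustrative only and does not reflect how SSP classifies flows internally. It assumes a hypothetical list of destination IP strings, and it detects only the limited broadcast address 255.255.255.255 and standard multicast ranges; subnet-directed broadcasts are not detected because they require knowing each subnet mask.

```python
import ipaddress

LIMITED_BROADCAST = ipaddress.IPv4Address("255.255.255.255")


def is_broadcast_or_multicast(dst_ip: str) -> bool:
    """True for multicast destinations or the limited broadcast address."""
    addr = ipaddress.ip_address(dst_ip)
    return addr.is_multicast or addr == LIMITED_BROADCAST


def filterable_fraction(destinations) -> float:
    """Fraction of flows whose destination is broadcast or multicast."""
    destinations = list(destinations)
    if not destinations:
        return 0.0
    hits = sum(1 for ip in destinations if is_broadcast_or_multicast(ip))
    return hits / len(destinations)


if __name__ == "__main__":
    # Hypothetical destinations taken from a flow export.
    sample = ["10.0.0.9", "239.1.1.1", "255.255.255.255", "192.168.1.20"]
    print(filterable_fraction(sample))
```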
Option 4: Enable External IP aggregation & Optimize configuration of Private IP Ranges
If you have a large volume of north-south traffic but do not need the details of individual external (public) IP addresses, you can reduce the amount of data sent to SSP by performing External IP aggregation at the host. This aggregates all external IP addresses to a single value: 255.255.255.255.
Note: Only external (public) IP addresses, meaning those outside the configured private IP ranges, are affected. See the section below on optimizing the configuration of private IP ranges.
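The effect of external IP aggregation, and the reason the private IP range configuration matters, can be sketched as follows. This is an illustration of the concept only, not the host-side implementation. It assumes RFC 1918 ranges as the default private ranges and uses hypothetical function names and sample flows; the aggregated value 255.255.255.255 is the one stated above.

```python
import ipaddress

AGGREGATED_EXTERNAL_IP = "255.255.255.255"  # value stated in this article

# Assumption: RFC 1918 ranges stand in for the configured private IP ranges.
DEFAULT_PRIVATE_NETS = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def aggregate_external(ip: str, private_nets=DEFAULT_PRIVATE_NETS) -> str:
    """Keep private addresses as-is; collapse all others to one value."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in private_nets):
        return ip
    return AGGREGATED_EXTERNAL_IP


def unique_flows_after_aggregation(flows, private_nets=DEFAULT_PRIVATE_NETS):
    """Set of unique (src, dst, port) tuples once external IPs are merged."""
    return {
        (aggregate_external(src, private_nets),
         aggregate_external(dst, private_nets),
         port)
        for src, dst, port in flows
    }


if __name__ == "__main__":
    sample = [
        ("10.0.0.5", "8.8.8.8", 443),
        ("10.0.0.5", "1.1.1.1", 443),
        ("10.0.0.5", "10.0.0.6", 22),
    ]
    print(len(sample), len(unique_flows_after_aggregation(sample)))
```

In the sample, the two flows to different public addresses collapse into one, so three flows become two unique flows. Passing a wider `private_nets` list keeps addresses in those ranges from being aggregated, which is why an accurate private IP range configuration determines which flows keep their detail.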
Impact:
This will affect how new external flows are stored and used in SSP.
To enable this option, contact Broadcom Support for assistance.