vDefend Distributed IDPS Performance Recommendations

Products

VMware Cloud on AWS, VMware vDefend Firewall with Advanced Threat Prevention, VMware vDefend Firewall, VMware NSX Firewall

Issue/Introduction

This article focuses on the Distributed Intrusion Detection and Prevention (D-IDPS) component of the vDefend Security Platform. It provides recommended practices for deploying D-IDPS in a scalable and performant manner.

Environment

NSX-T

Resolution

What is IDPS?

At its core, an IDS matches traffic against patterns (signatures) of known attacks. vDefend IDPS inspects traffic that a Distributed Firewall policy explicitly allows IN or OUT and that an IDPS policy then sends to the IDPS engine. vDefend Distributed IDPS is a control point situated logically at every virtual NIC (vNIC) of every Virtual Machine (VM).

Why is it essential to tune IDPS?

A vital aspect of any successful IDPS deployment is ensuring that the appropriate Profile and Policy align with the workload’s security requirements. As with any Security Control, a finite capacity is available for this traffic inspection. Care should be taken to ensure that the traffic sent to the IDPS engine is meaningful and will increase the overall security posture. Tuning should also assist in reducing the number of false positives in the environment.

The vDefend IDPS uses the following architecture:

Resource Requirements

Before enabling D-IDPS on a cluster, evaluate the cluster’s resource utilization using metrics such as Host CPU usage and CPU RDY% time for virtual workloads. Ensuring enough resources to support the IDPS Engine is a crucial first step to ensuring success. Enabling IDPS on a host will consume 2 GB of Host Memory and up to 5 CPU cores for inspecting the traffic. There is also an upper limit of approximately 150K PPS on the dvFilter channel per ESXi host that will limit the amount of traffic that can be effectively sent to the IDPS Engine.
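The figures above can be turned into a quick pre-check before enabling D-IDPS. The following is a minimal sketch, not a vDefend tool: the `HostHeadroom` type and its fields are hypothetical, and the values would come from your own monitoring. Only the three constants are taken from this article.

```python
from dataclasses import dataclass

# Figures quoted in this article; treat them as planning estimates.
IDPS_MEMORY_GB = 2            # host memory consumed when IDPS is enabled
IDPS_MAX_CPU_CORES = 5        # upper bound of CPU cores used for inspection
DVFILTER_PPS_LIMIT = 150_000  # approximate dvFilter channel limit per ESXi host


@dataclass
class HostHeadroom:
    """Spare capacity on one ESXi host (hypothetical inputs from monitoring)."""
    free_memory_gb: float
    free_cpu_cores: float
    expected_idps_pps: int  # packets per second you plan to send to IDPS


def idps_readiness(host: HostHeadroom) -> list:
    """Return the resources that look constrained; an empty list means none."""
    concerns = []
    if host.free_memory_gb < IDPS_MEMORY_GB:
        concerns.append("memory")
    if host.free_cpu_cores < IDPS_MAX_CPU_CORES:
        concerns.append("cpu")
    if host.expected_idps_pps > DVFILTER_PPS_LIMIT:
        concerns.append("network")
    return concerns


roomy_host = HostHeadroom(free_memory_gb=8, free_cpu_cores=6, expected_idps_pps=100_000)
tight_host = HostHeadroom(free_memory_gb=1, free_cpu_cores=2, expected_idps_pps=200_000)
print(idps_readiness(roomy_host))
print(idps_readiness(tight_host))
```

Because the CPU figure is an upper bound ("up to 5 cores"), this check is deliberately conservative.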

Several alerts and alarms are designed to raise awareness of resource usage on a per-host (ESXi) basis:

Alarm	Trigger	Impact
IDPS Engine CPU Oversubscription High	75%	The overall CPU required by IDPS is becoming constrained, and it may have to start dropping or bypassing traffic.
IDPS Engine CPU Oversubscription Very High	90%	The overall CPU required by IDPS is becoming constrained, and it may have to start dropping or bypassing traffic.
NSX IDPS Engine Memory Usage High	75%	The overall memory required by IDPS is becoming constrained, and it may have to start dropping or bypassing traffic.
NSX IDPS Engine Memory Usage Medium High	85%	The overall memory required by IDPS is becoming constrained, and it may have to start dropping or bypassing traffic.
IDPS Engine Network Oversubscription High	75%	The dvFilter channel is becoming congested, and IDPS may have to start dropping or bypassing traffic.
IDPS Engine Network Oversubscription Very High	90%	The dvFilter channel is becoming congested, and IDPS may have to start dropping or bypassing traffic.
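To make the thresholds concrete, the sketch below maps a utilization percentage to the alarm you would expect to see. It assumes that the alarm names pair with the thresholds in the order listed in the table, and that an alarm triggers at or above its threshold. Both are assumptions for illustration. The function name `expected_alarm` is hypothetical and not part of any NSX API.

```python
# Thresholds (percent) from the alarm table above, highest first.
ALARM_THRESHOLDS = {
    "cpu": [
        (90, "IDPS Engine CPU Oversubscription Very High"),
        (75, "IDPS Engine CPU Oversubscription High"),
    ],
    "memory": [
        (85, "NSX IDPS Engine Memory Usage Medium High"),
        (75, "NSX IDPS Engine Memory Usage High"),
    ],
    "network": [
        (90, "IDPS Engine Network Oversubscription Very High"),
        (75, "IDPS Engine Network Oversubscription High"),
    ],
}


def expected_alarm(resource, utilization_pct):
    """Return the most severe alarm whose threshold is met, or None."""
    for threshold, alarm_name in ALARM_THRESHOLDS[resource]:
        # Assumption: the alarm triggers at or above the threshold.
        if utilization_pct >= threshold:
            return alarm_name
    return None


print(expected_alarm("cpu", 80))
print(expected_alarm("memory", 86))
print(expected_alarm("network", 50))
```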

Other alerts indicate that the system has entered Bypass Mode or Dropping Mode due to resource constraints. The default action is to bypass when possible (see the chart below that details the capabilities of each NSX release).

What are the general recommended practices for a scalable and performant D-IDPS implementation?

When building a D-IDPS policy, you should limit the attack surface using the Distributed Firewall (DFW). Once a base security posture has been established, D-IDPS can be used to enhance it further. Consider the specifically allowed traffic in the DFW policy and choose from this traffic what can and should be further inspected by the IDPS engine.

Below is an example of a DFW ruleset that limits traffic to the Production HR Application:

Next, the allowed traffic is sent to the IDPS engine:

This only sends the HTTP and MySQL Traffic to the IDPS engine for analysis.

Bypass of Traffic when Resource Contention Exists

Newer releases of the IDPS solution add capabilities that address resource contention. These capabilities determine what happens when a resource becomes overloaded. The resources are the CPU, memory, and the dvFilter channel (network capacity). The UI groups all three resources together, allowing users to choose Bypass or Drop. This selection is available at a Global level with a per-rule override. Please note that the default is Bypass.
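The precedence between the Global setting and a per-rule override can be sketched as follows. This is an illustration of the behavior described above, not NSX code. The function name and the string values "BYPASS" and "DROP" are hypothetical.

```python
def oversubscription_action(global_setting="BYPASS", rule_override=None):
    """Return the action taken for a rule when a resource is oversubscribed.

    A per-rule override, when present, takes precedence over the Global
    setting. The Global default is Bypass, as stated in this article.
    """
    if rule_override is not None:
        return rule_override
    return global_setting


print(oversubscription_action())                      # Global default
print(oversubscription_action(rule_override="DROP"))  # rule overrides Global
```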

The following chart shows the Bypass capabilities of each release:

Version	CPU	Memory	Network Capacity
3.2.x	No	No	No
4.0.x	Yes	No	No
4.1.x	Yes	Yes	No
4.2.0	Yes	Yes	Yes *

*The version of ESXi must also be at least 8.0u3 for the network bypass to work. This bypass works on a per-packet basis.
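The chart and its footnote can be expressed as a small lookup. This is a minimal sketch: the function name is hypothetical, and treating every 4.2 build like 4.2.0 is an assumption, since the chart only lists 4.2.0.

```python
# Bypass support per NSX release, transcribed from the chart above.
BYPASS_SUPPORT = {
    "3.2": {"cpu": False, "memory": False, "network": False},
    "4.0": {"cpu": True, "memory": False, "network": False},
    "4.1": {"cpu": True, "memory": True, "network": False},
    "4.2": {"cpu": True, "memory": True, "network": True},  # chart lists 4.2.0 only
}


def bypass_supported(nsx_version, resource, esxi_at_least_8_0u3=False):
    """Return True if bypass is available for the resource on this release."""
    major_minor = ".".join(nsx_version.split(".")[:2])
    supported = BYPASS_SUPPORT[major_minor][resource]
    if resource == "network":
        # Footnote: network bypass also requires ESXi 8.0u3 or later.
        return supported and esxi_at_least_8_0u3
    return supported


print(bypass_supported("4.1.2", "memory"))
print(bypass_supported("4.2.0", "network"))
print(bypass_supported("4.2.0", "network", esxi_at_least_8_0u3=True))
```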

If the Global setting is changed to Drop, there is an impact: in the event of oversubscription, the system drops packets destined for the IDPS engine. No log messages or errors report this condition.

Best Practices for IDPS Performance

1 Rules

When developing the IDPS rules or Policy, it is often best to start with a specific use case in mind. Avoid enabling IDPS on all traffic at the onset, as this could cause a resource starvation problem exhibited by increased Latency and higher rates of dropped packets.

- Start with the Security Requirements. Where is IDPS needed?
- First, implement segmentation to limit the attack surface with the DFW
- Exempt traffic that doesn’t need IDPS or cannot be inspected. This is done by not configuring the IDS Policy to apply to hosts or groups on specific port and protocol combinations. Examples of traffic not to send to the IDPS engine include:
  - Backup Traffic
  - Encrypted Flows (HTTPS, SSH)
  - DNS Traffic
  - N-S traffic that has already been inspected
- Use Applied-To and optionally SRC/DST to only apply IDPS to workloads or flows that need to be inspected.
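The selection logic in the list above can be sketched as a simple filter over the flows that the DFW already allows. The `AllowedFlow` type, its fields, and the sample flows (including the backup port) are hypothetical and only illustrate the decision. This is not the NSX policy schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AllowedFlow:
    """A flow already permitted by the DFW (hypothetical model)."""
    name: str
    protocol: str
    port: int
    encrypted: bool = False
    backup: bool = False
    inspected_upstream: bool = False  # e.g. N-S traffic inspected elsewhere


def should_inspect(flow):
    """Apply the exemptions from the list above."""
    if flow.backup or flow.encrypted or flow.inspected_upstream:
        return False
    if flow.port == 53:  # DNS traffic
        return False
    return True


flows = [
    AllowedFlow("http", "TCP", 80),
    AllowedFlow("mysql", "TCP", 3306),
    AllowedFlow("https", "TCP", 443, encrypted=True),
    AllowedFlow("ssh", "TCP", 22, encrypted=True),
    AllowedFlow("dns", "UDP", 53),
    AllowedFlow("backup", "TCP", 10000, backup=True),  # port is made up
]

to_inspect = [flow.name for flow in flows if should_inspect(flow)]
print(to_inspect)
```

With these sample flows, only the HTTP and MySQL traffic remains as a candidate for an IDPS rule.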

2 Directionality

If both the source and destination workloads are virtualized, the same flow can be scanned twice: once as it leaves the source and again when it arrives at the destination. If throughput is a concern, consider using a single direction (IN or OUT) to limit the traffic redirected to the IDPS engine. This is configured on a per-IDPS-rule basis.
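The double inspection can be illustrated with a small model. It assumes a single IDPS rule with one direction setting that is applied to whichever endpoints are protected. The function name and the direction strings are hypothetical.

```python
def inspection_points(rule_direction, src_protected, dst_protected):
    """Return the vNICs at which a flow is redirected to the IDPS engine.

    At the source vNIC the flow is outbound (OUT). At the destination
    vNIC the same flow is inbound (IN).
    """
    points = []
    if src_protected and rule_direction in ("OUT", "IN_OUT"):
        points.append("source vNIC")
    if dst_protected and rule_direction in ("IN", "IN_OUT"):
        points.append("destination vNIC")
    return points


# Both workloads virtualized and protected by the same rule.
print(inspection_points("IN_OUT", True, True))  # scanned at both ends
print(inspection_points("IN", True, True))      # scanned once, on arrival
```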

For Server workloads:

- IN direction can be sufficient if a proper segmentation policy has been applied to block outbound (external) flows and if all communication is inbound to workloads protected by D-IDPS

- To allow outbound (external) communication, IDPS in OUT or IN/OUT direction can be applied in a granular rule

For VDI workloads:

- OUT direction can be sufficient if a segmentation policy has been applied that blocks inbound service access to VDI desktops

Note: Use directionality with care. Directionality is stateful and effectively determines when a flow entry (for redirection to IDPS) is created.

3 Profiles

Profiles are a collection of IDPS signatures applied to an IDPS Policy or Rule. The best practices here are:

- Start with Detect-only mode and move to Detect and Prevent after initial deployment and tuning
- Start with a broad coverage profile and wide application
  - Critical and High Severity Signatures
- Optionally, over time, include only relevant signatures in a profile and apply the profile to specific workloads
  - Attack target: DNS Server, AD Server, Web Server
  - Affected product
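The progression from a broad profile to a targeted one can be sketched as a filter over signature metadata. The `Signature` type, its fields, and the sample IDs are hypothetical and do not reflect the actual signature format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    """Minimal signature metadata (hypothetical model)."""
    sig_id: int
    severity: str       # e.g. "CRITICAL", "HIGH", "MEDIUM"
    attack_target: str  # e.g. "Web Server", "DNS Server"


def build_profile(signatures, severities=("CRITICAL", "HIGH"), attack_targets=None):
    """Select signatures by severity and, optionally, by attack target."""
    selected = []
    for sig in signatures:
        if sig.severity not in severities:
            continue
        if attack_targets is not None and sig.attack_target not in attack_targets:
            continue
        selected.append(sig)
    return selected


catalog = [
    Signature(1001, "CRITICAL", "Web Server"),
    Signature(1002, "HIGH", "DNS Server"),
    Signature(1003, "MEDIUM", "Web Server"),
    Signature(1004, "HIGH", "AD Server"),
]

broad = build_profile(catalog)  # initial profile: Critical and High
web_only = build_profile(catalog, attack_targets={"Web Server"})  # after tuning
print([sig.sig_id for sig in broad])
print([sig.sig_id for sig in web_only])
```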

What alerts and alarms are available to highlight issues?

Depending on the release of vDefend that is running, several alerts and alarms are raised to highlight resource capacity issues. The following chart details, by NSX version, the associated alerting capabilities for Bypass or Drop conditions:

Version	CPU	Memory	Network Capacity
3.2.x	Yes	No	No
4.0.x	Yes	Yes	No
4.1.x	Yes	Yes	No (Alarm Only)
4.2.0	Yes	Yes	Yes

General recommendations for ideal IDS/IPS performance:

- Reduce the amount of traffic redirected to IDPS. This can be done by leveraging the DFW to restrict all but the required traffic. You can further restrict the traffic going to the IDPS engine by not including it in the IDS policy. Please note that traffic not inspected by the IDPS engine will still be allowed.
- Configure Directionality appropriately
- Consider using the Bypass functionality to ensure the inspection engine does not drop traffic