VMware SD-WAN Bandwidth Measurement Modes, Best Practices, and Limitations.

Article ID: 326451

Products

VMware SD-WAN by VeloCloud

Issue/Introduction

This article describes how the VMware SD-WAN service measures bandwidth on a WAN link, followed by configuration guidance and the limitations of each measurement mode.


Environment

VMware SD-WAN by VeloCloud

Resolution

Once a WAN link is detected by the VMware SD-WAN Edge, it first establishes DMPO (Dynamic Multi-Path Optimization) tunnels with one or more VMware SD-WAN Gateways and performs a bandwidth test with the Primary Gateway. The bandwidth test is performed by sending a stream of bidirectional UDP traffic and measuring the received rate at each end.

In addition, if the Edge is deployed as a Spoke in a Hub/Spoke topology, the Edge will also establish tunnels with the Hub Edge and perform a bandwidth test if configured to do so.

Three modes of bandwidth measurement are available in VMware SD-WAN.

Slow Start Mode

In Slow Start mode, the Edge sends a smaller burst of UDP traffic followed by a larger burst of UDP traffic to the VMware SD-WAN Gateway. Based on the number of packets received by the Gateway, the Gateway calculates the WAN link's speed. In Slow Start mode, the Edge sends this traffic for a fixed duration of 5 seconds. In the first 3 seconds, the Edge sends the UDP traffic at a rate of 5000 packets per second, and for the remaining 2 seconds it sends the traffic at 20000 packets per second. The packet size of this UDP traffic matches the MTU size for that WAN link.

Slow Start mode is configured by default for wired links. The Edge first sends a steady stream of packets for a short period (in case the ISP throttles the beginning of a session), then ramps up to a 200 Mbps stream and measures how much is received.

Note: Because of the way Slow Start works, the max measurable rate is 200 Mbps in either direction. In Edge software Release 3.3.0+, if the Edge measures 175 Mbps or greater (in upload bandwidth) with Slow Start, the Edge will automatically switch to Burst Mode.
 
Slow Start ramps up gradually because some ISPs require packet rates to be increased slowly before they allow the full packet rate under the link SLA.
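The Slow Start traffic profile described above can be worked through numerically. The following is a minimal Python sketch of that arithmetic only, not the Edge or Gateway implementation. The function names are hypothetical, and the 1500-byte packet size is an illustrative assumption (the article only states that the packet size matches the link MTU).

```python
# Phases taken from the article: (duration in seconds, packets per second).
SLOW_START_PHASES = [(3, 5000), (2, 20000)]


def offered_rate_mbps(packets_per_second: int, packet_size_bytes: int) -> float:
    """Offered bit rate of a constant-rate UDP stream, in Mbps (10^6 bits/s)."""
    return packets_per_second * packet_size_bytes * 8 / 1_000_000


def total_packets(phases=SLOW_START_PHASES) -> int:
    """Total number of packets sent across all phases of the test."""
    return sum(duration * pps for duration, pps in phases)


def received_rate_mbps(packets_received: int, packet_size_bytes: int,
                       duration_s: float) -> float:
    """Rate a receiver would infer from the packets it counted in duration_s."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return packets_received * packet_size_bytes * 8 / duration_s / 1_000_000


if __name__ == "__main__":
    packet_size = 1500  # ASSUMPTION: illustrative MTU-sized packet
    for duration, pps in SLOW_START_PHASES:
        rate = offered_rate_mbps(pps, packet_size)
        print(f"{duration} s at {pps} pps -> {rate:.0f} Mbps offered")
    print("total packets:", total_packets())
```

The offered rate scales linearly with the packet size. For instance, 20000 packets per second of 1250-byte packets works out to 200 Mbps, so the exact figure for a given link depends on its MTU and on what is counted as packet size.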

Burst Mode

In Burst mode, the Edge sends the UDP packets to the Gateway as a single burst (a fixed, high number of packets sent at once). Based on the number of packets received, the Gateway calculates the speed. The round starts with 416 packets. If the Gateway's response indicates that the packets were received in a very short interval, the Edge restarts the round with 2000 packets. The packet size of this UDP traffic is the link MTU size.

Burst mode is configured by default for wireless links.  The Edge sends a burst of 6.25 MB to the Gateway and measures how much was received and how long it took. Based on the Gateway's response, the Edge will adjust the size to make the burst take 0.5 seconds and then send a second burst. The Edge adjusts again and sends a third burst. Based on how much of the third burst is received and how long it takes, the bandwidth is then set for that link.

Note: Burst Mode is effective at measuring a WAN link up to 900 Mbps in either direction. A WAN link with either an upload or download capacity greater than 900 Mbps should be manually configured using User Defined Mode.
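The three-burst sizing procedure described above can be sketched as follows. This is a simplified Python model under stated assumptions, not the product's algorithm: `send_burst` is a hypothetical callback returning the bytes received and the elapsed time, `make_ideal_link` is a hypothetical lossless link used only for the demonstration, and 1 MB is taken to mean 10^6 bytes.

```python
from typing import Callable, Tuple

TARGET_BURST_SECONDS = 0.5       # from the article
INITIAL_BURST_BYTES = 6_250_000  # 6.25 MB from the article; ASSUMES 1 MB = 10^6 bytes


def measured_rate_bps(bytes_received: int, elapsed_s: float) -> float:
    """Bit rate implied by how much arrived and how long it took."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return bytes_received * 8 / elapsed_s


def next_burst_bytes(bytes_received: int, elapsed_s: float,
                     target_s: float = TARGET_BURST_SECONDS) -> int:
    """Burst size that would take target_s to deliver at the measured rate."""
    return int(measured_rate_bps(bytes_received, elapsed_s) / 8 * target_s)


def estimate_bandwidth_mbps(send_burst: Callable[[int], Tuple[int, float]],
                            rounds: int = 3) -> float:
    """Send `rounds` bursts, resizing after each; report the last measured rate."""
    size = INITIAL_BURST_BYTES
    rate = 0.0
    for _ in range(rounds):
        received, elapsed = send_burst(size)
        rate = measured_rate_bps(received, elapsed)
        size = next_burst_bytes(received, elapsed)
    return rate / 1_000_000


def make_ideal_link(capacity_bps: float) -> Callable[[int], Tuple[int, float]]:
    """HYPOTHETICAL lossless link: delivers every byte at exactly capacity_bps."""
    def send_burst(size_bytes: int) -> Tuple[int, float]:
        return size_bytes, size_bytes * 8 / capacity_bps
    return send_burst


if __name__ == "__main__":
    print(estimate_bandwidth_mbps(make_ideal_link(100_000_000)))
```

On the idealized link, the initial 6.25 MB burst takes exactly 0.5 seconds at 100 Mbps, so no resizing is needed. On slower or faster links, the second and third bursts are scaled toward the 0.5 second target.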

User Defined Mode (Define Manually)

In this mode, the user can configure the WAN link bandwidth manually in the Orchestrator UI. User Defined Mode is recommended for the following uses:

  • For WAN links with greater than 900 Mbps capacity (either upload or download).
  • For WAN links on Edges being used as Hubs. (This applies to hubs or any edge with a high number of tunnels.)
  • On private links like MPLS, it is recommended to configure the link with a user defined value because a private link has to perform a bandwidth measurement test with every other private link in the customer's network.
    • For example, consider a network with multiple private links whose private peer link bandwidth values are 5 Mbps, 1 Mbps, and 500 Kbps respectively. The private link would run a bandwidth test against each of those private peer links and may end up measuring at the lowest peer link value. In a network with a large number of private links, this is also undesirable because each bandwidth measurement consumes link resources.
  • If the bandwidth measurement is failing for that WAN link and no value is being registered for that link.
  • Some other user preference such as deliberately limiting how much of the link capacity is used by the Edge.
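The defaults and recommendations above can be summarized as a small decision helper. This is a Python sketch for planning purposes only: the `WanLink` fields and the `recommend_mode` function are hypothetical names, not part of any Orchestrator API, and the rule ordering is an assumption about how the recommendations combine.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WanLink:
    """HYPOTHETICAL description of a WAN link, used only for this sketch."""
    wireless: bool = False               # e.g. a USB modem
    private: bool = False                # e.g. MPLS
    hub: bool = False                    # Hub Edge, or Edge with many tunnels
    expected_mbps: Optional[float] = None  # larger of upload/download, if known
    measurement_failing: bool = False    # automatic test registers no value


def recommend_mode(link: WanLink) -> str:
    """Return the measurement mode suggested by the article's guidance."""
    # Cases where the article recommends a manually defined value.
    if link.measurement_failing or link.hub or link.private:
        return "User Defined"
    if link.expected_mbps is not None and link.expected_mbps > 900:
        return "User Defined"
    # Otherwise fall back to the per-link-type defaults.
    if link.wireless:
        return "Burst"
    # Wired default; per the article, Release 3.3.0+ switches to Burst
    # automatically when Slow Start measures 175 Mbps or more upload.
    return "Slow Start"


if __name__ == "__main__":
    print(recommend_mode(WanLink(private=True)))
    print(recommend_mode(WanLink(wireless=True)))
    print(recommend_mode(WanLink(expected_mbps=1000)))
```

A deliberate cap on how much link capacity the Edge uses is a user preference rather than a link property, so the sketch does not model it.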

The bandwidth measurement modes are configured through the VMware Orchestrator:

 Configure → select Edge → Device → WAN Settings → Edit → Advanced → Bandwidth Measurement


Important Notes and Limitations 

●USB modems are not compatible with the Slow Start mode of measurement. The recommended bandwidth measurement mode for USB modems is Burst Mode, and for wired WAN links it is Slow Start. Each is configured by default for its link type.
 
●The Dynamic Bandwidth adjustment is recommended on links where available bandwidth can vary over time (especially wireless links). This setting will track WAN congestion and packet loss and adjust bandwidth down and up as needed. To avoid inducing congestion, bandwidth will never be adjusted to be higher than the originally measured value.
 
●Bandwidth is only measured to the local Gateway path unless the Edge is also a Spoke Edge in a Hub/Spoke topology. In that case bandwidth is also measured between the Spoke Edge and the Hub Edge.

●In a Hub/Spoke topology where the Hub Edge and a connected Spoke Edge have different bandwidth measurement modes configured (for example, the Hub Edge WAN link is configured with a user defined mode but the Spoke Edge's WAN link is configured with either Slow Start or Burst mode), a link measurement will be performed. However, VMware SD-WAN will honor the user defined value if the measured value is greater than the user defined value. This explains why a customer can observe bandwidth measurement events on a Hub Edge even though the Hub Edge's WAN links are configured to not measure bandwidth with a user defined mode.
 
●When the path to the local Gateway is being measured the rest of the paths are in WAITING_FOR_LINK_BW. Once the measurement to the local Gateway path is done, the rest of the paths update their values and exchange it with their peer. This is also true when the Hub Edge is being measured by a Spoke Edge in a Hub/Spoke topology.
 
●Wireless links always default to the Burst Mode of measurement.
 
● For wired links, the cache is updated only on a successful measurement, and the cached value is valid for 7 days. Bandwidth is measured only when a tunnel flaps or comes up and either there is no cached value or the last measurement is more than 7 days old. Wireless links behave similarly, but the cache only needs to be older than 24 hours, and a tunnel flap is still required to trigger a remeasurement.
 
●If the automatic bandwidth measurement fails for any reason, a user can trigger a bandwidth measurement manually from the Orchestrator UI by navigating to Test & Troubleshoot → WAN Link Bandwidth Test.
 

● If an automatic bandwidth measurement returns less than 90% of the originally measured (cached) value, the bandwidth is not updated. For example, if a 1 Gbps link is downgraded to 500 Mbps, the bandwidth measurement continues to report the old value of 1 Gbps. To work around this, engage the VMware support team to delete the cached bandwidth measurement; a new "WAN Link Bandwidth Test" can then be run from Remote Diagnostics.
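The cache-age and 90% rules in the notes above can be expressed as two small checks. This Python sketch is one reading of those notes, not the Edge's actual logic: the function names are hypothetical, and it assumes that a tunnel flap or tunnel-up event is always required before an automatic remeasurement.

```python
from datetime import timedelta
from typing import Optional

# Cache lifetimes taken from the article.
CACHE_MAX_AGE = {
    "wired": timedelta(days=7),
    "wireless": timedelta(hours=24),
}


def should_remeasure(link_type: str, tunnel_event: bool,
                     cache_age: Optional[timedelta]) -> bool:
    """Decide whether an automatic measurement would run.

    cache_age is None when no cached value exists.
    ASSUMPTION: a tunnel flap / tunnel-up event is always required.
    """
    if not tunnel_event:
        return False
    if cache_age is None:
        return True
    return cache_age > CACHE_MAX_AGE[link_type]


def value_after_measurement(cached_mbps: Optional[float],
                            measured_mbps: float) -> float:
    """Apply the 90% rule: a much lower automatic result keeps the cached value."""
    if cached_mbps is not None and measured_mbps < 0.9 * cached_mbps:
        return cached_mbps
    return measured_mbps


if __name__ == "__main__":
    # A 1 Gbps link downgraded to 500 Mbps keeps reporting the cached 1 Gbps.
    print(value_after_measurement(1000, 500))
    print(should_remeasure("wired", True, timedelta(days=8)))
```

The second function illustrates why a downgraded link needs the cache cleared: as long as a cached value exists, an automatic result below 90% of it is discarded.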