VMware Cloud Foundation NSX-T Edge Node Virtual Machine Resizing Tool

Article ID: 312196

Products

VMware Cloud Foundation

Issue/Introduction

Overview

This tool allows a user to resize all Edge node VMs in a single VCF-created Edge cluster. This may be helpful if an existing Edge cluster needs more resources but the user does not wish to expand it by adding more Edge nodes. It may also be used to ensure that all VMs share the same form factor if any Edge node VMs have been added manually (not recommended).

The tool is intended to resize only Edge node VMs and, if necessary, their containing resource pools. All other NSX Edge node and Edge cluster parameters should remain unchanged. However, if an Edge node has been moved out of and back into a resource pool, the tool will correct the Edge node's NSX compute_id during redeploy so that VCF Edge cluster expansion is not blocked. In special situations, the tool may also be used to redeploy Edge node VMs even if their form factor will not change.

Environment

VMware Cloud Foundation 4.2.1
VMware Cloud Foundation 4.3
VMware Cloud Foundation 4.4
VMware Cloud Foundation 4.5
VMware Cloud Foundation 4.5.1
VMware Cloud Foundation 5.0

Resolution

Caution

Edge node resizing is accomplished by deleting each Edge node VM in turn and recreating it using the desired form factor. Edge nodes are resized one at a time, but this still means that there will be a period of time during the resize where each Edge node's Tier-0 interfaces will be offline.

Likewise, if the Edge cluster has more Tier-1 routers than can fit with one Edge node VM offline, then some of those Tier-1s will also be unavailable until the resize is completed.

Please do not start any SDDC Manager UI workflows while running the resize utility, or make other changes directly via the NSX Manager or vCenter UIs.

Usage

The resizing tool provides three basic operations:

  • List an Edge cluster's current node form factors (--list-sizes).
  • Resize an Edge cluster's node VMs to use a specified form factor (--form-factor).
  • Roll back an Edge cluster's node VMs to their form factors from before the most recent resize attempt (--rollback).

The Edge node resizing tool is provided as a script. It is written to be run directly on the SDDC Manager VM. The resizing tool is delivered as a tarball which needs to be copied to a suitable location and unpacked: typically under the "vcf" user's home directory in the SDDC Manager.

In order to run, the Edge node resizer requires the single-signon credentials for the SDDC Manager. This is because the resizer uses a mix of public and private SDDC Manager APIs, and in particular uses secured VCF APIs to query the other credentials it requires (NSX-T and vCenter). The SDDC Manager credentials may be provided via command line parameters or environment variables. The tool will also prompt for the password if it is not otherwise supplied. To set credential environment variables, enter the following commands at the SDDC Manager VM shell session you are using to run the resizer:

export SDDC_SSO_USERNAME=your_SSO_username
export SDDC_SSO_PASSWORD=your_SSO_password

Setting the SDDC_SSO_PASSWORD variable this way is generally not recommended, however, because some SDDC Manager VM versions have command history enabled by default, which would leave the password in the shell history.
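
If the environment variable is still wanted, one way to keep the password out of the command history is to read it from the terminal instead of typing it as part of an export command. The following is a minimal sketch in POSIX shell and is not part of the resizer tool; the function name prompt_sso_password is hypothetical, and only the SDDC_SSO_PASSWORD variable name comes from the text above.

```shell
# Hypothetical helper: read the SSO password without echoing it and export
# it for the resizer. The password is never typed on a command line, so it
# does not end up in the shell history.
prompt_sso_password() {
    printf 'SSO password: ' >&2
    # Disable terminal echo only when stdin really is a terminal.
    if [ -t 0 ]; then stty -echo; fi
    IFS= read -r SDDC_SSO_PASSWORD
    if [ -t 0 ]; then stty echo; printf '\n' >&2; fi
    export SDDC_SSO_PASSWORD
}

# Usage, in the shell session that will run the resizer:
#   export SDDC_SSO_USERNAME=your_SSO_username
#   prompt_sso_password
```

Since the tool prompts for the password itself when none is supplied, exporting only SDDC_SSO_USERNAME and letting the tool prompt is the simpler option in most cases.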

The --dryrun command line option is also worth noting. When this option is given, the Edge node resizer runs as it otherwise would, but only simulates the update operations it would perform. This allows the user to safely verify their credentials and to see what the resizer would do with a given set of command-line options.
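
As an illustration, a dry run of a resize might be wrapped as below. This is a sketch which assumes the script was unpacked to resizer/resize.sh relative to the current directory; the RESIZER variable, the preview_resize function and the cluster name in the usage comment are placeholders, while the options are those listed in the tool's --help output.

```shell
# Path to the unpacked script (assumption: adjust to your unpack location).
RESIZER="${RESIZER:-resizer/resize.sh}"

# Preview a resize without applying it.
#   $1 = Edge cluster name
#   $2 = target form factor (SMALL, MEDIUM, LARGE or XLARGE)
preview_resize() {
    "$RESIZER" --edge-cluster "$1" --form-factor "$2" --dryrun --verbose
}

# Usage (hypothetical cluster name):
#   preview_resize my-edge-cluster LARGE
```

Note that, as described in the Rollback section, a --form-factor run normally updates the Edge cluster's rollback cache file even when --dryrun is given.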

Post-Checks

Although the script endeavors to leave the Edge cluster configuration much as it was found, aside from Edge node sizes, it is a good idea to also verify this after the tool is run:

  1. In NSX, check that the Edge nodes appear healthy. One way is via the NSX UI,
    System > Fabric > Nodes > Edge Transport Nodes.
  2. In vCenter, check that all vSphere clusters containing multiple resized Edge nodes still have anti-affinity rules for those VMs. In the vCenter UI inventory view, click on each such vSphere cluster, then under Configuration click on VM/Host Rules. You should find a "VCF-edge_<cluster name>..." anti-affinity rule. Click on it, and the details view should list all related Edge nodes from the <cluster name> which you are resizing.
  3. In vCenter, if the Edge cluster's nodes are in a vSAN stretched Edge cluster, please verify that the redeployed Edge node VMs are listed in the host-affinity group rules for their intended stretch AZ location.

Rollback

If the resizing tool detects that one of the Edge node VMs it is redeploying has failed to resume operation, the tool will automatically attempt to redeploy that Edge node using the form factor it had before the resize attempt.

Whether the second redeploy of the Edge node is successful or not, the resizing utility stops afterwards to avoid making any further changes to the system. At this point, the Edge cluster is likely to have Edge node VMs with different form factors.

It is important not to leave an Edge cluster with Edge nodes of differing form factors: traffic in an Edge cluster is distributed across Edge nodes without regard for their form factor, so the smaller Edge nodes might be overloaded.

If the issue which blocked redeploy of an Edge node at the desired form factor cannot be resolved, then the resizing tool also provides a --rollback option to roll back the sizes of all Edge node VMs to what they were when the Edge cluster resize attempt was initiated. In a typical VCF-created Edge cluster, this should leave all Edge node VMs with the same form factor.

The cache used to support rollback is kept in the:

~/.vcf-edge-redeploy

subdirectory of the user account which is running the resizing script. A separate JSON file is kept there for each Edge cluster which the resizer tool has been run against. An Edge cluster's cache file is normally updated each time a new resize operation is requested for that Edge cluster via the --form-factor option, either live or with the --dryrun option. A cache file is not updated if the previous resizer run (either resize or rollback) was incomplete, because the cache content needed for rollback might not otherwise be available after an incomplete resize or rollback attempt.
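
To see which Edge clusters currently have a rollback cache, the JSON files can simply be listed and printed. The sketch below assumes only what is stated above (one .json file per Edge cluster under ~/.vcf-edge-redeploy); the CACHE_DIR variable and the show_resize_cache function are hypothetical names, and nothing is assumed about the file names or the JSON layout.

```shell
# Directory holding the rollback cache (location taken from the text above).
CACHE_DIR="${CACHE_DIR:-$HOME/.vcf-edge-redeploy}"

# Print every cache file, pretty-printed when python3 is available.
show_resize_cache() {
    for f in "$CACHE_DIR"/*.json; do
        # If the glob matched nothing, $f is the literal pattern.
        [ -e "$f" ] || { echo "no cache files in $CACHE_DIR" >&2; return 1; }
        echo "== $f"
        if command -v python3 >/dev/null 2>&1; then
            python3 -m json.tool "$f"
        else
            cat "$f"
        fi
    done
}

# Usage:
#   show_resize_cache
```

The files are only read here; editing them by hand is not something the text above describes and is best avoided.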

Invocation

If accessing the SDDC Manager VM via ssh, to avoid ssh timeouts it is suggested to first add the following lines to your client's ~/.ssh/config:

Host a.b.c.d
     ServerAliveInterval 15
     ServerAliveCountMax 3

where a.b.c.d is the IP address of your SDDC Manager VM. If using a non-Linux client, please configure the equivalent settings so that your client sends periodic SSH keepalive messages to let the SDDC Manager know it is still present.
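
With an OpenSSH client, the same keepalive settings can also be passed on the command line for a single session instead of editing the config file. This is a small sketch; the ssh_sddc function name is hypothetical, and the vcf user name is assumed from the account mentioned earlier in this article.

```shell
# Open an SSH session with the keepalive settings shown above.
#   $@ = destination and any further ssh arguments, e.g. vcf@a.b.c.d
ssh_sddc() {
    ssh -o ServerAliveInterval=15 -o ServerAliveCountMax=3 "$@"
}

# Usage (a.b.c.d is the SDDC Manager VM's IP address):
#   ssh_sddc vcf@a.b.c.d
```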

Then, after copying and unpacking the supplied resizer tarball, run resizer/resize.sh.
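
Unpacking might look like the following sketch, run on the SDDC Manager VM after the tarball has been copied there. The unpack_resizer function is a hypothetical helper; it lists the archive contents after extraction so that the actual path of resize.sh inside the tarball can be confirmed rather than assumed.

```shell
# Unpack the resizer tarball and list what it contains.
#   $1 = path to the tarball
#   $2 = destination directory (default: the current user's home directory)
unpack_resizer() {
    tarball=$1
    dest=${2:-$HOME}
    mkdir -p "$dest" || return 1
    tar -xf "$tarball" -C "$dest" || return 1
    # Show the extracted paths so the location of resize.sh is visible.
    tar -tf "$tarball"
}

# Usage (tarball name as listed under Attachments):
#   unpack_resizer ~/edge_cluster_node_resize_0.8.tar
```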

You can display usage information like so:

resizer/resize.sh --help
VCF Edge node resizer tool, version 0.5
usage: resize.sh [-h] [--edge-cluster EDGE_CLUSTER] [--workload WORKLOAD]
                 [--edge-node EDGE_NODE] [--user USER] [--password PASSWORD]
                 [--form-factor {SMALL,MEDIUM,LARGE,XLARGE}] [--rollback]
                 [--force] [--list-sizes] [--dryrun] [--verbose]

Resize a VCF Edge cluster's Edge node VMs
optional arguments:
  -h, --help            show this help message and exit
  --edge-cluster EDGE_CLUSTER, -c EDGE_CLUSTER
                        Name of Edge cluster whose nodes are to be resized.
  --workload WORKLOAD, -w WORKLOAD
                        Name of VCF workload in which target Edge Cluster
                        resides. Not normally required.
  --edge-node EDGE_NODE, -n EDGE_NODE
                        Name of single Edge node VM to resize. If not given
                        then all nodes in the Edge cluster are resized.
  --user USER, -u USER  Name of single-signon admin user to authenticate as.
  --password PASSWORD, -p PASSWORD
                        Password for specified user.
  --form-factor {SMALL,MEDIUM,LARGE,XLARGE}, -f {SMALL,MEDIUM,LARGE,XLARGE}
                        Select form-factor to use for redeployed Edge nodes.
  --rollback            Restore Edge cluster node VMs to their original sizes
                        before resizing was attempted.
  --force               Redeploy Edge cluster node VM(s) even if their sizes
                        won't change.
  --list-sizes, -l      List form-factor sizes of existing Edge node VMs in
                        Edge cluster.
  --dryrun, -d          Compute and report but do not apply changes.
  --verbose, -v         Provide extra output detail on the command console.
                        The log file is always written with the verbose level
                        of detail.
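
Based on the options in the help text above, the three basic operations could be invoked roughly as follows. This is a sketch rather than vendor-supplied syntax: it assumes that --edge-cluster is combined with each operation, that the script lives at resizer/resize.sh, and the function names and the RESIZER variable are hypothetical.

```shell
# Path to the unpacked script (assumption: adjust to your unpack location).
RESIZER="${RESIZER:-resizer/resize.sh}"

# List the current form factors of an Edge cluster's node VMs.
list_sizes() {
    "$RESIZER" --edge-cluster "$1" --list-sizes
}

# Resize all node VMs of an Edge cluster to a form factor
# (SMALL, MEDIUM, LARGE or XLARGE).
resize_nodes() {
    "$RESIZER" --edge-cluster "$1" --form-factor "$2"
}

# Roll the node VMs back to their sizes from before the last resize attempt.
rollback_nodes() {
    "$RESIZER" --edge-cluster "$1" --rollback
}

# Usage (hypothetical cluster name):
#   list_sizes my-edge-cluster
#   resize_nodes my-edge-cluster LARGE
#   rollback_nodes my-edge-cluster
```

Credentials are taken from the SDDC_SSO_USERNAME and SDDC_SSO_PASSWORD variables or the --user and --password options, and the tool prompts for the password if none is supplied.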

Background

When VCF SDDC Manager creates or expands an NSX-T Edge cluster, it always does so using Edge node VMs. Each Edge node VM has a "form factor" setting which controls the amount of vSphere resources that VM can consume. VCF itself always uses a single, user-specified form factor for all Edge node VMs in a single Edge cluster. However, Edge node VMs which have been added manually might have a different form factor, which is not recommended for traffic distribution purposes. System requirements may also increase as more uses are found for an existing Edge cluster.

One possibility is to add more Edge node VMs: in VCF 4.3 and later, SDDC Manager provides automation for this.

Another possibility is to increase the size of existing Edge node VMs in an Edge cluster. This may be done while preserving most existing Edge node state by using an NSX API supplied for this purpose.

SDDC Manager does not itself provide automation for the resizing process, so this script fills the gap: in addition to invoking the resize operations for each Edge node, this script also takes care of increasing Edge node resource pools, and restoring the resized Edge node VMs to any anti-affinity and host-affinity rules which they were previously listed in.

Note that this script only increases resource pool sizes, and only when needed. It never shrinks resource pools, even if a user requests a resize to a smaller form factor, or to roll back a previous resize operation.

Operations the Resizer Performs

While the main aim of the resizer tool is to apply the NSX "redeploy" API to each Edge node in the user-selected Edge cluster, several other steps are needed in order to facilitate this. At a high level, the sequence of operations performed during a resize run is:

  1. Save state of the Edge cluster's Edge node VMs before making any changes. This includes their NSX transport node definitions, and any vSphere cluster anti-affinity and host-affinity rule memberships the Edge node VMs might have.
  2. If any child resource pools are used to hold Edge node VMs, the resource pool sizes are increased if needed in order to accommodate Edge node VMs with their new target form factor. The resizer only increases resource pool sizes when needed; it never decreases them. (VCF typically deploys Edges using child resource pools.)
  3. Perform the following steps for each Edge node VM in turn. Note that if an Edge node already has the desired form factor, then unless the --force option is given the Edge node is left as-is and the resizer moves on to the next Edge node. For each Edge node which does need resizing (or when --force is given):
    - Read the Edge node's account credentials from the SDDC Manager credentials store
    - Include the existing account credentials as part of the NSX redeploy request
    - Wait for the Edge node VM to return to normal operating condition, as seen by NSX Manager
    - Add the redeployed Edge node VM back into any anti-affinity and host-affinity rules it was previously a member of

As noted elsewhere, if any Edge node redeploy attempt fails, the resizer attempts to restore only that Edge node to its former size and then stops.
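
The control flow described above can be sketched in shell as follows. It only illustrates the ordering and the stop-on-failure behavior and is not the tool's actual code: save_state, grow_resource_pools, get_form_factor, redeploy_node and restore_rules are hypothetical stand-ins which merely print what a real step would do, every node is pretended to start as MEDIUM, and FAIL_NODE exists only to simulate a failed redeploy.

```shell
FAIL_NODE="${FAIL_NODE:-}"   # simulation only: node whose resize "fails"
FORCE="${FORCE:-0}"          # 1 mimics the --force option

# Hypothetical stand-ins for the real operations.
save_state()          { echo "save state of $# nodes"; }
grow_resource_pools() { echo "grow resource pools for $1 if needed"; }
get_form_factor()     { echo "MEDIUM"; }
restore_rules()       { echo "restore affinity rules for $1"; }
redeploy_node() {     # $1 = node, $2 = form factor
    echo "redeploy $1 as $2"
    # Simulate a failure only for the resize attempt on FAIL_NODE.
    [ "$1" = "$FAIL_NODE" ] && [ "$2" != "MEDIUM" ] && return 1
    return 0
}

# $1 = target form factor, remaining arguments = Edge node names
resize_cluster() {
    target=$1; shift
    save_state "$@"                  # step 1
    grow_resource_pools "$target"    # step 2
    for node in "$@"; do             # step 3, one node at a time
        current=$(get_form_factor "$node")
        if [ "$current" = "$target" ] && [ "$FORCE" != 1 ]; then
            echo "skip $node (already $target)"
            continue
        fi
        if redeploy_node "$node" "$target"; then
            restore_rules "$node"
        else
            # Restore only the failed node to its former size, then stop.
            redeploy_node "$node" "$current"
            return 1
        fi
    done
}

# Usage (hypothetical node names):
#   FAIL_NODE=edge02; resize_cluster LARGE edge01 edge02 edge03
```

In the simulated failure case, the first node is resized, the second is redeployed at its former size, and the third is never touched, which is the mixed-size state that the --rollback option is meant to clean up.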

Attachments

edge_cluster_node_resize_0.8.tar