Overview
This tool allows a user to resize all Edge node VMs in a single VCF-created Edge cluster. This may be helpful if the user finds that they could use more resources in an existing Edge cluster, but they don't wish to expand the Edge cluster by adding more Edge nodes. This may also be used to ensure that all VMs share the same form factor, if any Edge node VMs have been manually added (not recommended).
The intent of this tool is that it only resizes Edge node VMs and, if necessary, their containing resource pools. All other NSX Edge node and Edge cluster parameters should remain unchanged. However, if an Edge node has been moved out of and back into a resource pool, this tool will correct the Edge node's NSX compute_id during redeploy, so that VCF Edge cluster expansion is not blocked. For special situations, the tool may also be used to redeploy Edge node VM(s) even if their form factor will not be changed.
Caution
Edge node resizing is accomplished by deleting each Edge node VM in turn and recreating it using the desired form factor. Edge nodes are resized one at a time, but each Edge node's Tier-0 interfaces will still be offline for a period of time while that node is being redeployed.
Likewise, if the Edge cluster has more Tier-1 routers than can fit with one Edge node VM offline, then some of those Tier-1s will also be unavailable until the resize is completed.
While the resize utility is running, do not start any SDDC Manager UI workflows or make other changes directly via the NSX Manager or vCenter UIs.
Usage
The resizing tool provides three basic operations:
The Edge node resizing tool is provided as a script. It is written to be run directly on the SDDC Manager VM. The resizing tool is delivered as a tarball which needs to be copied to a suitable location and unpacked: typically under the "vcf" user's home directory in the SDDC Manager.
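The unpacking step can be sketched as a small bash helper. This is only an illustration: the function name unpack_resizer and the tarball file name in the usage comment are placeholders, not names shipped with the tool, and it assumes GNU tar, which detects the archive's compression automatically.

```shell
# Sketch only: the function name and arguments are illustrative placeholders.
unpack_resizer() {
    local tarball="$1" dest="${2:-$HOME}"
    # List the archive contents first so you can see what will be written.
    tar -tf "$tarball" || return 1
    # Extract into the destination directory (defaults to the home directory).
    tar -xf "$tarball" -C "$dest"
}

# Example, run on the SDDC Manager VM as the "vcf" user
# (the tarball name here is hypothetical):
#   unpack_resizer ~/resizer-tarball.tar.gz ~
```

Listing the contents before extracting is a precaution rather than a requirement of the tool.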
To run, the Edge node resizer requires the single sign-on (SSO) credentials for the SDDC Manager. This is because the resizer uses a mix of public and private SDDC Manager APIs, and in particular uses secured VCF APIs to query the other credentials it requires (NSX-T and vCenter). The SDDC Manager credentials may be provided via command-line parameters or environment variables. The tool will also prompt for the password if it is not otherwise supplied. To set the credential environment variables, enter the following commands in the SDDC Manager VM shell session you are using to run the resizer:
export SDDC_SSO_USERNAME=your_SSO_username
export SDDC_SSO_PASSWORD=your_SSO_password
However, use of the SDDC_SSO_PASSWORD variable is generally not recommended, since some SDDC Manager VM versions have command history enabled by default, which would record the password in the shell history.
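The simplest way to keep the password out of the shell history is to export only SDDC_SSO_USERNAME and let the tool prompt for the password. If the variable is still wanted, for example to cover several runs in one session, the following bash sketch reads it without echoing it or placing it on a command line. The helper name set_sso_credentials is an illustrative assumption, not part of the tool.

```shell
# Sketch only: set_sso_credentials is a hypothetical helper, not part of the resizer.
set_sso_credentials() {
    export SDDC_SSO_USERNAME="$1"
    # -r keeps backslashes literal; -s suppresses echo of the typed characters.
    # The password is typed at the prompt, so it never appears in command history.
    IFS= read -r -s -p "SSO password: " SDDC_SSO_PASSWORD
    echo
    export SDDC_SSO_PASSWORD
}

# Usage:
#   set_sso_credentials your_SSO_username
# When finished, remove the password from the session:
#   unset SDDC_SSO_PASSWORD
```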
There is also a --dryrun command line option which is worth noting. When this option is given, the Edge node resizer runs as it otherwise would but only simulates the various update operations it would perform. This allows the user to safely verify their credentials and to see what the resizer would do with a given set of command-line options.
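One way to use --dryrun is as a preview step before every live run. The bash sketch below lists the current sizes, simulates the resize, and only then asks whether to proceed. The cluster name, form factor, and helper name are placeholders, and the sketch assumes the resizer exits with a non-zero status when a step fails, which this document does not state.

```shell
# Placeholders (assumptions): adjust the path, cluster name, and size.
RESIZER="${RESIZER:-resizer/resize.sh}"
EDGE_CLUSTER="${EDGE_CLUSTER:-my-edge-cluster}"
FORM_FACTOR="${FORM_FACTOR:-LARGE}"

preview_then_resize() {
    # Report the current form factor of each Edge node VM.
    "$RESIZER" --edge-cluster "$EDGE_CLUSTER" --list-sizes || return 1
    # Simulate the resize; no changes are applied.
    # Assumption: a non-zero exit status signals a problem.
    "$RESIZER" --edge-cluster "$EDGE_CLUSTER" --form-factor "$FORM_FACTOR" --dryrun || return 1
    # Ask before the live run, since each Edge node goes offline in turn.
    local answer
    read -r -p "Proceed with the live resize? [y/N] " answer
    if [ "$answer" != "y" ]; then
        echo "Live resize skipped."
        return 0
    fi
    "$RESIZER" --edge-cluster "$EDGE_CLUSTER" --form-factor "$FORM_FACTOR"
}
```

Credentials are not passed here; the sketch relies on the environment variables described above or on the tool's own password prompt.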
Post-Checks
Although the script endeavors to leave the Edge cluster configuration much as it was found, aside from Edge node sizes, it is a good idea to also verify this after the tool is run:
Rollback
If the resizing tool detects that one of the Edge node VMs it is redeploying has failed to resume operation, the tool will automatically attempt to redeploy that Edge node using the form factor it had before the resize attempt.
Whether the second redeploy of the Edge node is successful or not, the resizing utility stops afterwards to avoid making any further changes to the system. At this point, the Edge cluster is likely to have Edge node VMs with different form factors.
It is important not to leave an Edge cluster with Edge nodes of differing form factors: traffic in an Edge cluster is distributed across Edge nodes without regard for their form factor, so the smaller Edge nodes might be overloaded.
If the issue which blocked redeploy of an Edge node at the desired form factor cannot be resolved, then the resizing tool also provides a --rollback option to roll back the sizes of all Edge node VMs to what they were when the Edge cluster resize attempt was initiated. In a typical VCF-created Edge cluster, this should leave all Edge node VMs with the same form factor.
The cache used to support rollback is kept in the ~/.vcf-edge-redeploy subdirectory of the home directory of the user account that runs the resizing script. A separate JSON file is kept there for each Edge cluster that the resizer tool has been run against. An Edge cluster's cache file is normally updated each time a new resize operation is requested for that Edge cluster via the --form-factor option, either live or with the --dryrun option. A cache file is not updated if the previous resizer run (either resize or rollback) was incomplete. This preserves cache content that might not otherwise be available after an incomplete resize or rollback attempt.
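Before requesting a rollback, it can be useful to look at what the cache holds. The bash sketch below prints each cache file for viewing only. The file names and JSON layout inside the cache directory are not documented here, so the sketch simply globs for *.json, and it assumes python3 may be available for pretty-printing, falling back to cat otherwise. The helper name show_rollback_cache is illustrative.

```shell
# Cache directory named in the text above; override CACHE_DIR if needed.
CACHE_DIR="${CACHE_DIR:-$HOME/.vcf-edge-redeploy}"

show_rollback_cache() {
    local f found=0
    for f in "$CACHE_DIR"/*.json; do
        [ -e "$f" ] || continue   # the glob matched nothing
        found=1
        echo "== $f"
        # Pretty-print when python3 is present; otherwise show the raw file.
        python3 -m json.tool "$f" 2>/dev/null || cat "$f"
    done
    [ "$found" -eq 1 ] || echo "No rollback cache files in $CACHE_DIR"
}

# A rollback itself uses only flags from the tool's help text
# (the cluster name is a placeholder):
#   resizer/resize.sh --edge-cluster my-edge-cluster --rollback
```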
Invocation
If you access the SDDC Manager VM via ssh, first add the following lines to your client's ~/.ssh/config to avoid ssh timeouts:
Host a.b.c.d
    ServerAliveInterval 15
    ServerAliveCountMax 3
where a.b.c.d is the IP address of your SDDC Manager VM. If you are using a non-Linux client, apply the equivalent settings so that your client sends periodic ssh keepalive messages to let the SDDC Manager know it is still present.
Then, after copying and unpacking the supplied resizer tarball, run resizer/resize.sh.
You can display usage information like so:
resizer/resize.sh --help
VCF Edge node resizer tool, version 0.5
usage: resize.sh [-h] [--edge-cluster EDGE_CLUSTER] [--workload WORKLOAD]
                 [--edge-node EDGE_NODE] [--user USER] [--password PASSWORD]
                 [--form-factor {SMALL,MEDIUM,LARGE,XLARGE}] [--rollback]
                 [--force] [--list-sizes] [--dryrun] [--verbose]

Resize a VCF Edge cluster's Edge node VMs

optional arguments:
  -h, --help            show this help message and exit
  --edge-cluster EDGE_CLUSTER, -c EDGE_CLUSTER
                        Name of Edge cluster whose nodes are to be resized.
  --workload WORKLOAD, -w WORKLOAD
                        Name of VCF workload in which target Edge Cluster
                        resides. Not normally required.
  --edge-node EDGE_NODE, -n EDGE_NODE
                        Name of single Edge node VM to resize. If not given
                        then all nodes in the Edge cluster are resized.
  --user USER, -u USER  Name of single-signon admin user to authenticate as.
  --password PASSWORD, -p PASSWORD
                        Password for specified user.
  --form-factor {SMALL,MEDIUM,LARGE,XLARGE}, -f {SMALL,MEDIUM,LARGE,XLARGE}
                        Select form-factor to use for redeployed Edge nodes.
  --rollback            Restore Edge cluster node VMs to their original sizes
                        before resizing was attempted.
  --force               Redeploy Edge cluster node VM(s) even if their sizes
                        won't change.
  --list-sizes, -l      List form-factor sizes of existing Edge node VMs in
                        Edge cluster.
  --dryrun, -d          Compute and report but do not apply changes.
  --verbose, -v         Provide extra output detail on the command console.
                        The log file is always written with the verbose level
                        of detail.
Background
When VCF SDDC Manager creates or expands an NSX-T Edge cluster, it always does so using Edge node VMs. Each Edge node VM has a "form factor" setting which controls the amount of vSphere resources that VM can consume. VCF itself always uses a single, user-specified form factor for all Edge node VMs in a single Edge cluster. However, Edge node VMs that have been added manually might have a different form factor, which is not recommended for traffic distribution purposes. Alternatively, system requirements may simply increase as more uses are found for an existing Edge cluster.
One possibility is to add additional Edge node VMs: in VCF 4.3 and later, SDDC Manager provides automation for this.
Another possibility is to increase the size of existing Edge node VMs in an Edge cluster. This may be done while preserving most existing Edge node state by using an NSX API supplied for this purpose.
SDDC Manager does not itself provide automation for the resizing process, so this script fills the gap: in addition to invoking the resize operations for each Edge node, this script also takes care of increasing Edge node resource pools, and restoring the resized Edge node VMs to any anti-affinity and host-affinity rules which they were previously listed in.
Note that this script only increases resource pool sizes, and only when needed. It never shrinks resource pools, even if a user requests a resize to a smaller form factor, or to roll back a previous resize operation.
Operations the Resizer Performs
While the main aim of the resizer tool is to apply the NSX "redeploy" API to each Edge node in the user-selected Edge cluster, several other steps are needed in order to facilitate this. At a high level, the sequence of operations performed during a resize run is:
As noted elsewhere, if any Edge node redeploy attempt fails, the resizer will attempt to restore only that Edge node to its former size, and the resizer then stops.