Guidance is needed on high availability and disaster recovery best practices for a clustered deployment, to improve resiliency against outages and data loss.
VCF Operations for Networks
To remain resilient, the product architecture depends on two practices: periodic quiesced backups, which provide logically consistent restore points, and proactive monitoring of GUI alerts, which keeps transient issues from cascading into system failures.
Establish a periodic backup cadence (e.g., weekly, monthly, or any other interval that suits your organization).
Shut down the cluster to a logically consistent state by following all steps UP TO BUT NOT INCLUDING the step that begins with "Take snapshots ..." in the Resolution section of KB 314428 - Best practices to shutdown VCF Operations for Networks Clustered deployments.
Execute a full backup of all Platform and Collector Nodes using your standard backup regime.
Restore cluster operations by following all steps AFTER the step that begins with "Take snapshots ..." in KB 314428 - Best practices to shutdown VCF Operations for Networks Clustered deployments.
Routinely monitor the VCF Operations for Networks GUI by navigating to Settings > Infrastructure and Support and opening the Infrastructure and Updates tab.
Investigate any alerts that are generated. If persistent problems are flagged on the Platform or Collector nodes, capture screenshots of the details and proactively open a Support Case, following the instructions in KB 142884 - Creating and managing Broadcom cases, so that issues are addressed before they escalate.
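The shutdown, backup, and restart steps above only produce a consistent restore point when they run in that order. The following Python sketch illustrates the ordering as a small runbook that stops at the first failed step. It is an illustration under stated assumptions, not product tooling: the `Step` type, the `run_backup_window` function, and the placeholder actions are all hypothetical, and each action would need to be replaced with your own procedure for KB 314428 and your own backup product.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    action: Callable[[], bool]  # hypothetical hook: returns True on success


def run_backup_window(steps: List[Step]) -> List[str]:
    """Run the steps in order and stop at the first failure.

    Returns the names of the steps that completed, so an operator can
    see exactly where the window stopped.
    """
    completed: List[str] = []
    for step in steps:
        if not step.action():
            break  # do not back up a cluster that did not shut down cleanly
        completed.append(step.name)
    return completed


if __name__ == "__main__":
    # Placeholder actions (assumptions, not product APIs). Replace each
    # lambda with a manual confirmation or your own automation hook.
    window = [
        Step("shut down cluster per KB 314428", lambda: True),
        Step("back up all Platform and Collector nodes", lambda: True),
        Step("restart cluster per KB 314428", lambda: True),
    ]
    print(run_backup_window(window))
```

If a step fails, the sketch stops rather than continuing. Deciding whether to restart the cluster after a failed backup is deliberately left to the operator.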
For a simple (non-clustered) deployment, there is only one Platform node.
The Resolution steps above are therefore revised as follows:
Establish a periodic backup cadence (e.g., weekly, monthly, or any other interval that suits your organization).
Shut down the Collector node(s) using the vCenter > Power > Shut Down Guest OS action. If there is more than one Collector node, the shutdown order does not matter.
Execute a full backup of the Platform and Collector Node(s) using your standard backup regime.
Power on the Platform node.
Routinely monitor the VCF Operations for Networks GUI by navigating to Settings > Infrastructure and Support and opening the Infrastructure and Updates tab.
Investigate any alerts that are generated. If persistent problems are flagged on the Platform or Collector nodes, capture screenshots of the details and proactively open a Support Case, following the instructions in KB 142884 - Creating and managing Broadcom cases, so that issues are addressed before they escalate.
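The monitoring step in both procedures distinguishes persistent problems from transient ones, but the article does not define "persistent". One possible working definition, shown in the Python sketch below, is an alert that appears in every one of the last few routine checks. The threshold of three consecutive checks, the `persistent_alerts` function, and the alert names are assumptions made for illustration. The sketch works on notes an operator records by hand and does not query the product.

```python
from typing import List, Set


def persistent_alerts(checks: List[Set[str]], min_consecutive: int = 3) -> Set[str]:
    """Return the alerts present in each of the last `min_consecutive` checks.

    `checks` is ordered oldest to newest. Each entry is the set of alert
    names an operator noted during one review of the GUI.
    """
    if min_consecutive < 1:
        raise ValueError("min_consecutive must be at least 1")
    if len(checks) < min_consecutive:
        return set()  # not enough history to call anything persistent
    recent = checks[-min_consecutive:]
    return set.intersection(*recent)


if __name__ == "__main__":
    # Hypothetical alert names, recorded over three routine checks.
    history = [
        {"Collector disk usage high"},
        {"Collector disk usage high", "Platform service restarted"},
        {"Collector disk usage high"},
    ]
    # Only the disk alert appears in all three checks; the restart
    # appeared once and is treated as transient.
    print(persistent_alerts(history))
```

An alert returned by this check is a candidate for a proactive Support Case. An alert that appears once and clears is worth noting but may not need escalation.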