Vertica copycluster Fails with Permission Denied due to Localhost SSH Failure

Article ID: 440384

Products

Network Observability CA Performance Management

Issue/Introduction

When attempting to migrate a Vertica database using the copycluster command in a DX NetOps Performance Management environment, the process fails with a "Permission denied" error.

Review of the vbr logs (e.g., vbr_YYYY-MM-DD-XXXXXX.log) shows that while most nodes check in successfully, one or more nodes (often the source node where the command is executed) are missing the "vbr Checked" entry.

Environment

  • Product: DX NetOps Performance Management (Data Repository)
  • Version: All Supported Versions
  • Database: Vertica
  • OS: Red Hat Enterprise Linux

Cause

The vbr.py script requires the database administrator user (dradmin) to be able to SSH into all nodes in both the source and target clusters non-interactively. This includes the node's ability to SSH to itself (localhost/self-access) for worker orchestration.

If the node's own public key is missing from its authorized_keys file, or if SSH permissions or configuration prevent a loopback connection in BatchMode, vbr cannot start its worker processes on that node and the operation fails with "Permission denied".

Resolution

Step 1: Validate the Failure

On the failing node (the one missing the "vbr Checked" entry in the logs), run the following command as the dradmin user:

ssh -T -x -o BatchMode=yes <Node_IP_Address> hostname

Replace <Node_IP_Address> with the actual IP of the node you are logged into.

If this returns "Permission denied", the node cannot SSH to itself non-interactively.
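The check above can be wrapped in a small helper so the result is easy to read, or to repeat on several nodes. This is a minimal sketch, not part of vbr: the function name check_self_ssh and the SSH_CMD variable are illustrative assumptions, and the only command it runs is the validation command shown in this step.

```shell
#!/usr/bin/env bash
# Sketch: run the article's validation command and report the outcome.
# check_self_ssh and SSH_CMD are illustrative names, not part of vbr.
SSH_CMD="${SSH_CMD:-ssh}"

check_self_ssh() {
    local node="$1" out
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    if out=$("$SSH_CMD" -T -x -o BatchMode=yes "$node" hostname 2>&1); then
        echo "OK: $node answered as $out"
    else
        echo "FAIL: $node: $out"
        return 1
    fi
}
```

Run as dradmin, for example: check_self_ssh <Node_IP_Address>. A FAIL line containing "Permission denied" corresponds to the condition described above.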

Step 2: Fix Local SSH Access

To resolve the self-access failure, perform the following actions on the affected node:

  1. Update Authorized Keys: Append the node's own public key (~/.ssh/id_rsa.pub) to its ~/.ssh/authorized_keys file.
  2. Correct Permissions: Ensure strict permissions are set:
    • chmod 700 ~/.ssh
    • chmod 600 ~/.ssh/authorized_keys
  3. Clean Known Hosts: Remove any stale or conflicting entries for the local IP/hostname from ~/.ssh/known_hosts.
  4. Verify SSH Config: Ensure /etc/ssh/sshd_config allows local connections, and restart the sshd service if you change the file:
    • AllowTcpForwarding yes
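Items 1 to 3 above can be sketched as a short script to run as dradmin on the affected node. This is an illustration under assumptions, not an official procedure: the names SSH_DIR, PUB_KEY, authorize_own_key and forget_host are hypothetical, and the key is assumed to be the RSA key named in item 1. Item 4 needs root access and is not covered here.

```shell
#!/usr/bin/env bash
# Sketch of Step 2, items 1-3, run as dradmin on the affected node.
# SSH_DIR, PUB_KEY, authorize_own_key and forget_host are illustrative names.
SSH_DIR="${SSH_DIR:-$HOME/.ssh}"
PUB_KEY="${PUB_KEY:-$SSH_DIR/id_rsa.pub}"

authorize_own_key() {
    local auth="$SSH_DIR/authorized_keys"
    [ -r "$PUB_KEY" ] || { echo "missing public key: $PUB_KEY" >&2; return 1; }
    touch "$auth"
    # Terminate an unterminated last line so the new key starts on its own line.
    if [ -s "$auth" ] && [ -n "$(tail -c 1 "$auth")" ]; then
        echo >> "$auth"
    fi
    # Append the key only if that exact line is not already present.
    grep -qxFf "$PUB_KEY" "$auth" || cat "$PUB_KEY" >> "$auth"
    # Strict permissions, as listed in item 2.
    chmod 700 "$SSH_DIR"
    chmod 600 "$auth"
}

forget_host() {
    # ssh-keygen -R drops every known_hosts entry for the given host or IP.
    ssh-keygen -R "$1" -f "$SSH_DIR/known_hosts"
}
```

A typical sequence would be authorize_own_key followed by forget_host <Node_IP_Address>. The append is guarded so that running the script twice does not duplicate the key.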

Step 3: Verify the Fix

Re-run the validation command from Step 1. It should now return the node's hostname without prompting for a password. Once successful, retry the copycluster command.

Additional Information

Performance Management CopyCluster requirements for Vertica

DR backup, restore or copycluster script fails

Vertica copy cluster ports