Troubleshooting DSM-managed PostgreSQL clusters that experience performance, connectivity, or hanging-query issues
Article ID: 410984
Products
VMware Data Services Manager
Issue/Introduction
DSM-managed PostgreSQL clusters may experience various performance or connectivity issues that require in-depth system-level troubleshooting with direct visibility into PostgreSQL processes and their network namespace. One or more of the following symptoms may indicate the need for node-level diagnostics:
Clients cannot establish connections to the PostgreSQL server, or encounter connection timeouts or SSL handshake failures
Database connectivity and performance issues that point to a network interface under high load, potentially caused by only a few queries
PostgreSQL client processes consuming excessive CPU resources or causing high disk I/O load
Queries hang indefinitely without apparent database-level deadlocks or lock waits
Network traffic analysis is required to monitor specific TCP sockets or traffic patterns
Real-time monitoring of system resources is needed to identify performance bottlenecks
Environment
VMware Data Services Manager (DSM 9.0.1 onwards)
PostgreSQL clusters managed by DSM
Cause
DSM-managed PostgreSQL clusters may experience system-level issues that cannot be diagnosed through standard database monitoring tools. These issues require access to low-level system debugging utilities to analyze network traffic, process behavior, system calls, and resource utilization patterns.
Resolution
Overview
DSM provides a debugging tool (dsm-debug) that gives administrators command-line access to certain diagnostic utilities on a PostgreSQL cluster node for in-depth troubleshooting. The tool creates ephemeral containers that share the same process and network namespace as the PostgreSQL nodes, providing full system-level visibility while maintaining security through command restrictions.
Prerequisites
Root SSH access to the DSM Appliance and DSM Admin privileges
Knowledge of the cluster name and namespace
Important Notices
This is a feature intended for ad-hoc diagnostic purposes only
Not suitable for persistent monitoring and must not be used for automation
The CLI provides no backward-compatibility guarantees; it may be changed, dropped, or replaced
The debug session shares the same resources as the Postgres container, including memory, CPU, and network
Debug session commands that generate disk writes or network I/O are restricted
To connect to a specific PostgreSQL cluster, for example the cluster named "my-pg" in namespace "my-namespace", run:
dsm-debug --data-service-name my-pg --namespace my-namespace
For multi-node clusters, you can connect to a specific node by specifying --node-index (the value corresponds to the node's index as shown in the database nodes topology).
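The sketch below shows one way the connection command might be assembled for either case. It only prints the dsm-debug command line and does not run it. The three flags (--data-service-name, --namespace, --node-index) are taken from this article. The helper name build_dsm_debug_cmd and the sample values are hypothetical, and it is an assumption that --node-index takes a non-negative integer passed as a separate argument.

```shell
#!/bin/sh
# Hypothetical helper: prints a dsm-debug command line without executing it.
# Flags come from this article; the function name and values are illustrative.
build_dsm_debug_cmd() {
    name=$1
    namespace=$2
    node_index=${3:-}

    # Cluster name and namespace are both required (see Prerequisites).
    if [ -z "$name" ] || [ -z "$namespace" ]; then
        echo "usage: build_dsm_debug_cmd <cluster-name> <namespace> [node-index]" >&2
        return 1
    fi

    cmd="dsm-debug --data-service-name $name --namespace $namespace"

    # Optional: target one node of a multi-node cluster.
    # Assumption: the index is a non-negative integer.
    if [ -n "$node_index" ]; then
        case $node_index in
            *[!0-9]*)
                echo "node index must be a non-negative integer" >&2
                return 1
                ;;
        esac
        cmd="$cmd --node-index $node_index"
    fi

    printf '%s\n' "$cmd"
}

# Whole cluster, using the names from the example above.
build_dsm_debug_cmd my-pg my-namespace
# A specific node, using an illustrative index from the topology view.
build_dsm_debug_cmd my-pg my-namespace 1
```

Printing the command first makes it easy to review the cluster name, namespace, and node index before opening a debug session, which matters because the session shares memory, CPU, and network with the Postgres container.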