"Segmentation fault" and Server Crashes Due to SDM Deadlock in DX NetOps

Article ID: 435942

Updated On:

Products

Network Observability Spectrum

Issue/Introduction

After a migration, the SpectroSERVER crashes daily without generating a core dump.

ERROR MESSAGES:

  • "read_attr error: No value on the scratchpad returning Internal Failure !!"

  • "Low thread resources detected."

  • "Signal: SIGSEGV, Segmentation fault."

SYMPTOMS:

  • The SpectroSERVER application terminates with a segmentation fault in the symbol printing routine.

  • A large number of threads accumulate, blocked while waiting for resources within the Secure Domain Manager (SDM), which indicates a deadlock.

  • The sdmlog.log file shows repeated errors such as "Error setting TCP_NODELAY".

CONTEXT: This crash typically occurs during a polling cycle when multi-model requests are initiated.

IMPACT: The server becomes completely unresponsive, leading to daily service disruptions.
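
To confirm these symptoms on a given server, the message fragments quoted above can be counted in the relevant logs. The following is a minimal Python sketch, not a supported tool: the names `count_signatures` and `scan_file` are hypothetical, and the paths to VNM.OUT and sdmlog.log are assumptions that depend on your installation.

```python
from collections import Counter
from typing import Iterable

# Message fragments quoted in this article.
SIGNATURES = (
    "No value on the scratchpad",
    "Low thread resources detected",
    "Segmentation fault",
    "Error setting TCP_NODELAY",
)


def count_signatures(lines: Iterable[str], signatures=SIGNATURES) -> Counter:
    """Count how many lines contain each known error fragment."""
    counts = Counter({sig: 0 for sig in signatures})
    for line in lines:
        for sig in signatures:
            if sig in line:
                counts[sig] += 1
    return counts


def scan_file(path: str) -> Counter:
    """Scan one log file; undecodable bytes are replaced instead of raising."""
    with open(path, "r", errors="replace") as handle:
        return count_signatures(handle)
```

Calling `scan_file` once on VNM.OUT and once on sdmlog.log gives a per-message count. A steadily growing count of "Error setting TCP_NODELAY" alongside "Low thread resources detected" would be consistent with the SDM deadlock described here, although it is not proof on its own.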

Environment

Any supported Spectrum release

Resolution

1. VERIFY AND DISABLE SECURE DOMAIN MANAGER

Determine whether the Secure Domain Manager (SDM) functionality is intentionally enabled and required in your environment.

If it is not needed, disable SDM in the configuration file so that SDM threads are no longer spawned and cannot deadlock.

EXPECTED: The SDM connector stops, preventing the thread deadlock condition.

2. VERIFY CAPKI INSTALLATION

Check the CAPKI libraries on the server.

Ensure they are correctly installed and match the version required by your specific release.

3. INCREASE WORKER THREAD COUNT

On the VNM model, navigate to SpectroSERVER control -> Thread Information.

Increase the thread count from 4000 to 8000.

Restart (recycle) the SpectroSERVER.

EXPECTED: The server can process excessive polling requests without exhausting thread resources.

4. REMOVE DEBUG OPTIONS

Remove any active debug options (such as MDL actor debug) from the VNMRC file.

Restart the SpectroSERVER.

EXPECTED: Server performance stabilizes without the overhead of excessive logging.
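
Before editing the file, it can help to list the lines that look like debug settings. The sketch below is only a text heuristic under stated assumptions: it treats any non-comment line containing the word "debug" (case-insensitive) as a candidate, assumes that `#` starts a comment, and uses a hypothetical function name, `find_debug_lines`. It has no knowledge of actual Spectrum option names, so review every hit manually before removing anything.

```python
from typing import Iterable, List, Tuple


def find_debug_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for non-comment lines mentioning 'debug'.

    Assumption: lines starting with '#' are comments. This only matches
    text; it does not validate or interpret option names.
    """
    hits = []
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text or text.startswith("#"):
            continue  # skip blank lines and assumed comments
        if "debug" in text.lower():
            hits.append((number, text))
    return hits
```

Passing an open file handle for the VNMRC file to `find_debug_lines` returns the candidate lines with their line numbers, which makes it easier to record what was removed in case the change has to be reverted.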

5. INCREASE CPU CAPACITY

Increase the CPU capacity (e.g., from 8 to 10) on the affected landscape so that it can handle the same active polling load as the working environments.

EXPECTED: System resources align with active polling requirements.

 

VERIFY SUCCESS:

  • The SpectroSERVER remains stable and does not crash during routine polling cycles.

  • The VNM.OUT log no longer reports "Low thread resources detected."

Additional Information

The crash is triggered by thread contention or a deadlock within the Secure Domain Manager (SDM) encryption subsystem. Threads block permanently while trying to acquire a mutex lock and accumulate until system resources are exhausted. The segmentation fault then occurs when the system attempts to dump its diagnostic state.
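
The pile-up can be illustrated in miniature. The following Python sketch is an analogy only and does not reflect the actual SDM implementation: a lock is taken and never released while waiters are running, and each waiter uses a timeout (an assumption added so that the example terminates) to report that it could not acquire the lock. The function name `count_blocked_waiters` is hypothetical.

```python
import threading


def count_blocked_waiters(workers: int = 5, timeout: float = 0.2) -> int:
    """Start `workers` threads that wait on a lock that is never released.

    Returns how many of them gave up after `timeout` seconds. In a real
    deadlock there is no timeout, so these threads would wait forever and
    each new request would add one more stuck thread.
    """
    lock = threading.Lock()
    lock.acquire()  # simulate an owner that never releases the mutex
    blocked = []    # list.append is thread-safe in CPython

    def worker() -> None:
        if lock.acquire(timeout=timeout):
            lock.release()
        else:
            blocked.append(threading.current_thread().name)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    lock.release()  # released only after every waiter has given up
    return len(blocked)
```

Every waiter ends up blocked, which mirrors why raising the thread limit alone may only delay the failure if the underlying lock is never released.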