Tenant-Level Pushback and Proxy Backlog Behavior in DX OpenExplore (Wavefront)

Article ID: 430013

Products

DX OpenExplore

Issue/Introduction

Customers may occasionally observe the following symptoms:

Increased pushback rate
Growing proxy backlog (queued points)
Elevated median queue time
Temporary ingestion lag (P95/P99 delays)
False alerts due to delayed metric delivery

In many cases, this behavior is caused by a burst of points exceeding the tenant ingestion limit, not a backend outage.

Environment

DX OpenExplore (Wavefront)

Cause

DX OpenExplore enforces a tenant-level ingestion limit to:

Protect backend stability
Ensure fair resource allocation
Prevent unexpected billing spikes

All proxies within a tenant share the same ingestion budget.

If one proxy sends a sudden burst of traffic that exceeds the contractual ingestion rate:

Points are queued.
Backend pushback is applied.
Throttling may appear across all proxies.

Although the burst originates from a single proxy, enforcement occurs at the tenant level, so the impact appears system-wide.
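The shared budget can be illustrated with a toy simulation. This is only a sketch: the `simulate` function, the proxy names, and especially the proportional-acceptance policy are assumptions made for illustration, not the documented backend algorithm.

```python
def simulate(offered_per_second, tenant_limit):
    """Return each proxy's backlog after replaying per-second offered load.

    offered_per_second: list of {proxy_name: new points offered that second}.
    tenant_limit: points per second the tenant may ingest (shared by all proxies).
    """
    backlog = {}
    for offered in offered_per_second:
        proxies = set(offered) | set(backlog)
        # Each proxy tries to send its new points plus anything it queued earlier.
        demand = {p: offered.get(p, 0) + backlog.get(p, 0) for p in proxies}
        total = sum(demand.values())
        # Assumed policy: when the tenant is over budget, every proxy has the
        # same fraction of its demand accepted; the rest stays queued.
        accepted_fraction = 1.0 if total <= tenant_limit else tenant_limit / total
        backlog = {p: d - d * accepted_fraction for p, d in demand.items()}
    return backlog


# Proxy "a" bursts to 1500 pps against a 1000 pps tenant limit; "b" is steady.
after_burst = simulate([{"a": 1500, "b": 500}], tenant_limit=1000)
# Two quiet seconds later the queues have drained on their own.
after_recovery = simulate(
    [{"a": 1500, "b": 500}, {"a": 100, "b": 100}, {"a": 100, "b": 100}],
    tenant_limit=1000,
)
```

In this model the steady proxy `b` also ends the burst second with a non-empty queue, even though only `a` exceeded its usual rate.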

Resolution

Common Observations During an Event

Pushback rate increases sharply.
Backlog grows (points and tasks queued).
Median queue time rises.
Data received lag (P95/P99) increases.
Alerts trigger due to delayed metric arrival.

Once traffic returns to normal levels, the system self-recovers and the queue drains automatically.

Why It Self-Resolves

The condition resolves when:

The burst traffic finishes
The proxy rate drops below the ingestion limit
The queued points are processed

No backend restart or intervention is typically required.
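A rough estimate of recovery time follows from these three conditions: once the burst ends, the queue shrinks at the difference between the ingestion limit and the steady traffic rate. The sketch below assumes a constant steady rate and a constant limit; `drain_seconds` and its parameters are hypothetical names, not part of any product API.

```python
import math


def drain_seconds(backlog_points, tenant_limit_pps, steady_pps):
    """Estimate seconds until the queue is empty once the burst has ended."""
    spare_pps = tenant_limit_pps - steady_pps
    if backlog_points <= 0:
        return 0.0
    if spare_pps <= 0:
        # No spare capacity: the backlog cannot shrink at this traffic level.
        return math.inf
    return backlog_points / spare_pps


# 600,000 queued points, 10,000 pps limit, 8,000 pps steady traffic.
estimate = drain_seconds(600_000, 10_000, 8_000)  # 300.0 seconds
```

If steady traffic already sits at or above the limit, the estimate is infinite, which corresponds to a backlog that does not drain on its own.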

How to Investigate

If this occurs, check:

Proxy-level ingestion rate (look for sudden spikes).
Whether one proxy significantly exceeded normal throughput.
Whether traffic patterns changed (pipeline flush, retry storm, replay, network delay).

Also review the proxy logs for connection resets, retries, buffer replays, and upstream service restarts.
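The first two checks can be approximated by comparing each proxy's peak rate with its own typical rate. The following is a minimal sketch that assumes per-proxy points-per-second samples have already been exported into a dictionary; `find_spiking_proxies`, the proxy names, and the 3x factor are illustrative assumptions.

```python
from statistics import median


def find_spiking_proxies(pps_by_proxy, factor=3.0):
    """Return {proxy: (baseline, peak)} where peak >= factor * median baseline."""
    spiking = {}
    for proxy, samples in pps_by_proxy.items():
        if not samples:
            continue
        baseline = median(samples)
        peak = max(samples)
        # A normally idle proxy (baseline 0) counts as spiking on any traffic.
        if peak > 0 and peak >= factor * baseline:
            spiking[proxy] = (baseline, peak)
    return spiking


samples = {
    "proxy-1": [900, 1000, 1100, 1000, 950],
    "proxy-2": [400, 420, 5200, 410, 390],
}
suspects = find_spiking_proxies(samples)
```

Using the median as the baseline keeps a single burst sample from inflating the "normal" rate it is compared against.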

How to Prevent Recurrence

1. Configure `pushRateLimit` on Proxies

You can configure a maximum rate per proxy to prevent a single proxy from consuming the entire tenant ingestion budget. This limits burst amplification and protects overall ingestion stability. Refer to the official proxy configuration documentation for details.
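One way to choose per-proxy values is to split the tenant budget, minus some headroom, in proportion to each proxy's expected traffic. The sketch below is illustrative only: `per_proxy_limits`, the weights, and the 20% headroom are assumptions, and the printed `pushRateLimit=<points per second>` line assumes a properties-style proxy configuration file, which should be confirmed against the official proxy configuration documentation.

```python
def per_proxy_limits(tenant_limit_pps, weights, headroom_pct=20):
    """Split the tenant budget, minus headroom, across proxies by weight.

    weights: {proxy_name: positive integer share of expected traffic}.
    Integer floor division guarantees the caps never sum above the budget.
    """
    budget = tenant_limit_pps * (100 - headroom_pct) // 100
    total_weight = sum(weights.values())
    return {name: budget * w // total_weight for name, w in weights.items()}


limits = per_proxy_limits(10_000, {"proxy-1": 3, "proxy-2": 1})
for name, pps in sorted(limits.items()):
    # Assumed properties-style line for that proxy's configuration file.
    print(f"{name}: pushRateLimit={pps}")
```

Because the caps sum to less than the tenant limit, no single proxy can exhaust the shared budget. The trade-off is that a capped proxy queues locally during its own bursts even when the tenant as a whole has capacity to spare.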

2. Monitor Burst Patterns

Track points per second (PPS), queue size trends, and pushback rate. Set internal alerts on abnormal spikes so they fire before traffic reaches the tenant limit.
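An early-warning condition of this kind can be sketched as a check on tenant-wide PPS relative to the limit. The function name, the 80% warning level, and the three-sample window below are assumptions chosen for illustration; in practice the equivalent condition would be expressed in your alerting tool.

```python
def should_warn(total_pps_samples, tenant_limit_pps, warn_fraction=0.8, sustained=3):
    """True when the last `sustained` samples all sit at or above the warning level."""
    if len(total_pps_samples) < sustained:
        return False
    threshold = warn_fraction * tenant_limit_pps
    return all(s >= threshold for s in total_pps_samples[-sustained:])


# Sustained load near a 10,000 pps limit triggers the warning.
rising = should_warn([5000, 8500, 9000, 9200], 10_000)
# A single dip inside the window suppresses it.
noisy = should_warn([5000, 9500, 6000, 9000], 10_000)
```

Requiring several consecutive samples above the threshold reduces noise from one-off spikes while still firing before the limit itself is reached.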

3. Keep Proxies Updated

Upgrade to the latest proxy release to avoid stale queue reporting artifacts and benefit from stability improvements.

Additional Information

Engage DX OpenExplore (Wavefront) Support if:

Ingestion lag persists after traffic normalizes
Backlog does not drain
Multiple tenants on the same cluster show impact simultaneously
You suspect backend-wide instability

Tenant-wide pushback across all proxies can be triggered by a burst from a single proxy because ingestion limits are enforced at the tenant level. This behavior is expected and protective in nature. Proper rate limiting and monitoring help prevent recurrence and minimize alert noise.
