Prometheus Target Status shows the scrape as failed, with the error message "body size limit exceeded".
Greenplum Command Center (GPCC) Metric Exporter, Prometheus Integration
The table_metrics endpoint provides high-resolution visibility into every table within the Greenplum database. To maintain consistency with GPCC, the exporter generates 25 distinct metric types per table (refer to official GPCC documentation for the list of metrics). Because of the volume of this data, it is served via a dedicated API endpoint (/table_metrics) with a recommended 5-minute scrape interval.
The fundamental issue lies in the transformation of data from a database format to a monitoring format:
When the exporter renders metrics for every row in the table inventory (gpmetrics.gpcc_table_info), the generated payload can reach upwards of 400 MB.
Result: This high cardinality causes the response payload to exceed the default body-size limit configured in Prometheus, triggering the "body size limit exceeded" error and causing the scrape to fail.
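As a rough sanity check, you can estimate the payload size before tuning anything. The numbers below (table count and average exposition-line size) are illustrative assumptions, not values taken from GPCC; substitute your own counts:

```shell
# Back-of-envelope payload estimate for the /table_metrics endpoint.
# TABLES and AVG_LINE_BYTES are assumptions -- adjust to your cluster.
TABLES=100000          # number of tables in the Greenplum database
METRICS_PER_TABLE=25   # distinct metric types GPCC exports per table
AVG_LINE_BYTES=160     # assumed average size of one metric line (name + labels + value)

SERIES=$((TABLES * METRICS_PER_TABLE))
PAYLOAD_MB=$((SERIES * AVG_LINE_BYTES / 1024 / 1024))

echo "series=${SERIES} approx_payload_mb=${PAYLOAD_MB}"
# prints: series=2500000 approx_payload_mb=381
```

With these assumed inputs the estimate lands near the 400 MB figure observed above, which is well past the Prometheus default limit.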
To prevent Prometheus from failing the scrape due to size constraints, you can manually increase the body-size limit for this specific job.
Edit your prometheus.yml configuration file and add or update the body_size_limit parameter for the job. We recommend a value of 512MB (or higher, depending on your table count) to provide adequate headroom.
scrape_configs:
  - job_name: 'greenplum_sandpit_table_inventory'
    metrics_path: '/table_metrics'
    scrape_interval: 5m
    scrape_timeout: 1m
    body_size_limit: 512MB   # Increased to handle high cardinality
    static_configs:
      - targets: ['<hostname>:6162']
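After editing, it is worth validating the file before reloading Prometheus. The configuration path below is an assumption; adjust it to wherever your prometheus.yml actually lives, and note that the HTTP reload endpoint only works if Prometheus was started with --web.enable-lifecycle:

```shell
# Validate the edited configuration (path is an assumption).
promtool check config /etc/prometheus/prometheus.yml

# Apply the change without a restart (requires --web.enable-lifecycle);
# otherwise, restart the Prometheus service instead.
curl -X POST http://localhost:9090/-/reload
```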
Note: If your Grafana instance encounters an OOM condition after this tuning, refer to KB 433733 for the workaround and resolution.
If you are using an Nginx reverse proxy, enabling gzip compression is the most effective way to reduce network overhead. Prometheus metrics are text/plain and highly compressible; gzip can shrink a 400 MB payload to approximately 30-40 MB in transit.
This step is optional but strongly recommended. It does not reduce the "body size" that Prometheus checks (the limit applies to the decompressed payload), but the smaller transfer reduces the likelihood of a scrape timeout.
Nginx Example Configuration:
location /prometheus {
    proxy_pass http://127.0.0.1:9095;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;

    gzip on;
    gzip_types text/plain;   # Prometheus metrics are text/plain
    gzip_proxied any;
    gzip_min_length 1000;
}

location /table_metrics {
    gzip on;
    gzip_types text/plain;   # Prometheus metrics are text/plain
    gzip_proxied any;
    gzip_min_length 1000;
}

To verify the payload size and HTTP response code without dumping the body into your terminal, use the following command:
curl -sS -H "Authorization: Bearer xxxxx" \
     --write-out '{"http_code":"%{http_code}","size_download_bytes":%{size_download}}\n' \
     --output /dev/null \
     http://<hostname>:6162/table_metrics
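If you enabled gzip on the proxy, the same --write-out technique can confirm it is working: when curl advertises gzip support but is told not to decompress, size_download reflects the compressed bytes actually transferred. The hostname and bearer token are placeholders, as in the command above:

```shell
# Request the endpoint with gzip negotiation and report transferred bytes.
# <hostname> and the bearer token are placeholders.
curl -sS -H "Authorization: Bearer xxxxx" \
     -H "Accept-Encoding: gzip" \
     --write-out 'compressed_bytes=%{size_download}\n' \
     --output /dev/null \
     http://<hostname>:6162/table_metrics
```

Comparing this figure against the uncompressed size from the previous command shows the actual on-the-wire savings (roughly 10x for typical metric text).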