Error while registering or pushing metrics using metrics ingestion API



Article ID: 423949


Products

DX SaaS

Issue/Introduction

Occasionally we receive the following error, or connection resets, from the AIOps server while registering or pushing metrics. We want to know the root cause of these errors.

{error:{code:0,message:GENERIC_SERVICE_ERROR,traceId:xxxxxx}}

Some sample traces: xxxxxxxxx

Environment

DX SaaS 25.11.1

Cause

For the /nass/metricValue/store apmservice call errors, the logs captured the exception call stacks similar to the one below:

 2025-12-16T04:33:01.880Z ERROR 1 --- [nass] [xxxx-xx] c.c.a.c.rest.ServiceExceptionHandler  
 : https://xxxxx/nass/metricValue/store, 500,0,6fa5ef35d56aeb24: GENERIC_SERVICE_ERROR, java.util.concurrent.CancellationException

com.ca.apm.common.api.ServicesException: 500,0,6fa5ef35d56aeb24: GENERIC_SERVICE_ERROR, java.util.concurrent.CancellationException

Such errors may indicate that the nass pod was overwhelmed, triggering cancellation of concurrent tasks. However, the health and performance metrics for this particular nass instance did not show abnormal patterns or critical performance issues around the time of the error. Thus, this error was likely momentary, and the nass pod appeared to recover subsequently.


As for the /metadata/registerMetric apmservice call errors, the logs captured exception call stacks similar to the one below:

 
2025-12-10T23:08:10.591Z ERROR 1 --- [metadata] [xxxxx] c.c.a.c.rest.ServiceExceptionHandler    
 : https://xxxxxxx/metadata/registerMetric, 500,0,879133300c541546: GENERIC_SERVICE_ERROR, reactor.netty.channel.AbortedException: 
 Connection has been closed

com.ca.apm.common.api.ServicesException: 500,0,879133300c541546: GENERIC_SERVICE_ERROR, reactor.netty.channel.AbortedException: Connection has been closed

This error may indicate that the metadata pod took too long to process the registerMetric calls, with connections being closed or timing out before the responses completed. Again, this error was likely momentary, and the particular metadata pod appeared to recover as well.

Resolution

If these errors occur only sporadically, consider improving your web service client implementation to handle these apmservice API call exceptions gracefully, e.g., by limiting the number of concurrent calls/submissions and/or retrying the same calls/submissions later when such errors occur.
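As a rough illustration of the two suggestions above, the sketch below combines a semaphore to cap concurrent submissions with an exponential-backoff retry for transient failures. It is a minimal, hypothetical example: the `send_batch` callable and `TransientServiceError` class stand in for whatever HTTP client code and error classification your integration actually uses; they are not part of any documented DX SaaS API.

```python
import random
import threading
import time

# Assumed/illustrative values; tune for your workload.
MAX_CONCURRENT = 4      # cap simultaneous apmservice submissions
MAX_RETRIES = 5
BASE_DELAY_SECS = 1.0

_slots = threading.Semaphore(MAX_CONCURRENT)

class TransientServiceError(Exception):
    """Placeholder for a 500 GENERIC_SERVICE_ERROR or connection reset."""

def submit_with_retry(send_batch, batch):
    """Call send_batch(batch), retrying transient failures with backoff.

    send_batch is your own function that performs the actual HTTP call
    (e.g. to /nass/metricValue/store) and raises TransientServiceError
    when it sees a retriable failure.
    """
    for attempt in range(MAX_RETRIES):
        with _slots:  # limit how many calls run concurrently
            try:
                return send_batch(batch)
            except TransientServiceError:
                if attempt == MAX_RETRIES - 1:
                    raise  # give up after the final attempt
        # Exponential backoff with jitter before the next attempt,
        # so retries from many clients do not arrive in lockstep.
        delay = BASE_DELAY_SECS * (2 ** attempt) * (0.5 + random.random())
        time.sleep(delay)
```

Releasing the semaphore before sleeping (as above) keeps a backing-off retry from holding a concurrency slot, so other submissions can proceed while this one waits.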