In /common/logs/admin/app.log, the following logs are observed:

2024-10-04 03:12:15.873 UTC [RemotingService_SvcThread-3, Ent: HybridityAdmin, Usr: HybridityAdmin, , TxId: ###########-#####-#####-#####-############] WARN c.v.v.h.m.k.KafkaProducerDelegate - Publish failed and will retry
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.RecordTooLargeException: The message is 5363784 bytes when serialized which is larger than 2097152, which is the value of the max.request.size configuration.
at org.apache.kafka.clients.producer.KafkaProducer$FutureFailure.<init>(KafkaProducer.java:1316)
at org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:985)
at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:885)
at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:773)
at com.vmware.vchs.hybridity.messaging.kafka.KafkaProducerDelegate.sendMessageWithRetries(KafkaProducerDelegate.java:214)
at com.vmware.vchs.hybridity.messaging.kafka.KafkaProducerDelegate.publishMessageWithTransaction(KafkaProducerDelegate.java:191)
at com.vmware.vchs.hybridity.messaging.kafka.KafkaProducerDelegate.publish(KafkaProducerDelegate.java:155)
at com.vmware.vchs.hybridity.messaging.kafka.KafkaProducerDelegate.publish(KafkaProducerDelegate.java:149)
at com.vmware.vchs.hybridity.messaging.adapter.JobManagerJobPublisher.publish(JobManagerJobPublisher.java:112)
at com.vmware.vchs.hybridity.messaging.adapter.JobManager.queueJob(JobManager.java:1688)
at com.vmware.vchs.hybridity.service.remoting.jobs.JobStatusPollAndNotify.handleJobsFromNewVersion(JobStatusPollAndNotify.java:695)
at com.vmware.vchs.hybridity.service.remoting.jobs.JobStatusPollAndNotify.retrieveUpdatesFromRemoteSinceLastRequest(JobStatusPollAndNotify.java:571)
stdout F Caused by: org.apache.kafka.common.errors.RecordTooLargeException: The message is ####### bytes when serialized which is larger than #######, which is the value of the max.request.size configuration.
This happens because the topmost record in the "RemotingOutbox" table on TCA-CP was not getting consumed by TCA-M: the serialized record was over 5 MB, exceeding Kafka's 2 MB max.request.size limit, so the publish kept failing and all subsequent updates behind it were stuck.
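To confirm which record is oversized before deleting anything, the stored size of each outbox entry can be checked directly in Postgres. This is an optional diagnostic sketch, not part of the original procedure; it reuses the val column and "RemotingOutbox" table from the queries below and the standard pg_column_size() function. Note that pg_column_size() reports the stored (possibly compressed) size, which can differ from the serialized Kafka message size, but any entry in the multi-megabyte range is a likely candidate:
>> SELECT val->'job'->>'jobType', pg_column_size(val) AS approx_bytes, "lastUpdated" FROM "RemotingOutbox" ORDER BY pg_column_size(val) DESC LIMIT 5;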
Follow the steps below to delete the oversized job record:
connect-to-postgres
>> SELECT val->'job'->>'jobType', "creationDate", "lastUpdated" FROM "RemotingOutbox" ORDER BY "lastUpdated";
>> DELETE FROM "RemotingOutbox" WHERE val->'job'->>'jobType'='<JobType returned in above query>';
To diagnose the issue, SSH to the target TCA-CP and check Postgres:
>> SELECT val->'job'->>'jobType', "creationDate", "lastUpdated" FROM "RemotingOutbox" ORDER BY "lastUpdated";
and check the entries. Then check the total number of records:
>> SELECT count(*) FROM "RemotingOutbox" ;
If the query returns a large number of records (around 700 or more), the remote jobs are not getting updated on TCA-M.
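To gauge how long updates have been stuck, the age of the backlog can also be checked. This is a supplementary query (assuming "lastUpdated" reflects when each record was last touched), not part of the original steps:
>> SELECT min("lastUpdated") AS oldest, max("lastUpdated") AS newest, count(*) AS pending FROM "RemotingOutbox";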