Symptoms:
Clicking "Apply Changes" in Operations (Ops) Manager triggers an update of the instances deployed by BOSH. The update sometimes fails with errors like those in the example below.
Task 13542 | 01:26:49 | Updating instance uaa: uaa/a3106cd2-552e-4f0a-974e-be4ac5cf87a6 (0) (canary) (00:03:01)
                   L Error: Timed out sending 'upload_blob' to c9632ea6-c9fd-409f-ae91-426d3c094406 after 45 seconds
Task 13542 | 01:29:50 | Error: Timed out sending 'upload_blob' to c9632ea6-c9fd-409f-ae91-426d3c094406 after 45 seconds
The debug logs of the BOSH task contain more information about the error.
D, [2018-08-03T01:27:35.031615 #99316] [canary_update(uaa/a3106cd2-552e-4f0a-974e-be4ac5cf87a6 (0))] DEBUG -- DirectorJobRunner: SENT: agent.c9632ea6-c9fd-409f-ae91-426d3c094406 {"protocol":3,"method":"upload_blob","arguments":[{"blob_id":"68a1c780-00a4-45e3-aee7-feb0d742d334","checksum":"<redacted>","payload":"<redacted>"}],"reply_to":"director.5deeea10-9fa5-46b1-b562-18d3feae7543.c9632ea6-c9fd-409f-ae91-426d3c094406.0e71f50a-9613-4f87-83f4-704d790f3caa"}
E, [2018-08-03T01:27:35.037022 #99316] [] ERROR -- DirectorJobRunner: NATS client error: 'Maximum Payload Violation'
A similar error also appears in the logs of the NATS job on the BOSH Director.
[1] 2018/08/03 01:39:45.576776 [ERR] <IP Address>:55880 - cid:786 - Maximum Payload Exceeded: 1328086 vs 1048576
[1] 2018/08/03 01:40:30.592941 [ERR] <IP Address>:34122 - cid:796 - Maximum Payload Exceeded: 1328086 vs 1048576
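To see how large the rejected messages were, the "Maximum Payload Exceeded" lines can be scanned for the attempted size and the configured limit. The following is a minimal Python sketch, not part of BOSH or NATS. The function name `find_payload_violations` is hypothetical, and the sample lines are copied from the log excerpt above.

```python
import re

# Matches the size pair in NATS server lines such as:
# "... Maximum Payload Exceeded: 1328086 vs 1048576"
PAYLOAD_ERROR = re.compile(r"Maximum Payload Exceeded: (\d+) vs (\d+)")


def find_payload_violations(lines):
    """Return a list of (attempted_bytes, limit_bytes), one per matching line."""
    violations = []
    for line in lines:
        match = PAYLOAD_ERROR.search(line)
        if match:
            violations.append((int(match.group(1)), int(match.group(2))))
    return violations


# Sample input taken from the NATS log excerpt in this article.
sample = [
    "[1] 2018/08/03 01:39:45.576776 [ERR] <IP Address>:55880 - cid:786 - "
    "Maximum Payload Exceeded: 1328086 vs 1048576",
    "[1] 2018/08/03 01:40:30.592941 [ERR] <IP Address>:34122 - cid:796 - "
    "Maximum Payload Exceeded: 1328086 vs 1048576",
]

violations = find_payload_violations(sample)
largest_attempt = max(attempted for attempted, _ in violations)
print("violations found:", len(violations))
print("largest attempted payload (bytes):", largest_attempt)
```

The largest attempted payload gives a lower bound for any new payload limit.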
logtime: true
# maximum payload
max_payload: 1572864
5) Restart the NATS job
monit restart nats
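As a sanity check on the sizes involved, the sketch below compares the payload reported in the NATS error with the old and new limits. It assumes the new `max_payload` value is 1572864 bytes (1.5 MiB), which is how the configuration fragment above is read here. The helper names `to_mib` and `fits` are hypothetical.

```python
OLD_LIMIT = 1048576   # limit reported in the NATS error (1 MiB)
OBSERVED = 1328086    # payload size reported in the NATS error
NEW_LIMIT = 1572864   # assumed new max_payload value (1.5 MiB)


def to_mib(size_bytes):
    """Convert a size in bytes to mebibytes."""
    return size_bytes / (1024 * 1024)


def fits(payload_bytes, limit_bytes):
    """Return True when the payload does not exceed the limit."""
    return payload_bytes <= limit_bytes


for name, limit in (("old", OLD_LIMIT), ("new", NEW_LIMIT)):
    print(f"{name} limit: {to_mib(limit):.2f} MiB, "
          f"observed payload fits: {fits(OBSERVED, limit)}")
```

If a later task sends an even larger blob, the same check can be repeated with the size taken from the new error line.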