Running netstat -anp on the impacted Data Collectors shows four connections to the Data Aggregator, as expected.
However, what stands out is the following for some connections:
tcp6 0 <NONZERONUMBERTHATDOESNOTDECREMENT> <IPofDC>:<PORT> <IPofDA>:<PORT> ESTABLISHED <PIDofActiveMQ>/java
The "Recv-Q" and "Send-Q" columns tell us how much data is in the queue for that socket, waiting to be read (Recv-Q) or sent (Send-Q).
The Send-Q column shows a nonzero value that does not decrease (<NONZERONUMBERTHATDOESNOTDECREMENT>), which means data queued on that socket is not being sent or acknowledged, and it is likely not reaching the DA.
This points to network issues between the DC and the DA.
The network is dropping or blocking some (but not all) of our traffic, so the ACKs do not get back to the DC. The DC waits, the prefetch bucket fills up, and ActiveMQ starts caching data because it cannot send it to the Data Aggregator.
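
To confirm that a Send-Q value is truly stuck rather than briefly nonzero, it can help to compare two netstat -anp samples taken a few seconds apart. The following is a minimal sketch, not a supported tool: the function names are hypothetical, and it assumes the usual Linux netstat column order (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State, PID/Program name).

```python
def parse_send_queues(netstat_output):
    """Return {(local, remote): send_q} for ESTABLISHED TCP sockets.

    Assumes Linux `netstat -anp` column order:
    Proto Recv-Q Send-Q Local-Address Foreign-Address State PID/Program
    """
    queues = {}
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, UDP/UNIX sockets, and short or malformed lines.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[5] != "ESTABLISHED":
            continue
        try:
            send_q = int(fields[2])
        except ValueError:
            continue
        queues[(fields[3], fields[4])] = send_q
    return queues


def stuck_connections(first_sample, second_sample):
    """Return connections whose Send-Q is nonzero in both samples
    and has not decreased between them."""
    before = parse_send_queues(first_sample)
    after = parse_send_queues(second_sample)
    return sorted(
        conn
        for conn, send_q in after.items()
        if send_q > 0 and before.get(conn, 0) > 0 and send_q >= before[conn]
    )
```

Feed it the text of two netstat -anp runs captured on the Data Collector. Any connection to the DA that it returns matches the pattern described above and is worth checking at the network level.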