The cohesity_monitor probe version is 1.00hf1. From a cold start, it logs in to our API endpoint cohesityp.xxxx.com just fine and keeps gathering metrics for several days in a row, but once or twice a week it generates the alarm "cohesityp.xxxxx.com is not responding (reason: Connection and/or inventory update failure)".
If I restart the probe, it logs in successfully again and chugs along for 2 or 3 days before it alarms again. Even while it is in alarm, I can log in to cohesityp.xxxxx.com manually with no problems.
Log excerpt from one of the failures:
Sep 21 06:52:04:506 [Data Collector - TEST Cohesity, cohesity_monitor] Unexpected response from url /public/alerts?alertCategoryList=kDisk,kNode,kCluster,kNodeHealth,kClusterHealth,kBackupRestore,kEncryption,kArchivalRestore,kRemoteReplication,kQuota,kLicense,kHeliosProActiveWellness,kHeliosAnalyticsJobs,kHeliosSignatureJobs,kSecurity&maxAlerts=1000&alertSeverityList=kCritical,kWarning&startDateUsecs=1663152724004000&endDateUsecs=1663757524004000.
Status code is 400
Response is HttpResponseProxy{HTTP/1.1 400 Bad Request [Cache-Control: no-cache, no-store, must-revalidate, Content-Type: application/json, Pragma: no-cache, Referrer-Policy: strict-origin-when-cross-origin, Strict-Transport-Security: max-age=31536000; includeSubDomains, Vary: Accept-Encoding, X-Content-Type-Options: nosniff, X-Frame-Options: SAMEORIGIN, X-Ratelimit-Limit: 10000, X-Ratelimit-Remaining: 9999, X-Ratelimit-Reset: 1663757525, X-Xss-Protection: 1; mode=block, Date: Wed, 21 Sep 2022 10:52:04 GMT] org.apache.http.client.entity.DecompressingEntity@2f4a01dc}
Sep 21 06:52:13:396 [Data Collector - TEST Cohesity, cohesity_monitor] PERF: DONE: cohesity_monitor inventory update for TEST Cohesity {Seconds=9.393}
Sep 21 06:52:13:396 [Data Collector - TEST Cohesity, cohesity_monitor] Inventory update failed. Sending resource alarm
Sep 21 06:52:13:396 [Data Collector - TEST Cohesity, cohesity_monitor] (1) error, Inventory update failed for TEST Cohesity: Attempting to add vertex that already exists.id: '/XXXXTdom/TER_hub_UTIL_2453/Admin_test2_robot/cohesity_monitor:idsProbe:TEST Cohesity::CohesityP::139::sotrageDomainsFolderTEST Cohesity::CohesityP::139:.CtdDisplayAttribute'
at com.nimsoft.probe.framework.genprobe.GenInventory.handleInventoryUpdateException(GenInventory.java:202)
at com.nimsoft.probe.framework.genprobe.GenInventory.update(GenInventory.java:179)
at com.nimsoft.probe.framework.genprobe.GenSession.updateInventory(GenSession.java:145)
at com.nimsoft.probe.common.ctd.CtdSession.doInventoryUpdate(CtdSession.java:257)
at com.nimsoft.probe.storage.cohesity_monitor.graph.CohesityDataCollector.executeSessionUpdateInventory(CohesityDataCollector.java:70)
at com.nimsoft.probe.common.ctd.ProbeDataCollector.execute(ProbeDataCollector.java:215)
at com.nimsoft.probe.storage.cohesity_monitor.graph.CohesityDataCollector.execute(CohesityDataCollector.java:59)
at com.nimsoft.vm.collector.DataCollector.run(DataCollector.java:151)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: Attempting to add vertex that already exists.id: '/XXXXTdom/TER_hub_UTIL_2453/Admin_test2_robot/cohesity_monitor:idsProbe:TEST Cohesity::CohesityP::139::sotrageDomainsFolderTEST Cohesity::CohesityP::139:.CtdDisplayAttribute'
at tgraph2.TGraph2.addVertex(TGraph2.java:43)
at com.nimsoft.ids.ctd.graph.CtdGraphBase.addVertex(CtdGraphBase.java:131)
at com.nimsoft.ids.ctd.graph.CtdGraphProbeConfiguration.addDisplayAttributes(CtdGraphProbeConfiguration.java:475)
at com.nimsoft.probe.common.ctd.gctx.GctxCtdEntity.setCollapsed(GctxCtdEntity.java:352)
at com.nimsoft.probe.framework.genprobe.GenInventory.addNode(GenInventory.java:391)
at com.nimsoft.probe.framework.genprobe.GenInventory.addItem(GenInventory.java:323)
at com.nimsoft.probe.framework.genprobe.GenInventory.update(GenInventory.java:138)
... 13 more
Sep 21 06:52:13:396 [Data Collector - TEST Cohesity, cohesity_monitor] for remote probe, getResourceAlarm is using Resource Generated Device Id: b1737e03b34e48766f946e03af96eb0f
Sep 21 06:52:13:396 [Data Collector - TEST Cohesity, cohesity_monitor] for remote probe, getResourceAlarm is using Resource Generated Device Id: b1737e03b34e48766f946e03af96eb0f
Sep 21 06:52:13:396 [Data Collector - TEST Cohesity, cohesity_monitor] PERF: DONE: ----- Pass-478 --- Data Collector Check Interval for: TEST Cohesity {Seconds=9.393}
Sep 21 06:52:13:396 [Data Collector - TEST Cohesity, cohesity_monitor] PERF: Memory Status: Max=3641 CurrentAllocated=2028 Free=1332 Used=696 MB
Sep 21 06:52:13:396 [Data Collector - TEST Cohesity, cohesity_monitor] DataCollector for TEST Cohesity finished collecting data
Sep 21 06:52:14:407 [BulkSender Monitor, cohesity_monitor] Bulk send 2 alarms
Sep 21 06:52:14:407 [BulkSender Monitor, cohesity_monitor] Sent NimAlarm C, severity=5, message==cohesityp.xxxxx.com is not responding (reason: Connection and/or inventory update failure), subsystem=2.7.1, suppressionid=TEST Cohesity, source=Admin_test2_robot and received confirmation id RE02939654-15554
Sep 21 06:52:14:407 [BulkSender Monitor, cohesity_monitor] Sent NimAlarm D, severity=5, message==cohesityp.xxxxx.com is not responding (reason: Connection and/or inventory update failure), subsystem=2.7.1, suppressionid=TEST Cohesity, source=cohesityp.xxxxx.com and received confirmation id RE02939654-15556
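As a sanity check on the request that returned the 400, the startDateUsecs and endDateUsecs values from the alerts URL can be decoded. This is only a small sketch; the helper name usecs_to_utc is made up for illustration, and it assumes the values are microseconds since the Unix epoch, as the "Usecs" suffix suggests.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the *Usecs query parameters are microseconds since the Unix epoch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def usecs_to_utc(usecs: int) -> datetime:
    """Convert epoch microseconds to a timezone-aware UTC datetime (integer-exact)."""
    return EPOCH + timedelta(microseconds=usecs)


# Values copied from the /public/alerts request in the log.
start = usecs_to_utc(1663152724004000)
end = usecs_to_utc(1663757524004000)

print(start.isoformat())  # 2022-09-14T10:52:04.004000+00:00
print(end.isoformat())    # 2022-09-21T10:52:04.004000+00:00
print(end - start)        # 7 days, 0:00:00
```

The end of the window matches the Date header of the response (10:52:04 GMT), and the window is exactly seven days, so the time range itself does not look malformed. The log only prints the entity object reference rather than the response body, so the actual reason for the 400 is not visible here.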
Earlier the same day, the log also shows a NullPointerException during an inventory update:
Sep 21 02:22:03:719 [Data Collector - TEST Cohesity, cohesity_monitor] null
Sep 21 02:22:03:719 [Data Collector - TEST Cohesity, cohesity_monitor] java.lang.NullPointerException
at java.util.Objects.requireNonNull(Objects.java:203)
at com.nimsoft.probe.storage.cohesity_monitor.inventory.data.provider.InventoryDataProviderImpl.getData(InventoryDataProviderImpl.java:55)
at com.nimsoft.probe.storage.cohesity_monitor.inventory.InventoryServiceImpl.getInventory(InventoryServiceImpl.java:28)
at com.nimsoft.probe.storage.cohesity_monitor.ProbeMain.getUpdatedInventory(ProbeMain.java:338)
at com.nimsoft.probe.framework.genprobe.GenInventory.update(GenInventory.java:117)
at com.nimsoft.probe.framework.genprobe.GenSession.updateInventory(GenSession.java:145)
at com.nimsoft.probe.common.ctd.CtdSession.doInventoryUpdate(CtdSession.java:257)
at com.nimsoft.probe.storage.cohesity_monitor.graph.CohesityDataCollector.executeSessionUpdateInventory(CohesityDataCollector.java:70)
at com.nimsoft.probe.common.ctd.ProbeDataCollector.execute(ProbeDataCollector.java:215)
at com.nimsoft.probe.storage.cohesity_monitor.graph.CohesityDataCollector.execute(CohesityDataCollector.java:59)
at com.nimsoft.vm.collector.DataCollector.run(DataCollector.java:151)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
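To see how often each of these failures occurs between restarts, the probe log can be scanned for the three signatures that appear in the excerpts above. This is a rough sketch, not a supported tool: the function name, the event labels, and the log file name in the usage note are all assumptions, and the patterns only match the exact messages shown in this post.

```python
import re

# Failure signatures copied from the log excerpts in this post.
PATTERNS = {
    "http_error": re.compile(r"Unexpected response from url"),
    "duplicate_vertex": re.compile(
        r"Inventory update failed for .*Attempting to add vertex that already exists"
    ),
    "null_pointer": re.compile(r"java\.lang\.NullPointerException"),
}

# Log lines start with e.g. "Sep 21 06:52:04:506 [". Stack-trace lines do not.
TIMESTAMP = re.compile(r"^(\w{3} \d{1,2} \d{2}:\d{2}:\d{2}:\d{3}) \[")


def find_failures(lines):
    """Return (timestamp, label) pairs for timestamped lines matching a signature."""
    events = []
    for line in lines:
        stamp = TIMESTAMP.match(line)
        if not stamp:
            continue  # skip stack frames and "Caused by:" lines
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                events.append((stamp.group(1), label))
    return events


# Shortened sample lines in the same shape as the real log.
sample = [
    "Sep 21 02:22:03:719 [Data Collector - TEST Cohesity, cohesity_monitor] java.lang.NullPointerException",
    "at java.util.Objects.requireNonNull(Objects.java:203)",
    "Sep 21 06:52:04:506 [Data Collector - TEST Cohesity, cohesity_monitor] Unexpected response from url /public/alerts",
    "Sep 21 06:52:13:396 [Data Collector - TEST Cohesity, cohesity_monitor] (1) error, Inventory update failed for TEST Cohesity: Attempting to add vertex that already exists.id: '...'",
    "Caused by: java.lang.IllegalArgumentException: Attempting to add vertex that already exists.id: '...'",
]

for stamp, label in find_failures(sample):
    print(stamp, label)
```

Against a real log this could be run as `find_failures(open("cohesity_monitor.log"))`, where the file name is an assumption about where the probe writes its log. Comparing the timestamps of the three event types with the alarm times may show whether the 400 response, the NullPointerException, and the duplicate-vertex error always occur together or independently.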