cohesity_monitor crashes and failed logins

Article ID: 254582

Products

DX Unified Infrastructure Management (Nimsoft / UIM)

Issue/Introduction

The cohesity_monitor probe is version 1.00hf1. From a cold start, it logs in to our API endpoint, cohesityp.xxxx.com, without problems. It continues gathering metrics for several days in a row, but once or twice a week it generates the alarm "cohesityp.xxxxx.com is not responding (reason: Connection and/or inventory update failure)".

If I restart the probe, it logs in successfully again and runs for 2 or 3 days before it alarms again. Even while the alarm is active, I can log in to cohesityp.xxxxx.com manually with no problems.

Sep 21 06:52:04:506 [Data Collector - TEST Cohesity, cohesity_monitor] Unexpected response from url /public/alerts?alertCategoryList=kDisk,kNode,kCluster,kNodeHealth,kClusterHealth,kBackupRestore,kEncryption,kArchivalRestore,kRemoteReplication,kQuota,kLicense,kHeliosProActiveWellness,kHeliosAnalyticsJobs,kHeliosSignatureJobs,kSecurity&maxAlerts=1000&alertSeverityList=kCritical,kWarning&startDateUsecs=1663152724004000&endDateUsecs=1663757524004000.
 Status code is 400
 Response is HttpResponseProxy{HTTP/1.1 400 Bad Request [Cache-Control: no-cache, no-store, must-revalidate, Content-Type: application/json, Pragma: no-cache, Referrer-Policy: strict-origin-when-cross-origin, Strict-Transport-Security: max-age=31536000; includeSubDomains, Vary: Accept-Encoding, X-Content-Type-Options: nosniff, X-Frame-Options: SAMEORIGIN, X-Ratelimit-Limit: 10000, X-Ratelimit-Remaining: 9999, X-Ratelimit-Reset: 1663757525, X-Xss-Protection: 1; mode=block, Date: Wed, 21 Sep 2022 10:52:04 GMT] [email protected]


Sep 21 06:52:13:396 [Data Collector - TEST Cohesity, cohesity_monitor] PERF: DONE:  cohesity_monitor inventory update for TEST Cohesity   {Seconds=9.393}
Sep 21 06:52:13:396 [Data Collector - TEST Cohesity, cohesity_monitor] Inventory update failed. Sending resource alarm
Sep 21 06:52:13:396 [Data Collector - TEST Cohesity, cohesity_monitor] (1) error, Inventory update failed for TEST Cohesity: Attempting to add vertex that already exists.id: '/XXXXTdom/TER_hub_UTIL_2453/Admin_test2_robot/cohesity_monitor:idsProbe:TEST Cohesity::CohesityP::139::sotrageDomainsFolderTEST Cohesity::CohesityP::139:.CtdDisplayAttribute'
 at com.nimsoft.probe.framework.genprobe.GenInventory.handleInventoryUpdateException(GenInventory.java:202)
 at com.nimsoft.probe.framework.genprobe.GenInventory.update(GenInventory.java:179)
 at com.nimsoft.probe.framework.genprobe.GenSession.updateInventory(GenSession.java:145)
 at com.nimsoft.probe.common.ctd.CtdSession.doInventoryUpdate(CtdSession.java:257)
 at com.nimsoft.probe.storage.cohesity_monitor.graph.CohesityDataCollector.executeSessionUpdateInventory(CohesityDataCollector.java:70)
 at com.nimsoft.probe.common.ctd.ProbeDataCollector.execute(ProbeDataCollector.java:215)
 at com.nimsoft.probe.storage.cohesity_monitor.graph.CohesityDataCollector.execute(CohesityDataCollector.java:59)
 at com.nimsoft.vm.collector.DataCollector.run(DataCollector.java:151)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: Attempting to add vertex that already exists.id: '/XXXXTdom/TER_hub_UTIL_2453/Admin_test2_robot/cohesity_monitor:idsProbe:TEST Cohesity::CohesityP::139::sotrageDomainsFolderTEST Cohesity::CohesityP::139:.CtdDisplayAttribute'
 at tgraph2.TGraph2.addVertex(TGraph2.java:43)
 at com.nimsoft.ids.ctd.graph.CtdGraphBase.addVertex(CtdGraphBase.java:131)
 at com.nimsoft.ids.ctd.graph.CtdGraphProbeConfiguration.addDisplayAttributes(CtdGraphProbeConfiguration.java:475)
 at com.nimsoft.probe.common.ctd.gctx.GctxCtdEntity.setCollapsed(GctxCtdEntity.java:352)
 at com.nimsoft.probe.framework.genprobe.GenInventory.addNode(GenInventory.java:391)
 at com.nimsoft.probe.framework.genprobe.GenInventory.addItem(GenInventory.java:323)
 at com.nimsoft.probe.framework.genprobe.GenInventory.update(GenInventory.java:138)
 ... 13 more

Sep 21 06:52:13:396 [Data Collector - TEST Cohesity, cohesity_monitor] for remote probe, getResourceAlarm is using Resource Generated Device Id: b1737e03b34e48766f946e03af96eb0f
Sep 21 06:52:13:396 [Data Collector - TEST Cohesity, cohesity_monitor] for remote probe, getResourceAlarm is using Resource Generated Device Id: b1737e03b34e48766f946e03af96eb0f
Sep 21 06:52:13:396 [Data Collector - TEST Cohesity, cohesity_monitor] PERF: DONE:  ----- Pass-478 --- Data Collector Check Interval for: TEST Cohesity   {Seconds=9.393}
Sep 21 06:52:13:396 [Data Collector - TEST Cohesity, cohesity_monitor] PERF: Memory Status: Max=3641   CurrentAllocated=2028   Free=1332   Used=696 MB
Sep 21 06:52:13:396 [Data Collector - TEST Cohesity, cohesity_monitor] DataCollector for TEST Cohesity finished collecting data
Sep 21 06:52:14:407 [BulkSender Monitor, cohesity_monitor] Bulk send 2 alarms
Sep 21 06:52:14:407 [BulkSender Monitor, cohesity_monitor] Sent NimAlarm C, severity=5, message==cohesityp.xxxxx.com is not responding (reason: Connection and/or inventory update failure), subsystem=2.7.1, suppressionid=TEST Cohesity, source=Admin_test2_robot and received confirmation id RE02939654-15554
Sep 21 06:52:14:407 [BulkSender Monitor, cohesity_monitor] Sent NimAlarm D, severity=5, message==cohesityp.xxxxx.com is not responding (reason: Connection and/or inventory update failure), subsystem=2.7.1, suppressionid=TEST Cohesity, source=cohesityp.xxxxx.com and received confirmation id RE02939654-15556
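
When reading the failing /public/alerts request at the top of this excerpt, it can help to decode the startDateUsecs and endDateUsecs values, which are microseconds since the Unix epoch. The following is a minimal Python sketch for doing so; the helper names are illustrative and not part of the probe, and the URL is shortened to the parameters of interest. For the values in the log, the end of the window matches the Date header of the response and the window spans exactly seven days. This only helps interpret the request and does not by itself explain the 400 response.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit

# Query string copied from the log line above, shortened to the time parameters.
URL = (
    "/public/alerts?maxAlerts=1000"
    "&startDateUsecs=1663152724004000&endDateUsecs=1663757524004000"
)

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def usecs_to_utc(usecs: int) -> datetime:
    """Convert microseconds since the Unix epoch to an aware UTC datetime."""
    # Integer timedelta arithmetic avoids float rounding on the microseconds.
    return EPOCH + timedelta(microseconds=usecs)


def query_window(url: str):
    """Return (start, end) of the alert query window as UTC datetimes."""
    params = parse_qs(urlsplit(url).query)
    start = usecs_to_utc(int(params["startDateUsecs"][0]))
    end = usecs_to_utc(int(params["endDateUsecs"][0]))
    return start, end


if __name__ == "__main__":
    start, end = query_window(URL)
    print("start :", start.isoformat())
    print("end   :", end.isoformat())
    print("window:", end - start)
```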

Sep 21 02:22:03:719 [Data Collector - TEST Cohesity, cohesity_monitor] null
Sep 21 02:22:03:719 [Data Collector - TEST Cohesity, cohesity_monitor] java.lang.NullPointerException
 at java.util.Objects.requireNonNull(Objects.java:203)
 at com.nimsoft.probe.storage.cohesity_monitor.inventory.data.provider.InventoryDataProviderImpl.getData(InventoryDataProviderImpl.java:55)
 at com.nimsoft.probe.storage.cohesity_monitor.inventory.InventoryServiceImpl.getInventory(InventoryServiceImpl.java:28)
 at com.nimsoft.probe.storage.cohesity_monitor.ProbeMain.getUpdatedInventory(ProbeMain.java:338)
 at com.nimsoft.probe.framework.genprobe.GenInventory.update(GenInventory.java:117)
 at com.nimsoft.probe.framework.genprobe.GenSession.updateInventory(GenSession.java:145)
 at com.nimsoft.probe.common.ctd.CtdSession.doInventoryUpdate(CtdSession.java:257)
 at com.nimsoft.probe.storage.cohesity_monitor.graph.CohesityDataCollector.executeSessionUpdateInventory(CohesityDataCollector.java:70)
 at com.nimsoft.probe.common.ctd.ProbeDataCollector.execute(ProbeDataCollector.java:215)
 at com.nimsoft.probe.storage.cohesity_monitor.graph.CohesityDataCollector.execute(CohesityDataCollector.java:59)
 at com.nimsoft.vm.collector.DataCollector.run(DataCollector.java:151)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
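
To check whether a probe log shows the same symptoms as the excerpts above, the log can be scanned for the messages they contain. The following Python sketch is one way to do that; the signature strings are copied from the excerpts in this article, while the function names and the dictionary keys are hypothetical and chosen only for illustration. A match indicates similar log output, not necessarily the same root cause.

```python
from collections import Counter

# Substrings taken verbatim from the log excerpts in this article.
# The keys on the left are illustrative labels, not probe terminology.
SIGNATURES = {
    "http_400": "Status code is 400",
    "duplicate_vertex": "Attempting to add vertex that already exists",
    "null_pointer": "java.lang.NullPointerException",
    "inventory_failed": "Inventory update failed. Sending resource alarm",
}


def count_signatures(lines):
    """Count how many lines contain each known failure signature."""
    counts = Counter()
    for line in lines:
        for name, needle in SIGNATURES.items():
            if needle in line:
                counts[name] += 1
    return counts


def scan_file(path):
    """Scan a log file at the given path (the caller supplies the location)."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_signatures(handle)
```

Calling scan_file with the path to the probe's log file returns a counter keyed by the labels above; a label that never matched reads as zero.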

Environment

  • Release: 20.4
  • cohesity_monitor v1.0HF1

Resolution

  • cohesity_monitor_1_01_T3 is attached to this KB article.
  • Download it, add it to the primary hub's local archive, and then deploy it.

Attachments

cohesity_monitor_1_01_T3 (1)_1668722983825.zip