Removing device from maintenance schedule in USM throws error

Article ID: 141667

Products

NIMSOFT PROBES DX Infrastructure Management

Issue/Introduction

 

Every time we try to remove a device from a maintenance schedule in USM, we get the error shown below.

We checked the maintenance_mode log file and the problem seems to be related to EMS. However, EMS appears to be working fine, and restarting EMS does not fix the problem.

We then restarted the maintenance_mode probe, but the problem persists.

Please advise.

Here is the error from USM. It also appears in the maintenance_mode log:

An unknown error has occurred.
Refreshing your browser may resolve the issue.

Details:
com.firehunter.ump.exceptions.DataFactoryException : I/O error on nim session (C) com.nimsoft.nimbus.NimNamedClientSession(Socket[addr=/xxxxxxx,port=48039,localport=64650])
Please check the log for more information.
Stack Trace:
(2) communication error, I/O error on nim session (C) com.nimsoft.nimbus.NimNamedClientSession(Socket[addr=/xxxxxxx,port=48039,localport=64650]): Read timed out
 at com.nimsoft.nimbus.NimSessionBase.recv(NimSessionBase.java:930)
 at com.nimsoft.nimbus.NimSessionBase.sendRcv(NimSessionBase.java:578)
 at com.nimsoft.nimbus.NimSessionBase.sendRcv(NimSessionBase.java:561)
 at com.nimsoft.nimbus.NimClientSession.send(NimClientSession.java:171)
 at com.nimsoft.nimbus.NimRequest.sendImpersonate(NimRequest.java:264)
 at com.nimsoft.nimbus.pool.NimRequestPool.sendImpersonate(NimRequestPool.java:92)
 at com.nimsoft.nimbus.pool.NimRequestPool.sendImpersonate(NimRequestPool.java:74)
 at com.nimsoft.nimbus.pool.NimRequestPool.send(NimRequestPool.java:66)
 at com.nimsoft.nimbus.pool.NimRequestPoolInstance.send(NimRequestPoolInstance.java:191)
 at com.firehunter.umpportlet.PDSUtils.send(PDSUtils.java:122)
 at com.firehunter.usm.Maintenance.removeMaintenanceSystems(Maintenance.java:318)
 at com.firehunter.usm.DataFactory.removeMaintenanceSystems(DataFactory.java:8290)
 at com.firehunter.usm.DataFactory.removeMaintenanceSystems(DataFactory.java:8282)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at flex.messaging.services.remoting.adapters.JavaAdapter.invoke(JavaAdapter.java:418)
 at flex.messaging.services.RemotingService.serviceMessage(RemotingService.java:183)
 at flex.messaging.MessageBroker.routeMessageToService(MessageBroker.java:1400)
 at flex.messaging.endpoints.AbstractEndpoint.serviceMessage(AbstractEndpoint.java:1005)
 at flex.messaging.endpoints.amf.MessageBrokerFilter.invoke(MessageBrokerFilter.java:103)
 at flex.messaging.endpoints.amf.LegacyFilter.invoke(LegacyFilter.java:158)
 at flex.messaging.endpoints.amf.SessionFilter.invoke(SessionFilter.java:44)
 at flex.messaging.endpoints.amf.BatchProcessFilter.invoke(BatchProcessFilter.java:67)
 at flex.messaging.endpoints.amf.SerializationFilter.invoke(SerializationFilter.java:166)
 at flex.messaging.endpoints.BaseHTTPEndpoint.service(BaseHTTPEndpoint.java:291)
 at flex.messaging.MessageBrokerServlet.service(MessageBrokerServlet.java:353)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:741)
 at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
 at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
 at com.liferay.portal.kernel.servlet.filters.invoker.ResponseHeaderFilter.doFilter(ResponseHeaderFilter.java:31)
 at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
 at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
 at com.firehunter.ump.auth.InvalidHttpSessionFilter.doFilter(InvalidHttpSessionFilter.java:29)
 at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
 at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
 at com.liferay.portal.kernel.servlet.filters.invoker.InvokerFilterChain.doFilter(InvokerFilterChain.java:73)
 at com.liferay.portal.kernel.servlet.filters.invoker.InvokerFilterChain.doFilter(InvokerFilterChain.java:117)
 at sun.reflect.GeneratedMethodAccessor621.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at com.liferay.portal.kernel.bean.ClassLoaderBeanHandler.invoke(ClassLoaderBeanHandler.java:67)
 at com.sun.proxy.$Proxy1154.doFilter(Unknown Source)
 at com.liferay.portal.kernel.servlet.filters.invoker.InvokerFilterChain.doFilter(InvokerFilterChain.java:73)
 at com.liferay.portal.kernel.servlet.filters.invoker.InvokerFilterChain.processDirectCallFilter(InvokerFilterChain.java:168)
 at com.liferay.portal.kernel.servlet.filters.invoker.InvokerFilterChain.doFilter(InvokerFilterChain.java:96)
 at com.liferay.portal.kernel.servlet.PortalClassLoaderFilter.doFilter(PortalClassLoaderFilter.java:74)
 at com.liferay.portal.kernel.servlet.filters.invoker.InvokerFilterChain.processDoFilter(InvokerFilterChain.java:207)
 at com.liferay.portal.kernel.servlet.filters.invoker.InvokerFilterChain.doFilter(InvokerFilterChain.java:109)
 at com.liferay.portal.kernel.servlet.filters.invoker.InvokerFilter.doFilter(InvokerFilter.java:108)
 at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
 at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
 at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:200)
 at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
 at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:490)
 at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:139)
 at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)
 at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:74)
 at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:343)
 at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:408)
 at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
 at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:834)
 at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1415)
 at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
 at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketTimeoutException: Read timed out
 at java.net.SocketInputStream.socketRead0(Native Method)
 at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
 at java.net.SocketInputStream.read(SocketInputStream.java:171)
 at java.net.SocketInputStream.read(SocketInputStream.java:141)
 at java.net.SocketInputStream.read(SocketInputStream.java:224)
 at com.nimsoft.nimbus.NimSessionBase.readNimbusHeader(NimSessionBase.java:1063)
 at com.nimsoft.nimbus.NimSessionBase.recv(NimSessionBase.java:869)
 ... 68 more

Cause

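A performance delay in the maintenance_mode probe: the probe does not reply before the request from USM times out, which produces the "Read timed out" error shown above.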

Environment

Release : 9.2.0

Component : UIM MAINTENANCE MODE

Resolution

Add the following key to the configuration of every wasp probe:
/setup/maintenance_timeout=60000

This will increase the time UMP waits for a reply from the maintenance_mode probe.
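
As a rough sketch of where this key ends up, assuming the usual sectioned layout of Nimsoft probe configuration files (so that /setup/maintenance_timeout corresponds to a maintenance_timeout entry inside the setup section of wasp.cfg), the result would look something like the fragment below. Existing keys in the section are omitted, and the exact placement should be checked against the actual file, for example through Raw Configure:

```
<setup>
   maintenance_timeout = 60000
</setup>
```

The fragment is meant to be mirrored in the probe configuration rather than pasted over the existing setup section.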

Engineering has not yet confirmed whether this setting affects only USM or also UIMAPI.

It is also recommended to change maintenance_mode.cfg as follows:

/setup/purge_maintenance_windows = 1

In this environment the key was set to 24. Reducing the number of maintenance windows the probe has to handle significantly reduces its response time.
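
For illustration, and again assuming the standard sectioned .cfg layout with the key living in the setup section of maintenance_mode.cfg, the changed entry would look roughly like this (other keys in the section are left out):

```
<setup>
   purge_maintenance_windows = 1
</setup>
```

The maintenance_mode probe most likely needs a restart to pick up the new value.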



After these changes were made, it was also found that the most likely cause of all the issues was the log level itself: the maintenance_mode probe writes a large amount of data to its log files.