A disconnect message similar to the following is found in the Smarts Alcatel 5620 SAM Adapter (ASAM) log file:
[June 21, 2012 12:23:21 AM GMT+01:00 +004ms] t@112 SM_SocketObserver-5 8575 InCharge-AM-PM Remote Accessor #1
CI-E-EWHILE-While executing function "readableNow"
CI-EFLOWID-For flow CI_FlowBufferedHead_U [observer for client 5 8575 InCharge-AM-PM Remote Accessor] HEAD|BUFFERED @0xffffffff0ad06580
. Read buffer, 0 bytes available of 2145
. ?3?2A2A2A2A2A2A2A2A 2A2A2A2A2A2A2A2A ^|2A00000000000000 000000002A2A2A2A
. Write buffer, 0 bytes written of 2048
. ?3?[^0000000100000040 4FE25B6900000009 534D5F5379737465 6D00000009534D2D
. ->CI_FlowAES_CBC_U [observer for client 5 8575 InCharge-AM-PM Remote Accessor] IN_FLOW|BLOCK @0xffffffff12ea8270
. ->CI_FlowTCP_U [observer for client 5 8575 InCharge-AM-PM Remote Accessor] IN_FLOW|PHYSICAL @0xfffffffdf4618760
. *:v4:31300 KS N/A, KR N/A
. Open fd=104, conn June 19, 2012 6:56:07 PM GMT+01:00, disc N/A,
. 192.168.0.98:31300 -> 192.168.0.104:45935, tmo 9344 09:18:01 N/S 1/1
CI-EWHILEREAD-After reading "0" bytes of "15" maximum
<SYS>-ECONNRESET-Connection reset by peer; in file "/work/blackcurrent/DMT-9.0.0.X/1330/smarts/clsapi/ci_flow.c" at line 2503
The message "<SYS>-ECONNRESET-Connection reset by peer" indicates that the connection was closed by the remote peer, i.e. the IP-AM-PM domain.
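To gauge how often the disconnect occurs, the adapter log can be searched for the ECONNRESET marker. The following is a minimal sketch, assuming a shell with grep that supports -o (GNU and BSD grep do); the function names and the log file path are hypothetical and are not part of the Smarts tooling:

```shell
# Count the log lines that report "Connection reset by peer".
# Usage: count_resets <logfile>   (the log path is an assumption)
count_resets() {
    grep -c 'ECONNRESET' "$1"
}

# List every "local -> remote" endpoint pair found in the log's flow dumps,
# with a count per pair. This covers all flow dumps, not only the resets.
# Usage: list_endpoints <logfile>
list_endpoints() {
    grep -E -o '[0-9.]+:[0-9]+ -> [0-9.]+:[0-9]+' "$1" | sort | uniq -c
}
```

Running both against the ASAM log shows how many resets were logged and which remote addresses were involved.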
The remote repository accessor makes two connections from IP-AM/PM to ASAM. One is a regular two-way command/response connection, and the other is a logical one-way connection used to push subscription alerts to IP-AM/PM. The "timeout" parameter applies to the two-way command/response connection, which is used to make certain remote API calls. Timeouts are required because one end of the connection needs to know whether the other end has deadlocked or become unresponsive. However, if the remote server is busy, for example with discovery, reconfigure, post-processing, or other operations that cause heavy contention for the repository lock, a higher timeout value is needed. Although the two channels are separate, an error (such as a timeout) on one channel causes both connections to be closed and reconnected, so increasing the timeout may be necessary. This is more likely in an environment where multiple IP-AM/PM domains subscribe to a single ASAM domain. Note that the timeout itself does not affect the "logical" subscription channel; any status change at ASAM is sent to the AM immediately.
Card:
1. sm_adapter -s <domain> -b <broker> --subscribe=Card::.*::.*/pae > <domain>-CardAlarmlist.log
2. sm_adapter -s <domain> -b <broker> --subscribeProp=Card::.*::Status > <domain>-CardStatuslist.log
3. sm_adapter -s <domain> -b <broker> --subscribeProp=<Instrumentation Class Name>::.*::StatusFromPoll > <domain>-StatusFromPoll.log
4. sm_adapter -s <domain> -b <broker> --subscribeProp=<Instrumentation Class Name>::.*::StatusIsCriticalActive > <domain>-StatusIsCriticalActive.log
Note: The sm_adapter subscriptions to the "StatusFromPoll" and "StatusIsCriticalActive" attributes require the name of the Card Instrumentation class.
First, find the Instrumentation class of the Cards that are producing the false alarms.
You can get this using:
dmctl -s <Domain Name:NGNIP-APM2> -b <broker> get Card::<any Card instance having issue>::InstrumentedBy
This returns { <Instrumentation Class Name>::<Instrumentation Class Instance name> }
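The class name can be pulled out of that result so it can be reused in the StatusFromPoll and StatusIsCriticalActive subscriptions. The following is a minimal sketch, assuming the dmctl output has the "{ Class::Instance }" shape described above; the helper name instr_class and the sample class name are hypothetical:

```shell
# Extract the Instrumentation class name from InstrumentedBy output.
# Assumed input shape: { <Class>::<Instance> }
# Lines that do not match this shape are ignored.
instr_class() {
    sed -n 's/^[[:space:]]*{[[:space:]]*\([^:[:space:]]*\)::.*$/\1/p'
}

# Example with a made-up value; prints the part before the first "::".
echo '{ ExampleInstrClass::I-ExampleInstrClass-CARD-1 }' | instr_class

# In practice, pipe the dmctl command shown above into the helper:
#   dmctl -s DOMAIN -b BROKER get Card::INSTANCE::InstrumentedBy | instr_class
```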
Interface:
1. sm_adapter -s <domain> -b <broker> --subscribe=Interface::.*::.*/pae > <domain>-AlarmlistInterface.log
2. sm_adapter -s <domain> -b <broker> --subscribeProp=Interface::.*::Status > <domain>-StatusInterface.log
3. sm_adapter -s <domain> -b <broker> --subscribeProp=Interface::.*::OperStatus > <domain>-OperStatusInterface.log
Let these subscriptions run until an occurrence of the bulk Card/Interface Down alarms is observed.
The IP and ASAM adapter subscriptions were compared for the devices that re-notified.
The IP subscriptions show the Card/Interface Status changing from DOWN to UNKNOWN and then back to DOWN, whereas the ASAM subscriptions show the Card/Interface Status remaining DOWN throughout.
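The DOWN, UNKNOWN, DOWN pattern can also be searched for mechanically. The sm_adapter output format is not shown in this article, so the sketch below assumes the subscription log has first been reduced to one "<instance> <status>" pair per line, in time order; both that normalized format and the function name find_flaps are assumptions made for illustration:

```shell
# Print each instance whose status went DOWN -> UNKNOWN -> DOWN.
# Input (files or stdin): one "<instance> <status>" pair per line, in time
# order. This normalized format is an assumption, not raw sm_adapter output.
find_flaps() {
    awk '
        {
            inst = $1; st = $2
            # Flag the instance once, when DOWN follows UNKNOWN that followed DOWN.
            if (st == "DOWN" && prev[inst] == "UNKNOWN" && prev2[inst] == "DOWN" && !(inst in seen)) {
                print inst
                seen[inst] = 1
            }
            # Remember the last two statuses seen for this instance.
            prev2[inst] = prev[inst]
            prev[inst] = st
        }
    ' "$@"
}
```

An instance listed for the IP subscription log but not for the ASAM subscription log matches the behavior described above.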