Introscope Enterprise Manager - MOM Hot Failover does not work on NAS.

Article ID: 41656

Products

CA Application Performance Management Agent (APM / Wily / Introscope) INTROSCOPE

Issue/Introduction

 Problem:

 MOM Hot Failover does not work. The IntroscopeEnterpriseManager.log on the Primary EM shows the following error:

 [ERROR] [Smartstor/Superstor Spool] [Manager] Can't write timeslice ######## to spool: java.io.SyncFailedException: sync failed

 

 Environment:

 This issue was observed on APM 9.7.1.16 but can occur on other versions, because the root cause is on the OS (UNIX/Linux) side.

 The Primary and Secondary Introscope installations are on a shared disk accessed through a Network Attached Storage (NAS) protocol such as:

  • Network File System (NFS)
  • Server Message Block (SMB)

 

 Cause:

 The Lock File feature is not turned on in the NFS system.

 The Smartstor Data directory is installed and configured to run on an NFS shared drive.

 When the Enterprise Managers start on the Primary and Secondary, both MOM instances acquire the primary lock file, which causes the problem.

 The EM logs show the details:

 Primary MOM IntroscopeEnterpriseManager.log:


[INFO] [main] [Manager.HotFailover] The Introscope Enterprise Manager is configured as a Primary EM
[INFO] [main] [Manager.HotFailover] Acquiring secondary lock...
[INFO] [main] [Manager.HotFailover] Acquired secondary lock
[INFO] [main] [Manager.HotFailover] Acquiring primary lock...
[INFO] [main] [Manager.HotFailover] Acquired primary lock
[INFO] [main] [Manager.HotFailover] Released secondary lock
[INFO] [main] [Manager.HotFailover] Proceeding with startup

 Secondary MOM IntroscopeEnterpriseManager.log:


[INFO] [main] [Manager.HotFailover] The Introscope Enterprise Manager is configured as a Secondary EM
[INFO] [main] [Manager.HotFailover] Acquiring primary lock...
[INFO] [main] [Manager.HotFailover] Acquired primary lock
[INFO] [main] [Manager.HotFailover] Trying to acquire secondary lock
[INFO] [main] [Manager.HotFailover] Acquired secondary lock
[INFO] [main] [Manager.HotFailover] Released secondary lock
[INFO] [main] [Manager.HotFailover] Proceeding with startup

 After the Primary MOM acquires the primary lock, the Secondary MOM should block when it tries to acquire the same primary lock.

 Instead, the Secondary MOM also acquired the Primary lock and proceeded with startup.
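
The comparison of the two logs can be sketched in Java. This is only an illustration: the class and method names are hypothetical, the matched strings are taken from the log excerpts above, and it assumes both log files cover the same startup window while the Primary EM is still running (a Secondary that acquires the primary lock after the Primary has gone down is normal failover, not this problem).

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper, not part of Introscope: flags the case where both
// EM logs show the primary lock being acquired and startup proceeding.
final class HotFailoverLogCheck {
    static final String ACQUIRED_PRIMARY = "[Manager.HotFailover] Acquired primary lock";
    static final String PROCEEDING = "[Manager.HotFailover] Proceeding with startup";

    // True if the log shows "Acquired primary lock" followed later by
    // "Proceeding with startup".
    static boolean startedAsPrimary(List<String> logLines) {
        boolean acquired = false;
        for (String line : logLines) {
            if (line.contains(ACQUIRED_PRIMARY)) {
                acquired = true;
            } else if (acquired && line.contains(PROCEEDING)) {
                return true;
            }
        }
        return false;
    }

    // True if both EMs appear to have started while holding the primary lock.
    static boolean bothStartedAsPrimary(List<String> primaryLog, List<String> secondaryLog) {
        return startedAsPrimary(primaryLog) && startedAsPrimary(secondaryLog);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: HotFailoverLogCheck <primary log> <secondary log>");
            return;
        }
        // ISO-8859-1 decodes any byte sequence, so odd characters in a log do not abort the read.
        List<String> primary = Files.readAllLines(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
        List<String> secondary = Files.readAllLines(Paths.get(args[1]), StandardCharsets.ISO_8859_1);
        System.out.println(bothStartedAsPrimary(primary, secondary)
                ? "Both EMs acquired the primary lock: file locking is probably not enforced."
                : "Only one EM (or none) started with the primary lock.");
    }
}
```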


The Primary and Secondary locks are simply two file locks, held on the files "primary_em.lck" and "secondary_em.lck" under <EM_Home>\config\internal\server.


 Because NFS failed to enforce the file lock, both MOM instances acquired the primary lock file and started as the Primary MOM.
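
One way to check whether the shared mount enforces locks between hosts is a small probe like the sketch below. It assumes the lock behaves like a standard java.nio exclusive file lock; the article does not state how the EM implements its locks, and the class name and arguments are hypothetical. Point it at a scratch file on the shared disk, not at the EM's own .lck files.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical probe: run it on one host, then again on the other host
// while the first run is still holding the lock.
final class SharedLockProbe {
    // Tries to take an exclusive lock without blocking.
    // Returns null when the lock is already held elsewhere.
    static FileLock tryExclusive(FileChannel channel) throws IOException {
        try {
            return channel.tryLock(); // null if another process holds the lock
        } catch (OverlappingFileLockException e) {
            return null;              // already locked inside this JVM
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: SharedLockProbe <scratch file on shared disk> [hold seconds]");
            return;
        }
        Path lockFile = Paths.get(args[0]);
        int holdSeconds = args.length > 1 ? Integer.parseInt(args[1]) : 60;
        try (FileChannel channel = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            FileLock lock = tryExclusive(channel);
            if (lock == null) {
                System.out.println("Lock is held elsewhere: locking is being enforced.");
                return;
            }
            System.out.println("Acquired lock, holding for " + holdSeconds
                    + " s. Run the probe on the other host now.");
            Thread.sleep(holdSeconds * 1000L);
            lock.release();
        }
    }
}
```

If the second host also reports that it acquired the lock while the first is still holding it, the share is not enforcing locks across clients, which matches the behavior seen in the EM logs.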

 

 Resolution:

 Enable the Lock File feature on the NFS share. NFS file locking is handled by the OS, not by Introscope.
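
The article does not say which OS setting was involved. On Linux NFS clients, one common reason for locks not being shared between hosts is the `nolock` mount option, or a `local_lock` value other than `none`. As an assumption-laden sketch, the following hypothetical Java helper scans /proc/mounts for NFS mounts carrying those options; other platforms and NAS protocols need a different check.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: flags NFS mounts whose locks are only local to the client.
final class NfsLockOptionCheck {
    // Expects one /proc/mounts line: device, mount point, fs type, options, dump, pass.
    static boolean locksAreClientLocal(String mountLine) {
        String[] fields = mountLine.trim().split("\\s+");
        if (fields.length < 4) {
            return false;
        }
        String fsType = fields[2];
        if (!fsType.equals("nfs") && !fsType.equals("nfs4")) {
            return false;
        }
        for (String option : fields[3].split(",")) {
            if (option.equals("nolock")) {
                return true;  // NLM locking disabled for this mount
            }
            if (option.startsWith("local_lock=") && !option.equals("local_lock=none")) {
                return true;  // some or all lock types are handled locally
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        for (String line : Files.readAllLines(Paths.get("/proc/mounts"), StandardCharsets.ISO_8859_1)) {
            if (locksAreClientLocal(line)) {
                System.out.println("Locks are client-local on: " + line);
            }
        }
    }
}
```

Any mount it reports that hosts <EM_Home> or the Smartstor Data directory is a candidate for remounting with locking enabled, subject to confirmation with the storage administrator.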

 

Environment

Release: CEMUGD00200-9.7-Introscope to CA Application-Performance Management-Upgrade Main