When two or more WCC nodes point to the same database, they form a cluster to share collection tasks and data. Clustering is provided by the Hazelcast module embedded in WCC.
If the cluster nodes cannot reach each other, the cluster does not form properly. As a side effect, the same collector tasks (JOB, MACHINE, ALARM collections) run on both WCC nodes, potentially leading to locking issues. A repeated select query against the WCC database table CFG_COLLECTOR_TASKS shows the tasks constantly switching from one collector node to the other. In a healthy cluster they do not switch that often.
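The "constantly switching" symptom can be made concrete by comparing successive snapshots of which node owns each task. The sketch below is only an illustration: the column names of CFG_COLLECTOR_TASKS are not given here, so the query itself is left out, and the function name and the snapshot shape (a mapping from task name to collector node) are hypothetical.

```python
from collections import Counter


def count_owner_switches(snapshots):
    """Count, per task, how often the owning node changed between
    consecutive snapshots.

    `snapshots` is a list of dicts mapping task name -> collector node,
    one dict per run of the select query (hypothetical shape).
    """
    switches = Counter()
    for previous, current in zip(snapshots, snapshots[1:]):
        for task, node in current.items():
            # Only count a switch when the task was seen before
            # and is now assigned to a different node.
            if task in previous and previous[task] != node:
                switches[task] += 1
    return dict(switches)
```

A task whose count keeps growing across a short sampling window is the pattern described above. A task that stays on one node does not appear in the result.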
Workload Automation AutoSys
The collector port (default 7004) is blocked or otherwise unavailable for connectivity between the nodes.
Consider a setup with two nodes: wcc-node1.example.com (IP address: XX.XX.XX.XX) and wcc-node2.example.com (IP address: YY.YY.YY.YY). wcc-node1 is already running. wcc-node2 is being started and should connect to wcc-node1's collector to establish the cluster.
In the CA-wcc.log on wcc-node2, the connection to wcc-node1 on port 7004 times out and only one member shows up:
INFO | jvm 1 | 2024/08/21 12:27:59 | 19 | @tomcat-resource <WrapperStartStopAppMain> [[]] WARN #ClusterProviderDAO #(96) Lets give 5 seconds for <default-collector> to get added, instancename : wcc-node2.example.com
INFO | jvm 1 | 2024/08/21 12:28:04 | 24 | @tomcat-resource <WrapperStartStopAppMain> [[]] WARN #ClusterProviderDAO #(98) Wait complete, check for <default-collector> node existance
INFO | jvm 1 | 2024/08/21 12:28:04 | 24 | @tomcat-resource <WrapperStartStopAppMain> [[]] WARN #ClusterProviderDAO #(108) Check whether <default-collector> node exists - node wcc-node2.example.com, clusterport 7004
INFO | jvm 1 | 2024/08/21 12:28:04 | 24 | @tomcat-resource <WrapperStartStopAppMain> [[]] WARN #ClusterProviderDAO #(111) is collector node exists with host wcc-node2.example.com and port 7004? true
INFO | jvm 1 | 2024/08/21 12:28:06 | 25 | @tomcat-resource <WrapperStartStopAppMain> [[]] INFO #ClientInvocationService #(108) hz.client_1 [EMDXWpbvpo] [4.2.1] Running with 2 response threads, dynamic=true
INFO | jvm 1 | 2024/08/21 12:28:06 | 25 | @tomcat-resource <WrapperStartStopAppMain> [[]] INFO #LifecycleService #(108) hz.client_1 [EMDXWpbvpo] [4.2.1] HazelcastClient 4.2.1 (20210630 - 06a4018) is STARTING
INFO | jvm 1 | 2024/08/21 12:28:06 | 25 | @tomcat-resource <WrapperStartStopAppMain> [[]] INFO #LifecycleService #(108) hz.client_1 [EMDXWpbvpo] [4.2.1] HazelcastClient 4.2.1 (20210630 - 06a4018) is STARTED
INFO | jvm 1 | 2024/08/21 12:28:06 | 25 | @tomcat-resource <WrapperStartStopAppMain> [[]] INFO #ClientConnectionManager #(108) hz.client_1 [EMDXWpbvpo] [4.2.1] Trying to connect to cluster: EMDXWpbvpo
INFO | jvm 1 | 2024/08/21 12:28:06 | 25 | @tomcat-resource <WrapperStartStopAppMain> [[]] INFO #ClientConnectionManager #(108) hz.client_1 [EMDXWpbvpo] [4.2.1] Trying to connect to [wcc-node1.example.com]:7004
INFO | jvm 1 | 2024/08/21 12:28:27 | 44 | @tomcat-resource <WrapperStartStopAppMain> [[]] WARN #ClientConnectionManager #(108) hz.client_1 [EMDXWpbvpo] [4.2.1] Exception during initial connection to [wcc-node1.example.com]:7004: com.hazelcast.core.HazelcastException: java.io.IOException: Connection timed out: no further information to address wcc-node1.example.com/XX.XX.XX.XX:7004
INFO | jvm 1 | 2024/08/21 12:28:27 | 44 | @tomcat-resource <WrapperStartStopAppMain> [[]] INFO #ClientConnectionManager #(108) hz.client_1 [EMDXWpbvpo] [4.2.1] Trying to connect to [wcc-node2.example.com]:7004
INFO | jvm 1 | 2024/08/21 12:28:27 | 44 | @tomcat-resource <WrapperStartStopAppMain> [[]] INFO #LifecycleService #(108) hz.client_1 [EMDXWpbvpo] [4.2.1] HazelcastClient 4.2.1 (20210630 - 06a4018) is CLIENT_CONNECTED
INFO | jvm 1 | 2024/08/21 12:28:27 | 44 | @tomcat-resource <WrapperStartStopAppMain> [[]] INFO #ClientConnectionManager #(108) hz.client_1 [EMDXWpbvpo] [4.2.1] Authenticated with server [wcc-node2.example.com]:7004:5749c579-92a9-43ab-9a4f-7dd22d645e89, server version: 4.2.1, local address: /YY.YY.YY.YY:49810
INFO | jvm 1 | 2024/08/21 12:28:27 | 44 | @tomcat-resource <WrapperStartStopAppMain> [[]] INFO #Diagnostics #(108) hz.client_1 [EMDXWpbvpo] [4.2.1] Diagnostics disabled. To enable add -Dhazelcast.diagnostics.enabled=true to the JVM arguments.
INFO | jvm 1 | 2024/08/21 12:28:27 | 44 | @tomcat-resource <hz.client_1.event-3> [[]] INFO #ClientClusterService #(108) hz.client_1 [EMDXWpbvpo] [4.2.1]
INFO | jvm 1 | 2024/08/21 12:28:27 | 44 |
INFO | jvm 1 | 2024/08/21 12:28:27 | 44 | Members [1] {
INFO | jvm 1 | 2024/08/21 12:28:27 | 44 | Member [wcc-node2.example.com]:7004 - 5749c579-92a9-43ab-9a4f-7dd22d645e89
INFO | jvm 1 | 2024/08/21 12:28:27 | 44 | }
INFO | jvm 1 | 2024/08/21 12:28:27 | 44 |
INFO | jvm 1 | 2024/08/21 12:28:28 | 44 | @tomcat-resource <WrapperStartStopAppMain> [[]] INFO #ClientStatisticsService #(69) Client statistics is enabled with period 5 seconds.
INFO | jvm 1 | 2024/08/21 12:28:28 | 45 | Aug 21, 2024 12:28:28 PM com.ca.wcc.access.resources.BaseAccessProviderFactory initAccessConfig
INFO | jvm 1 | 2024/08/21 12:28:28 | 45 | INFO: Loading EEM configuration... eemAppPropertySuffix='.autosysApp'
INFO | jvm 1 | 2024/08/21 12:28:28 | 45 | Aug 21, 2024 12:28:28 PM com.ca.wcc.access
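The two signals in the excerpt above (the "Exception during initial connection" warning and the "Members [1]" block) can be picked out of a long log with a small script. This is a minimal sketch that assumes the message wording matches the excerpt exactly. The function name is hypothetical and the patterns are not taken from any product documentation.

```python
import re

# Patterns based only on the wording seen in the log excerpt.
MEMBERS_RE = re.compile(r"Members \[(\d+)\] \{")
FAILED_RE = re.compile(
    r"Exception during initial connection to \[([^\]]+)\]:(\d+)"
)


def summarize_cluster_log(lines):
    """Return (last reported member count, list of (host, port) that failed).

    The member count is None if no "Members [N] {" line was seen.
    """
    member_count = None
    failed = []
    for line in lines:
        members = MEMBERS_RE.search(line)
        if members:
            # Keep the most recent membership report.
            member_count = int(members.group(1))
        failure = FAILED_RE.search(line)
        if failure:
            failed.append((failure.group(1), int(failure.group(2))))
    return member_count, failed
```

For example, `summarize_cluster_log(open("CA-wcc.log", errors="replace"))` on wcc-node2 would report a member count of 1 together with the failed address of wcc-node1 in the scenario described here.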
Ensure all WCC nodes can resolve each other's host names over the network, and that the collector/cluster port (default 7004) is not blocked for communication between the WCC nodes. Restart WCC on all nodes for the change to take effect.
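Both conditions can be checked from each node before restarting WCC. The following is a minimal sketch, not a WCC utility: the function name is hypothetical, and it only tests name resolution and a plain TCP connection to the collector port. A successful connection requires WCC to be running and listening on the target node.

```python
import socket


def check_node(host, port=7004, timeout=5.0):
    """Return (resolved_ip, reachable) for the given host and port.

    resolved_ip is None when the host name cannot be resolved.
    """
    try:
        ip = socket.gethostbyname(host)
    except socket.gaierror:
        return None, False
    try:
        # A completed TCP handshake means the port is open and not filtered.
        with socket.create_connection((ip, port), timeout=timeout):
            return ip, True
    except OSError:
        # Covers timeouts, refused connections, and unreachable networks.
        return ip, False
```

Run it in both directions, for example `check_node("wcc-node1.example.com")` on wcc-node2 and `check_node("wcc-node2.example.com")` on wcc-node1, since a firewall rule may block only one direction.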