When two or more WCC nodes point to the same database, they form a cluster to share collection tasks and data. Clustering is provided by the Hazelcast module embedded in WCC.
If the cluster nodes cannot reach each other, the cluster does not form properly. As a side effect, the same collector tasks (JOB, MACHINE, ALARM collections) run on both WCC nodes, which can lead to locking issues. Repeated SELECT queries against the WCC database table CFG_COLLECTOR_TASKS show the tasks constantly switching from one collector node to the other. In a healthy cluster, tasks do not switch that often.
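The switching symptom can be quantified once a few query results have been collected. The following is a minimal sketch only. This article does not describe the column layout of CFG_COLLECTOR_TASKS, so the function assumes each query result has already been reduced to a mapping of task name to owning node. The function name and the snapshot shape are hypothetical.

```python
from typing import Dict, List


def count_owner_switches(snapshots: List[Dict[str, str]]) -> Dict[str, int]:
    """Count how often each task changed owner between consecutive snapshots.

    Each snapshot is a hypothetical {task_name: node_name} mapping built from
    one SELECT against CFG_COLLECTOR_TASKS (the real column names are not
    given in this article and must be taken from your own schema).
    """
    switches: Dict[str, int] = {}
    previous: Dict[str, str] = {}
    for snapshot in snapshots:
        for task, node in snapshot.items():
            switches.setdefault(task, 0)
            # A switch is any change of owner compared with the last time
            # this task was seen.
            if task in previous and previous[task] != node:
                switches[task] += 1
            previous[task] = node
    return switches
```

A task whose count keeps growing across samples taken a few seconds apart matches the behavior described above. A count that stays at zero suggests the task is owned by a single node.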
Workload Automation AutoSys
The collector port (default 7004) is blocked or otherwise unavailable for connections between the nodes.
Consider a setup with two nodes: wcc-node1.example.com (IP address XX.XX.XX.XX) and wcc-node2.example.com (IP address YY.YY.YY.YY). wcc-node1 is already running. wcc-node2 is being started and should connect to the collector on wcc-node1 to establish the cluster.
In the CA-wcc.log on wcc-node2, only one member shows up:
INFO | jvm 1 | 2024/08/21 12:27:59 | 19 | @tomcat-resource <WrapperStartStopAppMain> [[]] WARN #ClusterProviderDAO #(96) Lets give 5 seconds for <default-collector> to get added, instancename : wcc-node2.example.com
INFO | jvm 1 | 2024/08/21 12:28:04 | 24 | @tomcat-resource <WrapperStartStopAppMain> [[]] WARN #ClusterProviderDAO #(98) Wait complete, check for <default-collector> node existance
INFO | jvm 1 | 2024/08/21 12:28:04 | 24 | @tomcat-resource <WrapperStartStopAppMain> [[]] WARN #ClusterProviderDAO #(108) Check whether <default-collector> node exists - node wcc-node2.example.com, clusterport 7004
INFO | jvm 1 | 2024/08/21 12:28:04 | 24 | @tomcat-resource <WrapperStartStopAppMain> [[]] WARN #ClusterProviderDAO #(111) is collector node exists with host wcc-node2.example.com and port 7004? true
INFO | jvm 1 | 2024/08/21 12:28:06 | 25 | @tomcat-resource <WrapperStartStopAppMain> [[]] INFO #ClientInvocationService #(108) hz.client_1 [EMDXWpbvpo] [4.2.1] Running with 2 response threads, dynamic=true
INFO | jvm 1 | 2024/08/21 12:28:06 | 25 | @tomcat-resource <WrapperStartStopAppMain> [[]] INFO #LifecycleService #(108) hz.client_1 [EMDXWpbvpo] [4.2.1] HazelcastClient 4.2.1 (20210630 - 06a4018) is STARTING
INFO | jvm 1 | 2024/08/21 12:28:06 | 25 | @tomcat-resource <WrapperStartStopAppMain> [[]] INFO #LifecycleService #(108) hz.client_1 [EMDXWpbvpo] [4.2.1] HazelcastClient 4.2.1 (20210630 - 06a4018) is STARTED
INFO | jvm 1 | 2024/08/21 12:28:06 | 25 | @tomcat-resource <WrapperStartStopAppMain> [[]] INFO #ClientConnectionManager #(108) hz.client_1 [EMDXWpbvpo] [4.2.1] Trying to connect to cluster: EMDXWpbvpo
INFO | jvm 1 | 2024/08/21 12:28:06 | 25 | @tomcat-resource <WrapperStartStopAppMain> [[]] INFO #ClientConnectionManager #(108) hz.client_1 [EMDXWpbvpo] [4.2.1] Trying to connect to [wcc-node1.example.com]:7004
INFO | jvm 1 | 2024/08/21 12:28:27 | 44 | @tomcat-resource <WrapperStartStopAppMain> [[]] WARN #ClientConnectionManager #(108) hz.client_1 [EMDXWpbvpo] [4.2.1] Exception during initial connection to [wcc-node1.example.com]:7004: com.hazelcast.core.HazelcastException: java.io.IOException: Connection timed out: no further information to address wcc-node1.example.com/XX.XX.XX.XX:7004
INFO | jvm 1 | 2024/08/21 12:28:27 | 44 | @tomcat-resource <WrapperStartStopAppMain> [[]] INFO #ClientConnectionManager #(108) hz.client_1 [EMDXWpbvpo] [4.2.1] Trying to connect to [wcc-node2.example.com]:7004
INFO | jvm 1 | 2024/08/21 12:28:27 | 44 | @tomcat-resource <WrapperStartStopAppMain> [[]] INFO #LifecycleService #(108) hz.client_1 [EMDXWpbvpo] [4.2.1] HazelcastClient 4.2.1 (20210630 - 06a4018) is CLIENT_CONNECTED
INFO | jvm 1 | 2024/08/21 12:28:27 | 44 | @tomcat-resource <WrapperStartStopAppMain> [[]] INFO #ClientConnectionManager #(108) hz.client_1 [EMDXWpbvpo] [4.2.1] Authenticated with server [wcc-node2.example.com]:7004:5749c579-92a9-43ab-9a4f-7dd22d645e89, server version: 4.2.1, local address: /YY.YY.YY.YY:49810
INFO | jvm 1 | 2024/08/21 12:28:27 | 44 | @tomcat-resource <WrapperStartStopAppMain> [[]] INFO #Diagnostics #(108) hz.client_1 [EMDXWpbvpo] [4.2.1] Diagnostics disabled. To enable add -Dhazelcast.diagnostics.enabled=true to the JVM arguments.
INFO | jvm 1 | 2024/08/21 12:28:27 | 44 | @tomcat-resource <hz.client_1.event-3> [[]] INFO #ClientClusterService #(108) hz.client_1 [EMDXWpbvpo] [4.2.1]
INFO | jvm 1 | 2024/08/21 12:28:27 | 44 |
INFO | jvm 1 | 2024/08/21 12:28:27 | 44 | Members [1] {
INFO | jvm 1 | 2024/08/21 12:28:27 | 44 | Member [wcc-node2.example.com]:7004 - 5749c579-92a9-43ab-9a4f-7dd22d645e89
INFO | jvm 1 | 2024/08/21 12:28:27 | 44 | }
INFO | jvm 1 | 2024/08/21 12:28:27 | 44 |
INFO | jvm 1 | 2024/08/21 12:28:28 | 44 | @tomcat-resource <WrapperStartStopAppMain> [[]] INFO #ClientStatisticsService #(69) Client statistics is enabled with period 5 seconds.
INFO | jvm 1 | 2024/08/21 12:28:28 | 45 | Aug 21, 2024 12:28:28 PM com.ca.wcc.access.resources.BaseAccessProviderFactory initAccessConfig
INFO | jvm 1 | 2024/08/21 12:28:28 | 45 | INFO: Loading EEM configuration... eemAppPropertySuffix='.autosysApp'
INFO | jvm 1 | 2024/08/21 12:28:28 | 45 | Aug 21, 2024 12:28:28 PM com.ca.wcc.access
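The key evidence in the excerpt above is the Hazelcast member list, which reports Members [1] and lists only wcc-node2. As a rough aid for checking this on a larger log, the sketch below extracts the most recent member list from the log text. The function name is hypothetical. The patterns are inferred only from the "Members [N] {" and "Member [host]:port - id" lines shown in this excerpt, so other WCC or Hazelcast versions may format them differently.

```python
import re
from typing import List, Tuple

# Patterns inferred from the log excerpt in this article (an assumption for
# any other WCC/Hazelcast version).
MEMBERS_HEADER = re.compile(r"Members \[(\d+)\] \{")
MEMBER_LINE = re.compile(r"Member \[([^\]]+)\]:(\d+)")


def latest_member_list(log_text: str) -> Tuple[int, List[str]]:
    """Return (declared member count, ["host:port", ...]) for the last
    member list found in the log text, or (0, []) if none is present."""
    headers = list(MEMBERS_HEADER.finditer(log_text))
    if not headers:
        return 0, []
    last = headers[-1]
    # The member list ends at the first closing brace after the header.
    end = log_text.find("}", last.end())
    block = log_text[last.end(): end if end != -1 else len(log_text)]
    members = [f"{host}:{port}" for host, port in MEMBER_LINE.findall(block)]
    return int(last.group(1)), members
```

On a two-node setup, a result with a single member on either node suggests that the node never joined its peer, which matches the connection timeout to wcc-node1 seen earlier in the log.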
Ensure that every WCC node's hostname can be resolved by the other nodes, and that the collector/cluster port (default 7004) is not blocked between the WCC nodes. Restart WCC on all nodes for the change to take effect.
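Both conditions can be checked from each node before restarting WCC. The sketch below uses only the Python standard library and is not a WCC tool. The function name is hypothetical, and the port default of 7004 is the default collector port mentioned above, so adjust it if your installation uses a different one.

```python
import socket
from typing import Tuple


def check_collector_port(host: str, port: int = 7004,
                         timeout: float = 5.0) -> Tuple[bool, str]:
    """Resolve the host name, then try a plain TCP connection to the port.

    Returns (True, detail) if the connection is accepted, otherwise
    (False, detail) describing whether resolution or the connection failed.
    """
    try:
        address = socket.gethostbyname(host)
    except socket.gaierror as exc:
        return False, f"cannot resolve {host}: {exc}"
    try:
        # A successful connect only shows that the port is reachable and
        # something is listening; it does not validate the cluster itself.
        with socket.create_connection((address, port), timeout=timeout):
            return True, f"{host} ({address}) accepts connections on port {port}"
    except OSError as exc:
        return False, f"{host} ({address}) port {port} unreachable: {exc}"
```

For example, calling check_collector_port("wcc-node1.example.com") on wcc-node2 while WCC is running on wcc-node1 should return True once the port is open. Run the check in both directions, because a firewall rule may block only one of them.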