Upgrade of TLS-enabled VMware Tanzu GemFire service instances fails when upgrading to PCC 1.9.0

Article ID: 294317

Products

VMware Tanzu Gemfire

Issue/Introduction

When TLS-enabled service instances are upgraded to VMware Tanzu GemFire 1.9.0, the upgrade fails.

Errors like the following appear in the GemFire locator and server logs:
[warning 2019/10/23 18:35:50.254 UTC locator-3df92bb6-816b-469d-a37f-dc4dc3c713b0 <ThreadsMonitor> tid=0x10] Thread <132> that was executed at <23 Oct 2019 18:34:27 UTC> has been stuck for <82.465 seconds> and number of thread monitor iteration <1> 
  Thread Name <Pooled High Priority Message Processor 3>
  Thread state <TIMED_WAITING>
  Waiting on <org.apache.geode.internal.tcp.ConnectionTable$PendingConnection@efe024c>
  Executor Group <PooledExecutorWithDMStats>
  Monitored metric <ResourceManagerStats.numThreadsStuck>
  Thread Stack:
  java.lang.Object.wait(Native Method)
  org.apache.geode.internal.tcp.ConnectionTable$PendingConnection.waitForConnect(ConnectionTable.java:1204)
  org.apache.geode.internal.tcp.ConnectionTable.getSharedConnection(ConnectionTable.java:426)
  org.apache.geode.internal.tcp.ConnectionTable.get(ConnectionTable.java:598)
  org.apache.geode.internal.tcp.TCPConduit.getConnection(TCPConduit.java:947)
  org.apache.geode.distributed.internal.direct.DirectChannel.getConnections(DirectChannel.java:557)
  org.apache.geode.distributed.internal.direct.DirectChannel.sendToMany(DirectChannel.java:336)
  org.apache.geode.distributed.internal.direct.DirectChannel.sendToOne(DirectChannel.java:251)
  org.apache.geode.distributed.internal.direct.DirectChannel.send(DirectChannel.java:616)
  org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.directChannelSend(GMSMembershipManager.java:1692)
  org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.send(GMSMembershipManager.java:1870)
  org.apache.geode.distributed.internal.ClusterDistributionManager.sendViaMembershipManager(ClusterDistributionManager.java:2865)
  org.apache.geode.distributed.internal.ClusterDistributionManager.sendOutgoing(ClusterDistributionManager.java:2785)
  org.apache.geode.distributed.internal.ClusterDistributionManager.sendMessage(ClusterDistributionManager.java:2824)
  org.apache.geode.distributed.internal.ClusterDistributionManager.putOutgoing(ClusterDistributionManager.java:1523)
  org.apache.geode.internal.cache.UpdateAttributesProcessor$ProfileReplyMessage.send(UpdateAttributesProcessor.java:395)
  org.apache.geode.internal.cache.UpdateAttributesProcessor$UpdateAttributesMessage.process(UpdateAttributesProcessor.java:320)
  org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:367)
  org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:432)
  java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  org.apache.geode.distributed.internal.ClusterDistributionManager.runUntilShutdown(ClusterDistributionManager.java:949)
  org.apache.geode.distributed.internal.ClusterDistributionManager.doHighPriorityThread(ClusterDistributionManager.java:827)
  org.apache.geode.distributed.internal.ClusterDistributionManager$$Lambda$76/173060252.invoke(Unknown Source)
  org.apache.geode.internal.logging.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:121)
  org.apache.geode.internal.logging.LoggingThreadFactory$$Lambda$74/15094126.run(Unknown Source)
 
[info 2019/10/16 01:25:22.967 UTC <locator request thread 3> tid=0x14ad] Exception in processing request from 10.213.121.123
javax.net.ssl.SSLHandshakeException: Received fatal alert: certificate_unknown
	at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
	at sun.security.ssl.Alerts.getSSLException(Alerts.java:154)
	at sun.security.ssl.SSLSocketImpl.recvAlert(SSLSocketImpl.java:2020)
	at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1127)
	at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1367)
	at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1395)
	at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1379)
	at org.apache.geode.internal.net.SocketCreator.handshakeIfSocketIsSSL(SocketCreator.java:986)
	at org.apache.geode.distributed.internal.tcpserver.TcpServer.lambda$processRequest$0(TcpServer.java:354)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
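
To check whether a log file shows the same symptoms, the two signatures in the excerpts above can be searched for mechanically. The following is a minimal sketch, not a supported tool: the function name `scan_log` and the log file paths passed on the command line are assumptions made for illustration, and the patterns are copied from the sample log lines only.

```python
import re
import sys

# Patterns copied from the log excerpts in this article (illustrative only).
STUCK_THREAD = re.compile(
    r"Thread <\d+> that was executed at <[^>]*> has been stuck for <[\d.]+ seconds>"
)
HANDSHAKE_FAILURE = (
    "javax.net.ssl.SSLHandshakeException: Received fatal alert: certificate_unknown"
)


def scan_log(lines):
    """Count occurrences of the two symptoms in an iterable of log lines."""
    counts = {"stuck_threads": 0, "handshake_failures": 0}
    for line in lines:
        if STUCK_THREAD.search(line):
            counts["stuck_threads"] += 1
        if HANDSHAKE_FAILURE in line:
            counts["handshake_failures"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical usage: python scan_log.py <locator-or-server-log> ...
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            print(path, scan_log(handle))
```

A non-zero count for both symptoms on a TLS-enabled instance during an upgrade to 1.9.0 is consistent with the issue described here, but it is not proof on its own, since either message can have other causes.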


Environment

Product Version: 1.9

Resolution

A product bug prevented TLS-enabled VMware Tanzu GemFire service instances from being upgraded to VMware Tanzu GemFire 1.9.0 from any lower version. To fix this, VMware Tanzu GemFire 1.9.1 was released and 1.9.0 was withdrawn. If you are planning to upgrade from 1.8.x to 1.9.0, choose 1.9.2 or higher instead (see the release notes for VMware Tanzu GemFire 1.9.x).

Upgrades of non-TLS service instances should succeed. However, because 1.9.0 has been withdrawn, use the latest VMware Tanzu GemFire 1.9.x version instead.
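
The version guidance above can be expressed as a simple pre-upgrade check. This is a sketch under stated assumptions: the helper names are hypothetical, version strings are assumed to be plain dotted integers such as `1.9.2`, and the minimum of 1.9.2 is taken directly from this article's recommendation.

```python
# Minimum recommended target, taken from this article's guidance (1.9.2 or higher).
MIN_RECOMMENDED_TARGET = (1, 9, 2)


def parse_version(text):
    """Convert a dotted version string such as '1.9.2' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_recommended_target(version):
    """Return True if the target version meets the article's recommendation."""
    return parse_version(version) >= MIN_RECOMMENDED_TARGET


if __name__ == "__main__":
    for candidate in ("1.9.0", "1.9.2"):
        print(candidate, is_recommended_target(candidate))
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as treating `1.10.0` as lower than `1.9.2`.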