Collecting from SRM Appliance fails in Usage Meter

Article ID: 328569

Products

VMware Cloud Director

Issue/Introduction

Symptoms:
After a fresh installation of Site Recovery Manager (SRM) 8.2, or an upgrade to SRM 8.2 from an earlier version, you experience these symptoms:
  • Usage Meter cannot connect to or collect from SRM 8.2.
  • The Usage Meter user interface (UI) shows an error similar to:

    There was a problem checking the certificate for <vcenter instance>.
     
  • For a freshly installed SRM server, the /var/log/usgmtr/um.log file contains entries similar to:

    INFO [ForkJoinPool-3-worker-1] auth.CertUtil: Checking certificate of example.com:9086
    2019-08-12 16:41:46,482  INFO [ForkJoinPool-3-worker-1] auth.CertUtil: Start SSL handshake
    2019-08-12 16:41:46,488  INFO [default-akka.actor.default-dispatcher-142] rest.PushQueues: Message sent: MonitorServerEvent(VcServerT)
    2019-08-12 16:41:46,494  WARN [ForkJoinPool-3-worker-1] auth.CertUtil: Unrecognized SSL message, plaintext connection?
    javax.net.ssl.SSLException: Unrecognized SSL message, plaintext connection?
            at sun.security.ssl.InputRecord.handleUnknownRecord(InputRecord.java:710)
            
  • For an upgraded SRM server, the /var/log/usgmtr/collector.log or /var/log/usgmtr/error.log file contains entries similar to:

    ERROR [Collector 2] srm.SrmCollector: VMware vCenter Site Recovery Manager collection failed: com.sun.xml.internal.ws.client.ClientTransportException: HTTP transport error: java.net.ConnectException: Connection timed out (Connection timed out)

    Note: The preceding log excerpts are only examples. Dates, times, and environment-specific values such as host names vary depending on your environment.
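
    To check quickly whether the log files contain the error signatures quoted above, a small search helper may be useful. The following is a minimal sketch, not a VMware-provided tool: the function name count_srm_symptoms is hypothetical, and the search patterns are taken only from the excerpts in this article, so they may not match every variant of the messages.

```shell
#!/bin/sh
# Hypothetical helper: count_srm_symptoms is not part of Usage Meter.
# Prints the number of lines in the given log files that match either
# error signature quoted in this article.
count_srm_symptoms() {
    # -h drops the file name prefix so that wc counts only the matching lines.
    # Unreadable or missing files are ignored.
    grep -h -E 'Unrecognized SSL message, plaintext connection\?|SrmCollector: .*collection failed' "$@" 2>/dev/null \
        | wc -l | tr -d ' '
}
```

    For example, run "count_srm_symptoms /var/log/usgmtr/um.log /var/log/usgmtr/collector.log /var/log/usgmtr/error.log". A result greater than 0 suggests that the symptoms described in this article are present.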


Cause

This issue occurs because the new Site Recovery Manager Appliance (SRM-VA) communicates on the default port 443, whereas the Site Recovery Manager server communicates on port 9086. Usage Meter still tries to connect to SRM on port 9086, so the connection fails when SRM uses a different port.
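
To confirm which port an SRM instance accepts TLS connections on, you can probe both ports from a machine that can reach it. The following is a minimal sketch under stated assumptions: the function name tls_port_open is hypothetical, and the openssl and timeout commands are assumed to be installed on the machine where you run it.

```shell
#!/bin/sh
# Hypothetical helper, not part of Usage Meter or SRM.
# Returns 0 if a TLS handshake with host:port succeeds, non-zero otherwise.
# Assumes the openssl and timeout commands are installed.
tls_port_open() {
    host=$1
    port=$2
    # Reading from /dev/null makes s_client exit once the handshake is done;
    # timeout stops the attempt after 10 seconds if the port does not answer.
    timeout 10 openssl s_client -connect "$host:$port" </dev/null >/dev/null 2>&1
}
```

For example, with srm.example.com as a placeholder host name, "tls_port_open srm.example.com 443 && echo reachable" checks the appliance port, and the same call with 9086 checks the port that Usage Meter tries to use.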

Resolution

This issue is resolved in VMware Usage Meter 3.6.1 Hot Patch 3, available at VMware Downloads.

Workaround:
To work around this issue if you do not want to upgrade:
  1. Get the IP addresses of the Site Recovery Manager servers.

    Note: You may need to run the "sysctl -w net.ipv4.conf.all.route_localnet=1" command to enable localhost/localnet route processing for your outbound interface.
     
  2. Add the following NAT rules to iptables, replacing SRM_IP_1 and SRM_IP_2 with the IP addresses from step 1:

    For example:

    iptables -t nat -A OUTPUT -p tcp --dst SRM_IP_1 --dport 9086 -j DNAT --to-destination SRM_IP_1:443
    iptables -t nat -A POSTROUTING -p tcp --dst SRM_IP_1 --dport 9086 -j MASQUERADE

    iptables -t nat -A OUTPUT -p tcp --dst SRM_IP_2 --dport 9086 -j DNAT --to-destination SRM_IP_2:443
    iptables -t nat -A POSTROUTING -p tcp --dst SRM_IP_2 --dport 9086 -j MASQUERADE

     
  3. Add SRM 8.2 and collect from it.
Note: The iptables rules are lost when the appliance restarts. After a restart, either rerun the commands or arrange for them to run automatically at system boot. For example, add the commands from step 2 to the /root/.bash_profile file.
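
The rules in step 2 follow the same pattern for every SRM server, so they can be generated instead of typed by hand. The following is a minimal sketch, not a VMware-provided script: the function name print_srm_redirect_rules is hypothetical, and it only prints the commands so that you can review them before running anything.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Usage Meter): prints the two NAT rules
# from step 2 for every SRM IP address passed as an argument.
# Nothing is changed on the system; the output is only text.
print_srm_redirect_rules() {
    for ip in "$@"; do
        # Redirect outbound connections for port 9086 to port 443 on the same host.
        echo "iptables -t nat -A OUTPUT -p tcp --dst $ip --dport 9086 -j DNAT --to-destination $ip:443"
        # Masquerade the traffic that is addressed to port 9086 on that host.
        echo "iptables -t nat -A POSTROUTING -p tcp --dst $ip --dport 9086 -j MASQUERADE"
    done
}
```

For example, "print_srm_redirect_rules 192.0.2.10 192.0.2.11" prints four commands, where the two addresses are placeholders for your SRM servers. After reviewing the output, you can run the commands as root, or save them for reuse after a restart.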