HBase RegionServer fails after startup

Article ID: 294600

Products

Services Suite

Issue/Introduction

Symptoms:

Each time the HBase RegionServer starts up, it fails after a certain amount of time.  

Environment


Cause

The clock skew between the HBase Master and the RegionServer nodes exceeds the maximum that the Master tolerates.
 

RCA

Whenever the RegionServer tries to connect to the HBase Master, the Master verifies that the time reported by the RegionServer is within a tolerable range of its own clock. If it is not, the Master rejects the RegionServer, and the RegionServer shuts itself down. Check the RegionServer log file for error messages similar to the following:

2017-08-01 14:51:15,984 FATAL [regionserver/hdw3.example.com/192.0.2.1:16020] regionserver.HRegionServer: ABORTING region server hdw3.example.com,16020,1501570274049: Unhandled: org.apache.hadoop.hbase.ClockOutOfSyncException: Server hdw3.example.com,16020,1501570274049 has been rejected; Reported time is too far out of sync with master. Time difference of 311432ms > max allowed of 30000ms
 at org.apache.hadoop.hbase.master.ServerManager.checkClockSkew(ServerManager.java:388)
 at org.apache.hadoop.hbase.master.ServerManager.regionServerStartup(ServerManager.java:262)
 at org.apache.hadoop.hbase.master.MasterRpcServices.regionServerStartup(MasterRpcServices.java:348)
......
org.apache.hadoop.hbase.ClockOutOfSyncException: org.apache.hadoop.hbase.ClockOutOfSyncException: Server hdw3.example.com,16020,1501570274049 has been rejected; Reported time is too far out of sync with master. Time difference of 311432ms > max allowed of 30000ms
 at org.apache.hadoop.hbase.master.ServerManager.checkClockSkew(ServerManager.java:388)
 at org.apache.hadoop.hbase.master.ServerManager.regionServerStartup(ServerManager.java:262)
 at org.apache.hadoop.hbase.master.MasterRpcServices.regionServerStartup(MasterRpcServices.java:348)
 at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:8615)
......
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.ClockOutOfSyncException): org.apache.hadoop.hbase.ClockOutOfSyncException: Server hdw3.example.com,16020,1501570274049 has been rejected; Reported time is too far out of sync with master. Time difference of 311432ms > max allowed of 30000ms
 at org.apache.hadoop.hbase.master.ServerManager.checkClockSkew(ServerManager.java:388)
 at org.apache.hadoop.hbase.master.ServerManager.regionServerStartup(ServerManager.java:262)
 at org.apache.hadoop.hbase.master.MasterRpcServices.regionServerStartup(MasterRpcServices.java:348)
2017-08-01 14:51:16,013 INFO [regionserver/hdw3.example.com/192.0.2.1:16020] regionserver.HRegionServer: STOPPED: Unhandled: org.apache.hadoop.hbase.ClockOutOfSyncException: Server hdw3.example.com,16020,1501570274049 has been rejected; Reported time is too far out of sync with master. Time difference of 311432ms > max allowed of 30000ms
 at org.apache.hadoop.hbase.master.ServerManager.checkClockSkew(ServerManager.java:388)
 at org.apache.hadoop.hbase.master.ServerManager.regionServerStartup(ServerManager.java:262)
 at org.apache.hadoop.hbase.master.MasterRpcServices.regionServerStartup(MasterRpcServices.java:348)
 at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:8615)
 at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2114)
 at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101)
 at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
 at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
 at java.lang.Thread.run(Thread.java:745)
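
When scanning a large RegionServer log, it can help to pull the reported skew and the allowed maximum out of the exception message programmatically. The following is a minimal sketch that assumes the message keeps the wording shown in the log above; the function name parse_clock_skew and the sample string are illustrative and not part of HBase.

```python
import re

# Matches the message fragment seen in the log excerpt above.
# Assumption: the wording is the same in the HBase version being checked.
SKEW_PATTERN = re.compile(r"Time difference of (\d+)ms > max allowed of (\d+)ms")


def parse_clock_skew(log_line):
    """Return (reported_skew_ms, max_allowed_ms), or None if the line has no skew message."""
    match = SKEW_PATTERN.search(log_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Shortened copy of the message from the log excerpt.
sample = ("Reported time is too far out of sync with master. "
          "Time difference of 311432ms > max allowed of 30000ms")

skew_ms, max_ms = parse_clock_skew(sample)
print(f"reported skew: {skew_ms} ms, allowed: {max_ms} ms, "
      f"over the limit by {skew_ms - max_ms} ms")
```

For the excerpt above, the parsed values are 311432 ms and 30000 ms, so the RegionServer clock was off by more than five minutes.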

Resolution

The maximum tolerable clock skew is set by the HBase parameter hbase.master.maxclockskew, which defaults to 30000 ms (30 seconds). In the log above, the reported difference of 311432 ms is more than ten times that limit.
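
To see which limit is actually in effect, the parameter can be looked up in the cluster's hbase-site.xml. The sketch below assumes the standard Hadoop-style configuration layout (property elements with name and value children) and falls back to the 30000 ms default stated above when the property is absent. The function name read_max_clock_skew and the sample value of 60000 are hypothetical.

```python
import xml.etree.ElementTree as ET

DEFAULT_MAX_CLOCK_SKEW_MS = 30000  # default stated in this article


def read_max_clock_skew(hbase_site_xml):
    """Return hbase.master.maxclockskew in ms from hbase-site.xml text, or the default."""
    root = ET.fromstring(hbase_site_xml)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name == "hbase.master.maxclockskew" and value:
            return int(value)
    return DEFAULT_MAX_CLOCK_SKEW_MS


# Hypothetical hbase-site.xml content, for illustration only.
sample_site = """
<configuration>
  <property>
    <name>hbase.master.maxclockskew</name>
    <value>60000</value>
  </property>
</configuration>
"""

print(read_max_clock_skew(sample_site))
print(read_max_clock_skew("<configuration/>"))
```

In practice the XML text would be read from the hbase-site.xml file on the Master node; its location depends on the installation.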

There are two options to solve this issue:
 

1. Synchronize the time on all nodes in the Hadoop cluster. NTP is the recommended way to do this.
 

2. Increase the value of hbase.master.maxclockskew. This option is not recommended. Only consider this approach if synchronizing time on all nodes is not possible.

a. From the Dashboard in the Ambari web console, choose HBase -> Configs -> Advanced -> "Custom hbase-site"
b. If hbase.master.maxclockskew is not listed, click "Add Property" and enter it as a key/value pair (the value is in milliseconds)
c. Save the change and restart the HBase service
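
After applying either option, it is worth confirming that every node's clock is within the limit before restarting the RegionServers. The sketch below mirrors the comparison the Master makes (absolute time difference against the maximum skew). It is not HBase code: the helper names, the hdw1.example.com host, and the clock readings are hypothetical, and the SSH helper assumes key-based SSH access and GNU date on the nodes.

```python
import subprocess


def read_remote_epoch_ms(host):
    """Read a host's clock over SSH as epoch milliseconds.

    Assumption: key-based SSH works and the remote `date` is GNU date,
    which supports the %3N (milliseconds) format.
    """
    result = subprocess.run(["ssh", host, "date", "+%s%3N"],
                            capture_output=True, text=True, check=True)
    return int(result.stdout.strip())


def find_out_of_sync_nodes(master_ms, node_ms, max_skew_ms=30000):
    """Return {host: skew_ms} for nodes whose clock differs from the
    master's reading by more than max_skew_ms."""
    return {host: abs(ms - master_ms)
            for host, ms in node_ms.items()
            if abs(ms - master_ms) > max_skew_ms}


# Illustrative readings only; real values would come from read_remote_epoch_ms().
master_reading = 1501570274049
node_readings = {
    "hdw1.example.com": master_reading + 120,     # within the limit
    "hdw3.example.com": master_reading + 311432,  # same skew as in the log above
}

print(find_out_of_sync_nodes(master_reading, node_readings))
```

Readings collected one host at a time over SSH include connection latency, so small differences are expected. Only differences approaching the configured limit indicate a real synchronization problem.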