Pivotal HDB initialization failed with: [FATAL]: create dfs filespace failed ERROR: could not create filespace directory hdfs://.... Input/output error

Article ID: 294589

Products

Services Suite

Issue/Introduction

Symptoms:

Pivotal HDB initialization fails, and gpinitsystem reports the following error during initialization:

[gpadmin@hawq-mdw utils]$ gpinitsystem -c gpinitsystem_config -h hostfile
...
...
20131010:16:44:49:021342 gpinitsystem:hawq-mdw:gpadmin-[INFO]:-Create filespace dfs_system
20131010:16:44:57:021342 gpinitsystem:hawq-mdw:gpadmin-[WARN]:
20131010:16:44:57:021342 gpinitsystem:hawq-mdw:gpadmin-[WARN]:-Failed to create dfs filespace; review gpinitsystem output to
20131010:16:44:57:021342 gpinitsystem:hawq-mdw:gpadmin-[WARN]:- determine why this step failed and reinitialize cluster after resolving
20131010:16:44:57:021342 gpinitsystem:hawq-mdw:gpadmin-[WARN]:- issues. Not all initialization tasks have completed so the cluster
20131010:16:44:57:021342 gpinitsystem:hawq-mdw:gpadmin-[WARN]:- should not be used.
20131010:16:44:57:021342 gpinitsystem:hawq-mdw:gpadmin-[WARN]:-gpinitsystem will now try to stop the cluster
20131010:16:44:57:021342 gpinitsystem:hawq-mdw:gpadmin-[WARN]:
20131010:16:44:58:001190 gpstop:hawq-mdw:gpadmin-[INFO]:-Starting gpstop with args: -a -i -d /data/master/gpseg-1
..
..
20131010:16:45:00:001190 gpstop:hawq-mdw:gpadmin-[INFO]:-Greenplum Version: 'postgres (HAWQ) 4.2.0 build 1'
20131010:16:45:00:001190 gpstop:hawq-mdw:gpadmin-[INFO]:-There are 0 connections to the database
20131010:16:45:00:001190 gpstop:hawq-mdw:gpadmin-[INFO]:-Commencing Master instance shutdown with mode='immediate'
20131010:16:45:00:001190 gpstop:hawq-mdw:gpadmin-[INFO]:-Master host=hawq-mdw
20131010:16:45:00:001190 gpstop:hawq-mdw:gpadmin-[INFO]:-Commencing Master instance shutdown with mode=immediate
20131010:16:45:00:001190 gpstop:hawq-mdw:gpadmin-[INFO]:-Master segment instance directory=/data/master/gpseg-1
20131010:16:45:01:001190 gpstop:hawq-mdw:gpadmin-[INFO]:-No standby master host configured
20131010:16:45:01:001190 gpstop:hawq-mdw:gpadmin-[INFO]:-Commencing parallel segment instance shutdown, please wait...
...
20131010:16:45:04:001190 gpstop:hawq-mdw:gpadmin-[INFO]:-----------------------------------------------------
20131010:16:45:04:001190 gpstop:hawq-mdw:gpadmin-[INFO]:- Segments stopped successfully = 2
20131010:16:45:04:001190 gpstop:hawq-mdw:gpadmin-[INFO]:- Segments with errors during stop = 0
20131010:16:45:04:001190 gpstop:hawq-mdw:gpadmin-[INFO]:-----------------------------------------------------
20131010:16:45:04:001190 gpstop:hawq-mdw:gpadmin-[INFO]:-Successfully shutdown 2 of 2 segment instances
20131010:16:45:04:001190 gpstop:hawq-mdw:gpadmin-[INFO]:-Database successfully shutdown with no errors reported
20131010:16:45:04:021342 gpinitsystem:hawq-mdw:gpadmin-[INFO]:-Successfully shutdown the Greenplum instance
20131010:16:45:04:021342 gpinitsystem:hawq-mdw:gpadmin-[WARN]:
20131010:16:45:04:021342 gpinitsystem:hawq-mdw:gpadmin-[WARN]:-Failed to create dfs filespace; review gpinitsystem output to
20131010:16:45:04:021342 gpinitsystem:hawq-mdw:gpadmin-[WARN]:- determine why this step failed and reinitialize cluster after resolving
20131010:16:45:04:021342 gpinitsystem:hawq-mdw:gpadmin-[WARN]:- issues. Not all initialization tasks have completed so the cluster
20131010:16:45:04:021342 gpinitsystem:hawq-mdw:gpadmin-[WARN]:- should not be used.
20131010:16:45:04:021342 gpinitsystem:hawq-mdw:gpadmin-[WARN]:
20131010:16:45:04:gpinitsystem:hawq-mdw:gpadmin-[FATAL]: create dfs filespace failed; Script Exiting!


 

The gpinitsystem log in /home/gpadmin/gpAdminLogs/ shows the following:

20131010:16:44:49:021342 gpinitsystem:hawq-mdw:gpadmin-[INFO]:-DFS_PATH_LIST: 1:'/data/master/dfs/gpseg-1',2:'hawq-mdw:9000/hawq/gpseg0',3:'hawq-mdw:9000/hawq/gpseg1'
20131010:16:44:49:021342 gpinitsystem:hawq-mdw:gpadmin-[INFO]:-Create filespace dfs_system
WARNING: function 1 returned error: -1
WARNING: fail to connect hdfs at hawq-mdw:9000, errno = 5
WARNING: function 1 returned error: -1
WARNING: fail to connect hdfs at hawq-mdw:9000, errno = 5
WARNING: function 1 returned error: -1
WARNING: fail to connect hdfs at hawq-mdw:9000, errno = 5
WARNING: function 1 returned error: -1
CONTEXT: Dropping file-system object -- Filespace Directory: '16384'
WARNING: fail to connect hdfs at hawq-mdw:9000, errno = 5
CONTEXT: Dropping file-system object -- Filespace Directory: '16384'
WARNING: could not remove filespace directory 16384: Input/output error
CONTEXT: Dropping file-system object -- Filespace Directory: '16384'
ERROR: could not create filespace directory hdfs://hawq-mdw:9000/hawq/gpseg0: Input/output error


 

Environment


Cause

During initialization, HDB was unable to create the directory structure in HDFS using the URI hdfs://hawq-mdw:9000/. In other words, initialization failed while accessing the HDFS filesystem at the given URI.

In this case, the port number 9000 configured for the DFS_URL parameter in /etc/gphd/hawq/conf/gpinitsystem_config is incorrect. Listing the HDFS root at that address is refused:

[gpadmin@hawq-mdw hadoop]$ hadoop fs -ls hdfs://hawq-mdw:9000/
ls: Call From hawq-mdw.saturn.local/192.165.100.31 to hawq-mdw:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
[gpadmin@hawq-mdw hadoop]$


 

The DFS_URL parameter in the gpinitsystem_config file:

[gpadmin@hawq-mdw utils]$ egrep DFS_URL gpinitsystem_config
DFS_URL=hawq-mdw:9000/hawq
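
The link between DFS_URL and the URI that fails in the log can be sketched in shell. This is a minimal sketch, assuming DFS_URL is written unquoted at the start of a line as shown above; the function name dfs_uri_from_config is hypothetical and is not part of HDB.

```shell
# Print the hdfs:// URI that corresponds to DFS_URL in a gpinitsystem_config file.
# Hypothetical helper; assumes an unquoted line of the form DFS_URL=host:port/path.
dfs_uri_from_config() {
    # Take the first DFS_URL= line and strip the key, leaving host:port/path.
    url=$(sed -n 's/^DFS_URL=//p' "$1" | head -n 1)
    # Fail if the parameter is missing or empty.
    [ -n "$url" ] || return 1
    printf 'hdfs://%s\n' "$url"
}

# Possible use (requires a Hadoop client on the PATH):
#   hadoop fs -ls "$(dfs_uri_from_config gpinitsystem_config)"
```

If the listing is refused, as in the output above, the host or port in DFS_URL does not match a listening NameNode.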

 

Resolution

Identify the correct host and port from the cluster's /etc/gphd/hadoop/conf/core-site.xml:

<property>
<name>fs.defaultFS</name>
<value>hdfs://hawq-mdw:8020</value>
</property>
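
The lookup above can also be scripted. The following is a rough sketch, assuming the property is laid out with <name> and <value> either on the same line or with <value> on a later line, as in the excerpt above. The function names get_default_fs and suggest_dfs_url are hypothetical, and the default path component hawq is taken from the DFS_URL used in this article.

```shell
# Print the value of fs.defaultFS from a Hadoop core-site.xml.
# Hypothetical helper; a line-based match, not a full XML parser.
get_default_fs() {
    awk '
        /<name>fs\.defaultFS<\/name>/ { found = 1 }
        found && /<value>/ {
            # Drop everything up to <value> and everything from </value> on.
            gsub(/.*<value>|<\/value>.*/, "")
            print
            exit
        }
    ' "$1"
}

# Print a candidate DFS_URL (host:port/path, without the hdfs:// scheme).
# $1 = path to core-site.xml, $2 = HDFS directory for HDB (defaults to hawq).
suggest_dfs_url() {
    fs=$(get_default_fs "$1")
    [ -n "$fs" ] || return 1
    fs=${fs#hdfs://}   # remove the scheme
    fs=${fs%/}         # remove a trailing slash, if any
    printf '%s/%s\n' "$fs" "${2:-hawq}"
}

# Possible use:
#   suggest_dfs_url /etc/gphd/hadoop/conf/core-site.xml
```
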

Verify that the URI from core-site.xml is reachable:

[gpadmin@hawq-mdw conf]$ hadoop fs -ls hdfs://hawq-mdw:8020/
Found 3 items
drwxr--r--   - hdfs   supergroup          0 2013-10-10 17:43 hdfs://hawq-mdw:8020/hawq
drwxr-xr-x   - mapred hadoop              0 2013-10-10 17:31 hdfs://hawq-mdw:8020/mapred
drwxr-xr-x   - hdfs   supergroup          0 2013-10-10 16:24 hdfs://hawq-mdw:8020/user


 

Change the value in gpinitsystem_config as shown below and run gpinitsystem again:

DFS_URL=hawq-mdw:8020/hawq
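
The edit can be made by hand or scripted. As a minimal sketch, assuming DFS_URL sits unquoted at the start of a line and the new value contains no | or & characters, a hypothetical helper named set_dfs_url could rewrite the line and keep a backup copy:

```shell
# Rewrite the DFS_URL line of a gpinitsystem_config file.
# Hypothetical helper; $1 = config file, $2 = new value such as hawq-mdw:8020/hawq.
set_dfs_url() {
    config=$1
    new_url=$2
    # Keep the previous version next to the file as <config>.bak.
    cp "$config" "$config.bak" || return 1
    # Replace only lines that start with DFS_URL=; other settings are left untouched.
    sed "s|^DFS_URL=.*|DFS_URL=${new_url}|" "$config.bak" > "$config"
}

# Possible use:
#   set_dfs_url gpinitsystem_config hawq-mdw:8020/hawq
#   egrep DFS_URL gpinitsystem_config
```

The backup makes it easy to compare the old and new values before rerunning gpinitsystem.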