When running a command that needs to access /user/gpadmin in HDFS as user gpadmin, the command may fail with the following error:
16/08/26 15:50:30 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
org.apache.hadoop.security.AccessControlException: Permission denied: user=gpadmin, access=WRITE, inode="/user/gpadmin/.sparkStaging/application_1469700207857_0044":pxf:hdfs:drwxr-xr-x
For example, the following Spark command fails with the above error:
[gpadmin@sandbox ~]$ spark-shell --master yarn
The HDFS directory /user/gpadmin/ has the wrong ownership because of a software defect in the Ambari HAWQ Service check. The check sets the ownership to "pxf:hdfs" when it should be "gpadmin:hdfs", so user gpadmin has no write access to its own home directory.
This has been reported to engineering in GPSQL-3321.
Correct the ownership on /user/gpadmin/ by following these steps:
1. Log in to the NameNode or the HDFS client host.
2. Switch to the "hdfs" user:
[root@sandbox ambari-agent]# su - hdfs
[hdfs@sandbox ~]$
3. Correct the ownership of /user/gpadmin/:
[hdfs@sandbox ~]$ hdfs dfs -ls /user/
Found 7 items
drwxrwx---   - ambari-qa hdfs              0 2016-07-07 05:35 /user/ambari-qa
drwxr-xr-x   - pxf       hdfs              0 2016-10-14 03:18 /user/gpadmin
drwxr-xr-x   - hcat      hdfs              0 2016-06-28 08:14 /user/hcat
drwx------   - hdfs      hdfs              0 2016-10-14 03:12 /user/hdfs
drwxr-xr-x   - hive      hdfs              0 2016-06-28 08:14 /user/hive
drwxrwxr-x   - spark     hdfs              0 2016-06-28 08:12 /user/spark
drwxr-xr-x   - zeppelin  zeppelin          0 2016-06-28 08:13 /user/zeppelin
[hdfs@sandbox ~]$ hdfs dfs -chown gpadmin:hdfs /user/gpadmin/
[hdfs@sandbox ~]$ hdfs dfs -ls /user/
Found 7 items
drwxrwx---   - ambari-qa hdfs              0 2016-07-07 05:35 /user/ambari-qa
drwxr-xr-x   - gpadmin   hdfs              0 2016-10-14 03:18 /user/gpadmin
drwxr-xr-x   - hcat      hdfs              0 2016-06-28 08:14 /user/hcat
drwx------   - hdfs      hdfs              0 2016-10-14 03:12 /user/hdfs
drwxr-xr-x   - hive      hdfs              0 2016-06-28 08:14 /user/hive
drwxrwxr-x   - spark     hdfs              0 2016-06-28 08:12 /user/spark
drwxr-xr-x   - zeppelin  zeppelin          0 2016-06-28 08:13 /user/zeppelin
[hdfs@sandbox ~]$
4. Re-run the command that was previously failing.
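Because the Service check may reset the ownership again until the defect is fixed, it can be convenient to wrap the check and the fix in one repeatable helper. The following is a minimal sketch, not an official tool: the function name ensure_owner is hypothetical, and it assumes it is run as the "hdfs" user on a host where the hdfs client is on the PATH. It reads the current owner and group with "hdfs dfs -stat" and calls "hdfs dfs -chown" only when they differ from the expected value.

```shell
#!/usr/bin/env bash
# Sketch: check the owner of an HDFS directory and correct it only if needed.
# ensure_owner is a hypothetical helper name, not part of Hadoop or HAWQ.

# Usage: ensure_owner <hdfs-path> <owner:group>
ensure_owner() {
    local path="$1" expected="$2" current

    # "hdfs dfs -stat" with %u and %g prints the owning user and group.
    current="$(hdfs dfs -stat '%u:%g' "$path")" || return 1

    if [ "$current" = "$expected" ]; then
        echo "$path is already owned by $expected"
        return 0
    fi

    echo "$path is owned by $current, changing to $expected"
    hdfs dfs -chown "$expected" "$path"
}
```

After sourcing the function as the "hdfs" user, running "ensure_owner /user/gpadmin gpadmin:hdfs" performs the same correction as step 3 and leaves the directory untouched if the ownership is already correct.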