The Hive LOAD operation executed from Zeppelin fails with the error message "Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask"
. However, the table is loaded with the correct data.
The following error message is reported in Zeppelin Notebook:
java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:283) at org.apache.commons.dbcp2.DelegatingStatement.execute(DelegatingStatement.java:291) at org.apache.commons.dbcp2.DelegatingStatement.execute(DelegatingStatement.java:291) at org.apache.zeppelin.jdbc.JDBCInterpreter.executeSql(JDBCInterpreter.java:577) at org.apache.zeppelin.jdbc.JDBCInterpreter.interpret(JDBCInterpreter.java:660) at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:94) at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:489) at org.apache.zeppelin.scheduler.Job.run(Job.java:175) at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745)
We can verify that the input file permissions are: hive:hdfs
. The following error appears in Hive Server logs:
2017-10-17 16:21:21,714 ERROR [HiveServer2-Background-Pool: Thread-384264]: exec.Task (SessionState.java:printError(948)) - Failed with exception org.apache.hadoop.security.AccessControlException: Permission denied. user=anonymous is not the owner of inode=notiz.csv at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkOwner(FSPermissionChecker.java:250) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:227) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1827) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1811) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkOwner(FSDirectory.java:1780) at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setPermission(FSDirAttrOp.java:63) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setPermission(FSNamesystem.java:1685) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setPermission(NameNodeRpcServer.java:814) at [...]
Observe that Zeppelin is connecting to Hive as user "anonymous" (scroll the last log snippet to the right).
Zeppelin is connecting to Hive using the JDBC Driver and the user is "anonymous". For Zeppelin to connect to Hive, set hive.user
to hive
in the JDBC Driver configuration. Ignore the configuration of hive.username
, it must be hive.user.