Hive LOAD from Zeppelin fails with "Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask"

Article ID: 294823

Products

Services Suite

Issue/Introduction

Symptoms:

A Hive LOAD statement run from Zeppelin fails with the error message "Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask", even though the table ends up loaded with the correct data.


The following error is reported in the Zeppelin notebook:

java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask 
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:283) 
at org.apache.commons.dbcp2.DelegatingStatement.execute(DelegatingStatement.java:291) 
at org.apache.commons.dbcp2.DelegatingStatement.execute(DelegatingStatement.java:291) 
at org.apache.zeppelin.jdbc.JDBCInterpreter.executeSql(JDBCInterpreter.java:577) 
at org.apache.zeppelin.jdbc.JDBCInterpreter.interpret(JDBCInterpreter.java:660) 
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:94) 
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:489) 
at org.apache.zeppelin.scheduler.Job.run(Job.java:175) 
at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162) 
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178) 
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292) 
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
at java.lang.Thread.run(Thread.java:745)

The input file is owned by hive:hdfs (user hive, group hdfs). The following error appears in the HiveServer2 logs:

2017-10-17 16:21:21,714 ERROR [HiveServer2-Background-Pool: Thread-384264]: exec.Task (SessionState.java:printError(948)) - Failed with exception org.apache.hadoop.security.AccessControlException: Permission denied. user=anonymous is not the owner of inode=notiz.csv
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkOwner(FSPermissionChecker.java:250)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:227)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1827)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1811)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkOwner(FSDirectory.java:1780)
        at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setPermission(FSDirAttrOp.java:63)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setPermission(FSNamesystem.java:1685)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setPermission(NameNodeRpcServer.java:814)
        at [...]

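Note that Zeppelin is connecting to Hive as user "anonymous" (see "user=anonymous" at the end of the first line of the log above). The stack trace shows the failure occurring in setPermission, an HDFS call that only the file owner or a superuser may make. Since the file is owned by hive, the call made as "anonymous" is rejected, which is consistent with the data being loaded while the statement still reports an error.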
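When the HiveServer2 log is long, the connecting user can be pulled out of the AccessControlException line with a short script. The following is a minimal sketch, not part of the product: the function name extract_denied_user is hypothetical, and the pattern assumes the exact message format shown in the log above.

```python
import re

# Matches the "user=<name>" and "inode=<name>" fields in the message
# format shown in the HiveServer2 log above (assumed to be stable).
_DENIED = re.compile(
    r"Permission denied\. user=(\S+) is not the owner of inode=(\S+)"
)


def extract_denied_user(log_line):
    """Return (user, inode) from an AccessControlException line, or None."""
    match = _DENIED.search(log_line)
    if match is None:
        return None
    return match.group(1), match.group(2)


# The first line of the log snippet from this article.
line = (
    "2017-10-17 16:21:21,714 ERROR [HiveServer2-Background-Pool: Thread-384264]: "
    "exec.Task (SessionState.java:printError(948)) - Failed with exception "
    "org.apache.hadoop.security.AccessControlException: Permission denied. "
    "user=anonymous is not the owner of inode=notiz.csv"
)
print(extract_denied_user(line))
```

If the user reported here is "anonymous" rather than the expected account, the connection is being opened without the intended user name.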

Cause

Zeppelin connects to Hive through the JDBC driver as the user "anonymous", which does not own the input file. To have Zeppelin connect as the hive user, set hive.user to hive in the JDBC driver configuration. The property name must be hive.user; a hive.username property is ignored.
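The check described above can be sketched as a small helper that inspects a set of JDBC properties for the two property names. This is illustrative only: the helper name check_hive_user and the sample dictionaries are hypothetical, and only the property names hive.user and hive.username come from this article.

```python
def check_hive_user(properties):
    """Return a list of problems with the Hive user settings.

    `properties` is a plain dict of property name -> value, as they
    would be entered in the JDBC driver configuration.
    """
    problems = []
    if not properties.get("hive.user"):
        # Without hive.user, this article's logs show the user as "anonymous".
        problems.append("hive.user is not set")
    if "hive.username" in properties:
        problems.append("hive.username is ignored; use hive.user instead")
    return problems


# Sample values are illustrative only.
broken = {"hive.username": "hive"}
fixed = {"hive.user": "hive"}

print(check_hive_user(broken))
print(check_hive_user(fixed))
```

An empty list for the second call indicates that the properties match the setting this article recommends.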