A Hive query fails with the following message:
Error: java.io.IOException: java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveVarchar cannot be cast to java.lang.String.
hive> select count(*) from test_table where tag='asa aaklk';
Query ID = root_20160615225151_d4271570-2593-4f99-a12a-10bfddbac1bc
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1465444193706_2444, Tracking URL = http://history_server:8088/proxy/application_1465444193706_2444/
Kill Command = /usr/phd/current/hadoop-client/bin/hadoop job -kill job_1465444193706_2444
Hadoop job information for Stage-1: number of mappers: 6; number of reducers: 1
2016-06-15 22:51:45,011 Stage-1 map = 0%, reduce = 0%
2016-06-15 22:52:06,220 Stage-1 map = 100%, reduce = 100%
Ended Job = job_1465444193706_2444 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1465444193706_2444_m_000004 (and more) from job job_1465444193706_2444
Examining task ID: task_1465444193706_2444_m_000003 (and more) from job job_1465444193706_2444

Task with the most failures(4):
-----
Task ID:
  task_1465444193706_2444_m_000003
URL:
  http://history_server:8088/taskdetails.jsp?jobid=job_1465444193706_2444&tipid=task_1465444193706_2444_m_000003
-----
Diagnostic Messages for this Task:
Error: java.io.IOException: java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveVarchar cannot be cast to java.lang.String
	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:352)
	at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
	at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:115)
	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveVarchar cannot be cast to java.lang.String
	at org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatchCtx.addPartitionColsToBatch(VectorizedRowBatchCtx.java:566)
	at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowReader.next(VectorizedOrcAcidRowReader.java:86)
	at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowReader.next(VectorizedOrcAcidRowReader.java:43)
	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:347)
	... 13 more

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 6  Reduce: 1  HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
This is caused by a known Hive software defect, HIVE-11054 (https://issues.apache.org/jira/browse/HIVE-11054). As the stack trace shows, the vectorized ORC ACID reader fails in VectorizedRowBatchCtx.addPartitionColsToBatch when it tries to cast a VARCHAR partition column value (HiveVarchar) to java.lang.String.
The defect is fixed in later versions of Hive, so upgrading the cluster resolves the issue where an upgrade is possible; however, those Hive versions are not available in Pivotal HD. As a workaround, disable vectorized execution before running the query:
set hive.vectorized.execution.enabled=false;
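To make the workaround concrete, here is a minimal sketch of a Hive session applying it; the table and predicate are taken from the failing query in this article, and setting the property at the prompt affects only the current session:

```sql
-- Disable vectorized execution for this session only, then rerun the
-- query that previously failed with the ClassCastException:
set hive.vectorized.execution.enabled=false;
select count(*) from test_table where tag='asa aaklk';
```

If the workaround is needed for all sessions, the same property can be set to false in hive-site.xml instead. Note that this disables vectorization cluster-wide, which may slow down queries that otherwise benefit from it, so the per-session form is preferable where practical.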