A Hive query fails with the following message:
Error: java.io.IOException: java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveVarchar cannot be cast to java.lang.String.
hive> select count(*) from test_table where tag='asa aaklk';
Query ID = root_20160615225151_d4271570-2593-4f99-a12a-10bfddbac1bc
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1465444193706_2444, Tracking URL = http://history_server:8088/proxy/application_1465444193706_2444/
Kill Command = /usr/phd/current/hadoop-client/bin/hadoop job -kill job_1465444193706_2444
Hadoop job information for Stage-1: number of mappers: 6; number of reducers: 1
2016-06-15 22:51:45,011 Stage-1 map = 0%, reduce = 0%
2016-06-15 22:52:06,220 Stage-1 map = 100%, reduce = 100%
Ended Job = job_1465444193706_2444 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1465444193706_2444_m_000004 (and more) from job job_1465444193706_2444
Examining task ID: task_1465444193706_2444_m_000003 (and more) from job job_1465444193706_2444

Task with the most failures(4):
-----
Task ID:
  task_1465444193706_2444_m_000003
URL:
  http://history_server:8088/taskdetails.jsp?jobid=job_1465444193706_2444&tipid=task_1465444193706_2444_m_000003
-----
Diagnostic Messages for this Task:
Error: java.io.IOException: java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveVarchar cannot be cast to java.lang.String
	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:352)
	at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
	at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:115)
	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveVarchar cannot be cast to java.lang.String
	at org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatchCtx.addPartitionColsToBatch(VectorizedRowBatchCtx.java:566)
	at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowReader.next(VectorizedOrcAcidRowReader.java:86)
	at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowReader.next(VectorizedOrcAcidRowReader.java:43)
	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:347)
	... 13 more

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 6  Reduce: 1  HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
This is caused by a known Hive software defect, HIVE-11054 (https://issues.apache.org/jira/browse/HIVE-11054). As the stack trace shows, the vectorized ORC ACID reader fails in VectorizedRowBatchCtx.addPartitionColsToBatch when it tries to cast a VARCHAR partition column value (HiveVarchar) to java.lang.String.
The defect is fixed in later versions of Hive, so upgrading the cluster resolves the issue where an upgrade is possible; however, those Hive versions are not available in Pivotal HD. As a workaround, disable vectorized execution before running the query:
set hive.vectorized.execution.enabled=false;
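To make the workaround concrete, here is a minimal sketch of a Hive session applying it; the table and predicate are taken from the failing query in this article, and setting the property at the prompt affects only the current session:

```sql
-- Disable vectorized execution for this session only, then rerun the
-- query that previously failed with the ClassCastException:
set hive.vectorized.execution.enabled=false;
select count(*) from test_table where tag='asa aaklk';
```

If the workaround is needed for all sessions, the same property can be set to false in hive-site.xml instead. Note that this disables vectorization cluster-wide, which may slow down queries that otherwise benefit from it, so the per-session form is preferable where practical.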