Symptom
HDB fails to query a Hive external table that uses a multibyte delimiter through PXF.
Error Message:
gpadmin=# select * from multi;
ERROR: remote component error (500) from '172.28.21.189:51200': type Exception report message java.lang.Exception: java.lang.ClassNotFoundException: org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe description The server encountered an internal error that prevented it from fulfilling this request. exception javax.servlet.ServletException: java.lang.Exception: java.lang.ClassNotFoundException: org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe (libchurl.c:897) (seg4 hdw1.hdp.local:40000 pid=672562) (dispatcher.c:1801)
DETAIL: External table multi
Cause
The class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe", which is required for multibyte delimiter processing, is packaged in hive-contrib.jar. However, "hive-contrib.jar" is not on the default Java classpath of the PXF service, so PXF cannot load the class.
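To confirm that a given jar really carries the missing class before changing any configuration, the archive listing can be searched for the class file. The following is a minimal sketch, not part of the original procedure: the helper names class_to_entry and jar_has_class are hypothetical, and it assumes the unzip utility is installed on the host.

```shell
# Sketch: check whether a jar contains a given Java class.
# class_to_entry and jar_has_class are hypothetical helper names.

class_to_entry() {
    # Convert a fully qualified class name (dots) to its path inside the jar.
    printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

jar_has_class() {
    jar_path=$1
    class_name=$2
    # Return 2 if the jar is missing or is a dangling symbolic link.
    [ -f "$jar_path" ] || return 2
    # Assumption: unzip is available; a jar is a zip archive, so "unzip -l" lists its entries.
    # grep -F matches the entry path literally; -q suppresses output.
    unzip -l "$jar_path" | grep -q -F "$(class_to_entry "$class_name")"
}
```

For example, running jar_has_class /usr/hdp/2.4.0.0-169/hive/lib/hive-contrib.jar org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe should return 0 if the class is present in that jar.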
Resolution
1. Locate the path for hive-contrib.jar:
# locate jar | grep hive | grep contri
/usr/hdp/2.4.0.0-169/hive/lib/hive-contrib-1.2.1000.2.4.0.0-169.jar
/usr/hdp/2.4.0.0-169/hive/lib/hive-contrib.jar    ### <<<< Use this one; it is a symbolic link to the file above
/usr/hdp/2.4.0.0-169/storm/contrib/storm-hive/storm-hive-0.10.0.2.4.0.0-169.jar
2. From Ambari -> PXF -> Configs -> Advanced pxf-public-classpath, add the identified path:
:
/usr/hdp/current/hive-client/lib/datanucleus*
/usr/hdp/2.4.0.0-169/hive/lib/derby-10.10.2.0.jar
/usr/hdp/2.4.0.0-169/hive/lib/hive-contrib.jar    # <<<!!! Add this line identified above.
3. Restart the PXF services from Ambari. You should now be able to query the Hive external table:
gpadmin=# select count(*) from multi;
 count
-------
    16
(1 row)
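If the query still fails after the restart, it can help to check that the new entry actually reached the classpath file on each PXF host. The sketch below is an illustration only: classpath_lists_jar is a hypothetical helper, and the location of the generated file (commonly /etc/pxf/conf/pxf-public.classpath) is an assumption that should be verified on your installation.

```shell
# Sketch: check that a classpath file lists a jar path on a line of its own.
# classpath_lists_jar is a hypothetical helper name.

classpath_lists_jar() {
    classpath_file=$1
    jar_path=$2
    # Return 2 if the classpath file does not exist or is not readable.
    [ -r "$classpath_file" ] || return 2
    # -F: treat the path literally; -x: match the whole line; -q: no output.
    grep -q -F -x "$jar_path" "$classpath_file"
}
```

For example, classpath_lists_jar /etc/pxf/conf/pxf-public.classpath /usr/hdp/2.4.0.0-169/hive/lib/hive-contrib.jar returns 0 when the line is present. Because the match is on the whole line, an entry with trailing whitespace or a trailing comment would not be detected.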