With spark-connector versions before 2.0, reads fail when a table column name is a reserved word (for the list of reserved words, refer to this link). The example below reproduces the issue:
1. Assume we have a table with a column named 'primary', which is a reserved word:
# \d+ spark_test
Table "public.spark_test"
Column | Type | Modifiers | Storage | Stats target | Description
---------+---------+-----------+----------+--------------+-------------
id | integer | | plain | |
primary | text | | extended | |
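The read in the next steps fails because the pre-2.0 connector appears to build its SELECT statement with unquoted column names; in Greenplum (as in PostgreSQL), a reserved word used as an identifier must be double-quoted. For illustration only (the exact query the connector generates is an assumption):

```sql
-- Fails: primary is a reserved word, so the parser rejects it as an identifier
SELECT id, primary FROM public.spark_test;

-- Works: double-quoting turns it into an ordinary column identifier
SELECT id, "primary" FROM public.spark_test;
```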
2. In spark-shell, create the data frame:
val gscReadOptionMap = Map(
"url" -> "jdbc:postgresql://mdw:5432/gpadmin",
"user" -> "gpadmin",
"password" -> "abc123",
"dbschema" -> "public",
"dbtable" -> "spark_test",
"partitionColumn" -> "id"
)
val gpdf = spark.read.format("greenplum")
.options(gscReadOptionMap)
.load()
3. Show the data. The spark-shell reports the error below (raised by the spark-connector):
scala> gpdf.show()
20/10/06 14:41:34 ERROR Executor: Exception in task 0.0 in stage 4.0 (TID 6)
org.postgresql.util.PSQLException: ERROR: syntax error at or near "primary"
Position: 105
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2310)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2023)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:217)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:421)
...