When loading data with gpfdist/gpload/copy, the following error message is produced: ERROR: invalid byte sequence for encoding "UTF8": 0xc942
msong=# copy source_address from '/data/msong_env/cases/SR59883770/source_address.dat.0001' WITH DELIMITER '|' ;
ERROR: invalid byte sequence for encoding "UTF8": 0xc942
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
CONTEXT: COPY source_address, line 39097
The error occurs when the encoding of the files from external sources differs from the encoding specified by client_encoding in Greenplum. In this example, client_encoding is UTF8:
msong=# show client_encoding;
 client_encoding
-----------------
 UTF8
(1 row)
As a result, the database cannot decode the contents of the external files and fails with the error message: ERROR: invalid byte sequence for encoding "UTF8": 0xc942.
The external source files must always be encoded with the same encoding as the client_encoding.
Note: Confirm that the encodings reported by the following two commands match:
1. Use the following command to check the client_encoding of the target database:
show client_encoding;
2. Use the following command to check the encoding of the external files:
file <filename>
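The file command reports a single guess for the whole file, so it may help to locate the specific rows that are not valid in the expected encoding. The following is a minimal sketch in Python, assuming client_encoding is UTF8 as in the example above. The function name find_invalid_utf8_lines is hypothetical and is not part of Greenplum or its utilities.

```python
from typing import Iterator, Tuple


def find_invalid_utf8_lines(path: str) -> Iterator[Tuple[int, bytes]]:
    """Yield (line_number, raw_bytes) for every line that is not valid UTF-8.

    Hypothetical helper for illustration; assumes the expected encoding is UTF-8.
    """
    # Open in binary mode so Python does not decode the data implicitly.
    with open(path, "rb") as f:
        for line_number, raw in enumerate(f, start=1):
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError:
                # Strip the trailing newline so the offending bytes are easier to read.
                yield line_number, raw.rstrip(b"\r\n")
```

Calling the function on the data file and printing each result lists the suspect rows. The reported line numbers should usually line up with the line number in the CONTEXT message of the COPY error, although rows containing embedded newlines could shift the count.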
When loading or inserting data into Greenplum Database, Greenplum converts the data from the specified client encoding into the server encoding. When sending data back to the client, Greenplum converts the data from the server character encoding into the specified client encoding.
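The conversion described above can be illustrated with a short Python sketch. It is only an analogy for the idea of decoding bytes in one encoding and re-encoding them in another, not Greenplum's implementation. The choice of LATIN1 as the source encoding is an assumption made for illustration (the real encoding of the file must be determined first), and the function name reencode is hypothetical.

```python
def reencode(raw: bytes, source_encoding: str, target_encoding: str = "utf-8") -> bytes:
    """Decode bytes using the source encoding and re-encode them in the target encoding."""
    return raw.decode(source_encoding).encode(target_encoding)


# 0xC9 0x42 is the byte pair from the error message. As UTF-8 it is invalid,
# because 0xC9 starts a two-byte sequence and 0x42 is not a continuation byte.
# Read as LATIN1 (an assumption), it is "É" followed by "B".
latin1_bytes = b"\xc9B"
utf8_bytes = reencode(latin1_bytes, "latin-1")
print(utf8_bytes)  # b'\xc3\x89B'
```

The same bytes are valid or invalid depending on which encoding they are read with, which is why the encoding declared to the database has to match the encoding actually used in the file.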