When configuring Greenplum to write data to S3-compatible object storage using the s3:// protocol (via gpcheckcloud or INSERT INTO a writable external table), downloads succeed, but uploads consistently fail at the final stage of the multipart upload (the completeMultiPart step).
The failure manifests as an HTTP 400 Bad Request error. In the terminal output or the Greenplum segment logs, the following error sequence is observed:
$ gpcheckcloud -u <file_to_upload> "s3://<S3_endpoint>/<bucketname>/[<S3_prefix>] [config=<path_to_config_file>]"
[D] Upload part finish: <thread_id>, eTag: "<etag_value>", part number: 1
S3MessageParser.xml:1: parser error : Extra content at the end of the document
<h1>Bad Message 400</h1><pre>reason: Transfer-Encoding and Content-Length</pre>
[E] AWS returns error Unknown value : Unknown value
[E] writer_cleanup caught S3LogicError: writer_cleanup caught a S3LogicError exception: AWS returns error Unknown value : Unknown value, Function: completeMultiPart, File: ../../src/s3interface.cpp(557).
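For reference, the file passed via config= follows the standard s3:// protocol configuration format. A minimal sketch with placeholder values (parameter names per the Greenplum s3 protocol documentation; tune threadnum and chunksize for your environment):

```
[default]
accessid = "<access_key_id>"
secret = "<secret_access_key>"
threadnum = 4
chunksize = 67108864
version = 2
encryption = true
```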
Product: Greenplum Database 7.x
Components: gpcheckcloud utility, s3:// Protocol
Target Environment: Strict S3-compatible object storage backends (e.g., Dell ECS, strict HAProxy/Nginx load balancers)
This issue is caused by a software defect in the Greenplum gpcloud plugin's HTTP client implementation when interacting with the underlying libcurl library: the final completeMultiPart request is sent with both a Transfer-Encoding and a Content-Length header, as the "reason: Transfer-Encoding and Content-Length" text in the 400 response indicates. RFC 7230 treats this header combination as a request-framing error, so strict S3-compatible backends and load balancers (such as Dell ECS or hardened HAProxy/Nginx front ends) reject the request, while more lenient endpoints tolerate it.
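The rejection can be illustrated with a small sketch (not Greenplum code): it builds the kind of request the defective client emits and applies the same both-framing-headers check that a strict front end performs before returning 400.

```shell
# Illustrative only: a CompleteMultipartUpload-style request carrying
# BOTH framing headers, as the defective client sends it.
# Endpoint, path, and lengths are placeholders, not real values.
request='POST /bucket/key?uploadId=abc123 HTTP/1.1
Host: s3.example.com
Transfer-Encoding: chunked
Content-Length: 245'

# RFC 7230 treats a message containing both Transfer-Encoding and
# Content-Length as a framing error; strict servers reject it.
te=$(printf '%s\n' "$request" | grep -ci '^Transfer-Encoding:')
cl=$(printf '%s\n' "$request" | grep -ci '^Content-Length:')
if [ "$te" -gt 0 ] && [ "$cl" -gt 0 ]; then
  echo 'HTTP/1.1 400 Bad Request: Transfer-Encoding and Content-Length'
fi
```

A lenient endpoint would instead pick one of the two headers and process the request, which is why the same Greenplum build works against some object stores but fails against strict ones.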
This issue is resolved in Greenplum 7.8 and later. Upgrading to that version is the recommended fix; if an upgrade is not feasible, you can instead replace the gpcloud.so plugin and the gpcheckcloud binary with the versions shipped in 7.8 or later.
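The binary-swap workaround can be sketched as a small helper, assuming the usual install layout ($GPHOME/lib/postgresql/gpcloud.so and $GPHOME/bin/gpcheckcloud); verify these paths against your installation, and note that the patched-files directory is an assumption for illustration:

```shell
# Sketch of the binary-swap workaround. Invoke the helper with your
# install root and a directory holding the patched 7.8+ files.
install_gpcloud_patch() {
    gphome=$1
    patch_dir=$2
    # Back up the originals before overwriting them.
    cp "$gphome/lib/postgresql/gpcloud.so" "$gphome/lib/postgresql/gpcloud.so.bak"
    cp "$gphome/bin/gpcheckcloud" "$gphome/bin/gpcheckcloud.bak"
    # Install the patched plugin and utility.
    cp "$patch_dir/gpcloud.so" "$gphome/lib/postgresql/gpcloud.so"
    cp "$patch_dir/gpcheckcloud" "$gphome/bin/gpcheckcloud"
}
```

Run it on the coordinator, e.g. `install_gpcloud_patch "$GPHOME" /tmp/gpcloud-patch`, then repeat on every segment host (for example via gpssh) so all segments load the same plugin version, and restart the cluster.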