This issue occurs when the data directories in the ECS bucket do not exist, either because they were manually deleted or cleaned up by some process.
The error message looks like the following:
[2025-11-09, 06:17:49 PST] {xxxxxx.py:xxxx} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/xxx/xxx/operators/gpdb_xxx_xxxx.py", line 100, in execute
    cur.execute(sql)
psycopg2.errors.RaiseException: 08000 PXF server error : innerMkdirs on s3a://xxxx/xxx/xxx/xxxxx: com.amazonaws.services.s3.model.AmazonS3Exception: Indicates that the version ID specified in the request does not match an existing version. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchVersion; Request ID: 0ae95c87:19a64bc2054:192:3f; S3 Extended Request ID: null; Proxy: null), S3 Extended Request ID: null (seg2 10.xx.xx.xx:xxxx pid=1438325)
GPDB 6.x
PXF versions below 6.11.2
When Greenplum writes data to an external system, it first checks whether the target directory exists. If the directory is missing, every segment tries to create it. Normally this is harmless, because creating a directory that already exists should be a no-op.
However, in S3-compatible object stores, each directory-create request updates the directory's version. When multiple segments try to create the same directory concurrently, the version keeps changing, which leads to intermittent failures such as:
AmazonS3Exception: the version ID in the request does not match an existing version
This can happen even though another segment has already created the directory successfully.
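The race described above can be illustrated with a toy model. This is a hypothetical sketch, not the real S3 API or PXF code: each create request bumps the directory's version, so any segment whose create is conditioned on a stale observation fails with a version mismatch.

```python
class VersionedDir:
    """Toy model of an S3-compatible versioned directory: every
    directory-create request rewrites the directory object and bumps
    its version, so a create based on a stale version fails."""
    def __init__(self):
        self.version = None  # None means the directory does not exist yet

    def mkdir(self, expected_version):
        # Reject the create if the caller's view of the version is stale,
        # mimicking the NoSuchVersion failure mode.
        if self.version != expected_version:
            raise RuntimeError("NoSuchVersion: version ID does not match")
        self.version = (self.version or 0) + 1

d = VersionedDir()
# All eight segments observe "directory missing" before any create lands.
observed = [d.version for _ in range(8)]
failures = 0
for seen in observed:
    try:
        d.mkdir(seen)  # each successful create bumps the version
    except RuntimeError:
        failures += 1
print(f"{failures} of 8 segments hit NoSuchVersion")
```

Only the first create succeeds; every later one conflicts with the version it just changed, even though the directory now exists.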
PXF v6.11.2 and later include an enhancement that handles this situation. The enhancement mainly does three things:
- Issues the mkdir request once per segment host, and only if the intermediate directory does not already exist.
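The core idea of that change (check for existence first, and send at most one mkdir per host) can be sketched as follows. The class and method names here are hypothetical illustrations, not PXF's actual implementation:

```python
import threading

class MkdirGuard:
    """Sketch: deduplicate directory creation per host and skip the
    request entirely when the directory already exists.
    (Hypothetical names; not PXF's real API.)"""
    def __init__(self, fs):
        self.fs = fs            # object exposing exists(path) and mkdir(path)
        self.done = set()       # paths already handled on this host
        self.lock = threading.Lock()

    def ensure_dir(self, path):
        with self.lock:
            if path in self.done:
                return          # another segment on this host already did it
            if not self.fs.exists(path):
                self.fs.mkdir(path)  # the only call that mutates the bucket
            self.done.add(path)

class FakeFS:
    """Minimal stand-in for the object store, counting mkdir requests."""
    def __init__(self):
        self.dirs = set()
        self.mkdir_calls = 0
    def exists(self, path):
        return path in self.dirs
    def mkdir(self, path):
        self.mkdir_calls += 1
        self.dirs.add(path)

fs = FakeFS()
guard = MkdirGuard(fs)
for _ in range(8):              # eight segments on the same host
    guard.ensure_dir("s3a://bucket/data/dir")
print(f"mkdir requests sent: {fs.mkdir_calls}")
```

Because only one mutating request reaches the store per host, the directory's version is updated at most once and the conditional-create conflict cannot occur.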