A customer frequently encountered the following error when running certain PXF jobs: "PXF server error: This feature is disabled. Please refer to dfs.client.block.write.replace-datanode-on-failure.enable configuration property."
UTC,"username","dbname",p46594,th1531054208,"10.x.x.x","58996",2024-09-04 16:47:40 UTC,0,con219152,cmd17,seg-1,,dx1118568,,sx1,"ERROR","P0001","Error during transaction
Error Code : 08000
Message : PXF server error : This feature is disabled. Please refer to dfs.client.block.write.replace-datanode-on-failure.enable configuration property. (seg99 10.x.x.x:40003 pid=311859)
Detail :
Hint : Check the PXF logs located in the '/usr/local/pxf-gp6/logs' directory on host 'localhost' or 'set client_min_messages=LOG' for additional details.
Context : SQL statement ""insert into sch01.pxf_tbl01_ext (col01,col02,col03) select col01,col02,col03 from sch01.tbl_02""
PL/pgSQL function sch02.function01(text,text,boolean) line 67 at EXECUTE statement",,,,,,"select sch02.function02('val01', 'val02', true); --DAG:details01 TASK:hdfs_attribute_client.postgres",0,,"pl_exec.c",3072,
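As the hint in the log suggests, additional detail can be captured by raising the client message level and re-running the failing write. A minimal sketch, reusing the object names from the log context above as placeholders (substitute your own schema, table, and column names):

-- Per the hint, surface LOG-level messages in the client session.
SET client_min_messages = LOG;

-- Re-run the failing write (names taken from the log context; placeholders only).
INSERT INTO sch01.pxf_tbl01_ext (col01, col02, col03)
SELECT col01, col02, col03 FROM sch01.tbl_02;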
Product:
GPDB: 6.25.1 and later
PXF: 6.10.2 and later
This issue can occur for a few key reasons:
1. If an HDFS client (such as PXF) attempts to write data while the feature that automatically replaces failed DataNodes in the write pipeline (dfs.client.block.write.replace-datanode-on-failure.enable) is disabled, any DataNode failure during the write fails the entire operation instead of recovering by selecting a new healthy DataNode.
2. The problem can also arise during periods of high cluster workload or network instability, when communication between the PXF hosts and the HDFS cluster is prone to timeouts, packet loss, or connection failures. These conditions increase the risk of DataNode failures or disconnections which, combined with the pipeline replacement feature being disabled or misconfigured, lead to elevated rates of client-side write errors.
To reduce the incidence of this error, set the following properties in $PXF_HOME/servers/XXX/hdfs-site.xml:
<property>
    <!-- Minimum number of live DataNodes that must remain in the pipeline for the write to continue when a replacement cannot be found -->
    <name>dfs.client.block.write.replace-datanode-on-failure.min-replication</name>
    <value>2</value>
</property>
<property>
    <!-- If replacing a failed DataNode also fails, continue writing with the remaining DataNodes instead of aborting -->
    <name>dfs.client.block.write.replace-datanode-on-failure.best-effort</name>
    <value>true</value>
</property>
<property>
    <!-- Master switch: allow the client to replace failed DataNodes in the write pipeline -->
    <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
    <value>true</value>
</property>
<property>
    <!-- ALWAYS: request a replacement DataNode whenever one is removed from the pipeline -->
    <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
    <value>ALWAYS</value>
</property>
These settings ensure that the HDFS client automatically attempts to replace any failed DataNode in the write pipeline, allowing writes to continue whenever possible. After updating the file, synchronize the configuration to all PXF hosts and restart PXF (for example, with pxf cluster sync followed by pxf cluster restart) so the new settings take effect. This configuration provides greater resilience against DataNode failures and transient network issues, which significantly reduces the likelihood of pipeline failure errors in Greenplum PXF.
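For reference, the per-server directory name ("XXX" above) must match the SERVER option of the writable external table that performs the write; otherwise the table resolves a different server configuration (the default server if SERVER is omitted). A hypothetical table definition illustrating the link (the HDFS path, profile, column types, and delimiter are placeholders, not taken from the customer environment):

-- Hypothetical writable external table; SERVER=XXX selects the per-server
-- configuration directory ($PXF_HOME/servers/XXX) containing the hdfs-site.xml above.
CREATE WRITABLE EXTERNAL TABLE sch01.pxf_tbl01_ext (col01 int, col02 text, col03 text)
LOCATION ('pxf://data/sch01/tbl01?PROFILE=hdfs:text&SERVER=XXX')
FORMAT 'TEXT' (DELIMITER ',');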