Loading from external tables error "gpfdist error - line too long in file"
search cancel

Loading from external tables error "gpfdist error - line too long in file"

book

Article ID: 295620

calendar_today

Updated On:

Products

VMware Tanzu Greenplum

Issue/Introduction

Symptoms:

When loading data from external tables with gpfdist, the query fails with the following error:

ERROR: gpfdist error - line too long in file /tmp/1.log near (0 bytes) (url.c:1746) (seg13 slice1 sdw3:43001 pid=23691) (cdbdisp.c:1499)
DETAIL: External table new_test, file gpfdist://mdw:8090/1.log

Environment


Cause

The default row limit for external tables using gpfdist is 32KB as documented. If certain rows are longer than 32KB, the query will error out wit the message "line too long in file".


Here is more information regarding the test case:

msong=# create external table new_test ( a text) location('gpfdist://mdw:8090/1.log') FORMAT 'TEXT' (DELIMITER '|');
CREATE EXTERNAL TABLE
msong=# select * from new_test;
ERROR: gpfdist error - line too long in file /tmp/1.log near (0 bytes) (url.c:1746) (seg13 slice1 sdw3:43001 pid=23691) (cdbdisp.c:1499)
DETAIL: External table new_test, file gpfdist://mdw:8090/1.log
msong=#

gpfdist verbose logs contains the following 500 session error:

ps -ef|grep gpfdist
gpadmin  20727 27150  0 13:37 pts/5    00:00:00 gpfdist -d /tmp -l /tmp/1.log -p 8090 -V
cat /tmp/1.log
[2014-06-18 13:39:14] ::ffff:172.28.8.7 - 500 session error
[44] request end
---------------------------------------------------
[2014-06-18 13:39:14] ::ffff:172.28.8.7 requests /1.log
[2014-06-18 13:39:14] [45] got a request: GET /1.log HTTP/1.1
[2014-06-18 13:39:14] request headers:Host:172.28.8.250:8090
Accept:*/*
X-GP-XID:1402370147-0000043790
X-GP-CID:0
X-GP-SN:0
X-GP-SEGMENT-ID:38
X-GP-SEGMENT-COUNT:48
X-GP-LINE-DELIM-LENGTH:-1
X-GP-PROTO:1
X-GP-MASTER_HOST:172.28.8.250
X-GP-MASTER_PORT:4300
X-GP-CSVOPT:m0x92q0h0
X-GP_SEG_PG_CONF:/data1/primary_4300/gpseg38/postgresql.conf
X-GP_SEG_DATADIR:/data1/primary_4300/gpseg38
X-GP-DATABASE:msong
X-GP-USER:gpadmin
X-GP-SEG-PORT:43002
X-GP-SESSION-ID:83085
[2014-06-18 13:39:14] ::ffff:172.28.8.7 - 500 session error
[45] request end
---------------------------------------------------

Resolution

Use the -m option to increase the max row length for gpfdist. This value can be increased up to 256MB.


For example, increasing the value up to 64KB solved the issue in our test case.

gpfdist -t 600 -d /tmp -l /tmp/1.log  -p 8090 -V -m 655350 &

Note: If using gpload, can pass the -m parameter value to gpfdist using the MAX_LINE_LENGTH parameter in the YAML file.