Query gpfdist External Table Failed with the Message "HTTP/1.0 400 invalid request"
search cancel

Query gpfdist External Table Failed with the Message "HTTP/1.0 400 invalid request"

book

Article ID: 295416

calendar_today

Updated On:

Products

VMware Tanzu Greenplum

Issue/Introduction

Symptoms:

When querying external table via the gpfdist protocol, it fails with error message “HTTP/1.0 400 invalid request.”

Error Message:

1. Query gpfdist based external table and failed:

gpadmin=# create external table ext_tt1(c1 int, c2 int, c3 int) location ('gpfdist://192.0.2.1:8081/BATCH 1/*.csv') format 'CSV';
CREATE EXTERNAL TABLE
Time: 221.612 ms
gpadmin=# select * from ext_tt1;
ERROR: http response code 400 from gpfdist (gpfdist://192.0.2.1:8081/BATCH 1/*.csv): HTTP/1.0 400 invalid request (url_curl.c:454) (seg0 slice1 sdw3:49000 pid=5609) (cdbdisp.c:1322)

2. Logs from the Greenplum segment:

2017-08-14 12:52:42.761391 CDT,"gpadmin","gpadmin",p5609,th-843743168,"192.0.2.1","47178",2017-08-14 12:52:38 CDT,379759,con54,cmd7,seg0,slice1,,x379759,sx1,"LOG","00000","quit fill_buffer due to still_running = 0, bot = 0, top = 0, want = 1, for_write = 0, error = 0, eof = 0, datalen = 0, maxfd = 18, nfds = 1, e = 0",,,,,,"select * from ext_tt1;",0,,"url_curl.c",715,
2017-08-14 12:52:43.828111 CDT,"gpadmin","gpadmin",p5609,th-843743168,"192.0.2.1","47178",2017-08-14 12:52:38 CDT,379759,con54,cmd7,seg0,slice1,,x379759,sx1,"ERROR","XX000","http response code 400 from gpfdist (gpfdist://192.0.2.1:8081/BATCH 1/*.csv): HTTP/1.0 400 invalid request (url_curl.c:454)",,,,,,"select * from ext_tt1;",0,,"url_curl.c",454,"Stack trace:

3. gpfdist logs:

2017-08-14 12:52:24 24311 INFO [0:7:0:9] ::ffff:192.0.2.2 requests /BATCH
2017-08-14 12:52:24 24311 INFO [0:7:0:9] got a request at port 42902:
 GET /BATCH 1/*.csv HTTP/1.1
2017-08-14 12:52:24 24311 INFO [0:7:0:9] request headers:
2017-08-14 12:52:24 24311 INFO [0:7:0:9] Host:192.0.2.1:8081
2017-08-14 12:52:24 24311 INFO [0:7:0:9] Accept:*/*
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-XID:1502731703-0000000045
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-CID:0
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-SN:1
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-SEGMENT-ID:0
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-SEGMENT-COUNT:24
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-LINE-DELIM-LENGTH:-1
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-PROTO:1
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-MASTER_HOST:127.0.0.2
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-MASTER_PORT:54320
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-CSVOPT:m1x34q34h0n0
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP_SEG_PG_CONF:/data1/primary/bloomtest/gpseg0/postgresql.conf
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP_SEG_DATADIR:/data1/primary/bloomtest/gpseg0
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-DATABASE:gpadmin
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-USER:gpadmin
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-SEG-PORT:49000
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-SESSION-ID:54
2017-08-14 12:52:24 24311 INFO [0:12:0:16] reject invalid request from ::ffff:192.0.2.2
2017-08-14 12:52:24 24311 WARN [0:12:0:16] HTTP ERROR: ::ffff:192.0.2.2 - 400 invalid request 

 

Environment


Cause

A whitespace character is included in the LOCATION field in the definition of the external table. As shown in above example whitespace is included in "BATCH 1." 

According to GPDB document wildcards or other C-style pattern matching is accepted by gpfdist for denoting multiple files. And, a whitespace character should be specified as [[:space:]].

 

Resolution

Follow the document to specify whitespace character as necessary. 

For issue in above example, revise DDL of an external table as follows and then it could work:

gpadmin=# create external table ext_tt1(c1 int, c2 int, c3 int) location ('gpfdist://192.0.2.1:8081/BATCH[[:space:]]1/*.csv') format 'CSV';
CREATE EXTERNAL TABLE
gpadmin=# select * from ext_tt1;
 c1 | c2 | c3 
----+----+----
 1 | 2 | 3
(1 row)