Greenplum 6.1 goes into the Recovery Mode just after running query included with Lateral Join
search cancel

Greenplum 6.1 goes into the Recovery Mode just after running query included with Lateral Join

book

Article ID: 297056

calendar_today

Updated On:

Products

VMware Tanzu Greenplum

Issue/Introduction

Queries, including a cross join, laterally fail and the database goes into recovery. The reason for the failure as per the database logs is due to the error message below:

2020-01-20 18:42:08.093437 KST,,,p25459,th0,,,2020-01-20 18:42:08 KST,0,con22,cmd2,seg4,slice2,,,,"PANIC","XX000","Unexpected internal error: Segment process received signal SIGSEGV",,,,,,,0,,,,"1    0x7fba332725d0 libpthread.so.0 <symbol not found> + 0x332725d0
2    0x6cfcee postgres <symbol not found> (execQual.c:2162)
3    0x6d5992 postgres ExecProject (execQual.c:6333)
4    0x6c8558 postgres ExecProcNode (execProcnode.c:1001)
5    0x6f5f20 postgres ExecSort (tuptable.h:159)
6    0x6c8418 postgres ExecProcNode (execProcnode.c:1089)
7    0x6ebb68 postgres ExecLimit (tuptable.h:159)
8    0x6c83a8 postgres ExecProcNode (execProcnode.c:1117)
9    0x706630 postgres ExecMotion (tuptable.h:159)
10   0x6c8398 postgres ExecProcNode (execProcnode.c:1121)
11   0x6bf8c9 postgres <symbol not found> (tuptable.h:159)
12   0x6c05f3 postgres standard_ExecutorRun (execMain.c:951)
13   0x88e4a7 postgres <symbol not found> (pquery.c:1150)
14   0x890451 postgres PortalRun (pquery.c:997)
15   0x88888e postgres <symbol not found> (postgres.c:1377)
16   0x88d4c9 postgres PostgresMain (postgres.c:5361)
"
2020-01-20 18:42:08.093568 KST,,,p25374,th919230592,,,,0,,,seg4,,,,,"LOG","00000","server process (PID 25459) was terminated by signal 11: Segmentation fault","Failed process was running: select t1.c1
	   ,t2.c1
	   ,t2.dist
from test1 t1
cross join lateral
( select
	c1
	,t1.c2 <-> t2.c2 as dist
  from test2 t2
  order by dist
  limit 1
) as t2

This article covers the cause of this issue, how can it be fixed, and how can it be prevented in any future occurrences.

Queries, including cross lateral joins, fail and the following error appears if the distance operator ( <-> ) of PostGIS is included when enabling optimizer = on. Eventually, the database goes into recovery mode and then all running sessions are aborted.

Warning : The distributed transaction 'Abort' broadcast failed to one or more segment for gid = xxx.
Notice: Releasing segworker groups to finish aborting the transaction.
Detail : This probably means the server terminated abnormally before or while processing the request.


Refer to the steps below to reproduce this issue:

1. Create database:

$ createdb testdb


2. Create tables:

$ vi crossjoin_postgis_error_case_1.sql

CREATE TABLE public.test1 (
	c1 text NULL,
	c2 geometry NULL
);

CREATE TABLE public.test2 (
	c1 text NULL,
	c2 geometry NULL
);

INSERT INTO public.test1 (c1,c2) VALUES
('111','POINT(52.3551151038034 4.87660185549649)')
,('111','POINT(52.3551151038034 4.87660185549649)')
,('111','POINT(52.3551151038034 4.87660185549649)')
;

INSERT INTO public.test2 (c1,c2) VALUES
('222','POINT(52.3549985339677 4.87622001617791)')
,('222','POINT(52.3549985339677 4.87622001617791)')
,('222','POINT(52.3549985339677 4.87622001617791)')
,('222','POINT(52.3549985339677 4.87622001617791)')
,('222','POINT(52.3549985339677 4.87622001617791)')
,('222','POINT(52.3549985339677 4.87622001617791)')
,('222','POINT(52.3549985339677 4.87622001617791)')
,('222','POINT(52.3549985339677 4.87622001617791)')
,('222','POINT(52.3549985339677 4.87622001617791)')
,('222','POINT(52.3549985339677 4.87622001617791)')
;

$ psql -d testdb -f crossjoin_postgis_error_case_1.sql


3. Run queries to reproduce issue:

$ vi crossjoin_postgis_error_case_2.sql
select t1.c1
	   ,t2.c1
	   ,t2.dist
from test1 t1
cross join lateral
( select
	c1
	,t1.c2 <-> t2.c2 as dist
  from test2 t2
  order by dist
  limit 1
) as t2
;

$ psql -d testdb -f ./crossjoin_postgis_error_case_2.sql
psql:./crossjoin_postgis_error_case_2.sql:13: ERROR: Error on receive from 
seg4 slice2 192.168.0.65:6000 pid=25459: server closed the connection 
unexpectedly DETAIL: This probably means the server terminated abnormally 
before or while processing the request.


Greenplum does not support the LATERAL Join Type as per Greenplum and PostgreSQL Compatibility described at the following link [1].

[1] Summary of Greenplum Features

Environment

Product Version: 6.1
OS: RHEL and CentOS 7.x

Resolution

Greenplum does not support the LATERAL Join Type

To avoid this issue in the above examplem remove the "limit 1" in the select query. However, it may result in different dataset.

Re-write the queries with a different method supported by Greenplum or engage Data Professional Service if further assistance is needed.

Greenplum 6.5.0 and above allows the query to fail gracefully and avoid the panic.