Could not write to temporary file: No space left on device

Article ID: 295384

Products

VMware Tanzu Greenplum, Pivotal Data Suite Non Production Edition, VMware Tanzu Data Suite, VMware Tanzu Data Suite Greenplum

Issue/Introduction

When performing a hash join against very large datasets, the following errors are produced:

- Extreme skew in the innerside of Hashjoin
- could not write to temporary file: No space left on device

Error Messages from pg_log:

"Extreme skew in the innerside of Hashjoin, nbatch 32, mintuples 6672, maxtuples 1258717",,,,,,"
create table summary_topic_tmp
with (appendonly=true, orientation=parquet, compresstype=snappy) as
select t2.reader_id, t3.topic_id, sum(t2.pv) as pv from
(
 select st.reader_id, st.vertical_id, sum(st.pv) as pv from
 (
 select date, reader_id, vertical_id, pv from daily_201609
 union
 select date, reader_id, vertical_id, pv from daily_201608
 union
 select date, reader_id, vertical_id, pv from daily_201607
 union
 select date, reader_id, vertical_id, pv from daily_201606
 ) st
 where st.date between '2016-06-26' and '2016-09-24'
 group by st.reader_id, st.vertical_id
) t2
inner join table02 t3
on t2.vertical_id = t3.vertical_id
group by t2.reader_id, t3.topic_id;
:
:

"ERROR","58030","could not write to temporary file: No space left on device",,,,,,"

Environment


Cause

The errors indicate that the statistics for some or all of the tables used in the query are not up to date. Stale statistics led to extreme data skew and, in turn, to large spill files on one or more segment nodes, which caused the "No space left on device" error.
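As a quick check, row distribution across segments can be inspected with the built-in gp_segment_id system column; one segment holding far more rows than the others indicates skew. A minimal sketch, using daily_201609 (one of the tables from the query above) as the example:

-- Count rows per segment; a disproportionately high count on one
-- segment indicates data skew for this table.
SELECT gp_segment_id, count(*) AS row_count
FROM daily_201609
GROUP BY gp_segment_id
ORDER BY row_count DESC;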

Analyze the EXPLAIN plan generated by the Greenplum database for the query. If some slices show a very low number of rows, or rows=1, it indicates that the table statistics may not be up to date.

:
-> Table Scan on daily_201609 (cost=0.00..431.00 rows=1 width=28) 
Filter: date >= '2016-06-26'::date AND date <= '2016-09-24'::date
:
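It can also help to confirm when the tables were last vacuumed and analyzed. A sketch using Greenplum's pg_stat_last_operation catalog, again with daily_201609 as the example table:

-- Most recent VACUUM/ANALYZE operations recorded for the table.
SELECT c.relname, o.staactionname, o.statime
FROM pg_stat_last_operation o
JOIN pg_class c ON c.oid = o.objid
WHERE c.relname = 'daily_201609'
  AND o.staactionname IN ('VACUUM', 'ANALYZE')
ORDER BY o.statime DESC;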

Resolution

Fix

Run ANALYZE against all the tables referenced in the query.
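For the query shown above, that means each of the source tables and the join table, for example:

ANALYZE daily_201606;
ANALYZE daily_201607;
ANALYZE daily_201608;
ANALYZE daily_201609;
ANALYZE table02;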

It is also advisable to run VACUUM on each of the tables before ANALYZE, so that statistics are not collected over dead rows.
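A minimal per-table sequence (VACUUM ANALYZE combines both steps into a single statement):

-- Reclaim space from dead rows first, then refresh optimizer statistics.
VACUUM daily_201609;
ANALYZE daily_201609;

-- Equivalent single statement:
VACUUM ANALYZE daily_201609;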