Append-optimized, column-oriented tables store columns individually. Each column file contains certain metadata which assist index scan etc. If the metadata between different columns become inconsistent, that could lead to error or wrong result, depending on whether the table has index or not:
For table without index, the analyze error message as below:
gpadmin=# analyze t2;
ERROR: Unexpected result was returned when getting AO block info for table 't2', column 0, targrow 104. (aocsam.c:1404) (seg3 slice1 xxx.xxx.xxx.xxx:6752 pid=3750) (aocsam.c:1404)
gpdb 7.1.0 - 7.5.4
Greenplum 7.1 optimized ALTER COLUMN [TYPE | SET ENCODING] operation for append-optimized, column-oriented tables by avoiding full table rewrite, instead rewriting just the column being altered. However, there is a bug where the altered column could have inconsistent block headers with other columns in the table. This issue has been fixed in 7.5.5, so any column-oriented table altered in newer versions would not be impacted. But any column-oriented tables that were altered between 7.1.0 and 7.5.4 would need to be detected and fixed.
Rewriting the impacted table will re-align the metadata in all columns. Use “ALTER TABLE … SET WITH (REORGANIZE)” command to do that. Note that, if the table is large, rewriting can take some time.
Use the attached script to find all append-optimized, column-oriented tables that have done ALTER COLUMN [TYPE|SET ENCODING] operation since table creation or the last table rewrite time. This initial filtering helps determine the next action to take. The tables are categorized into two groups:
We strongly recommend you fix all tables. If the number of tables is large, prioritize fixing Category A tables, and move on to fix Category B later.
The detailed fix steps follows below:
Note: if many leaf partition tables that belong to the same partition hierarchy need to be fixed, consider only reorganize the root partition if the partition hierarchy is all column-oriented tables. Refer to the root partition name displayed in the csv files.
psql postgres -f reorg_cmds_category_a.sql
psql postgres -f reorg_cmds_category_b.sql
gpcheckcat -A -R 'co_blkdir_rownums'
Directly add execution permission and run this script will generate check result and fix script at the current path:
#!/bin/bash
set -u
if [[ -z "${PGPORT:-}" ]]; then
echo "Please set PGPORT"
exit 1
fi
# output files
FIND_OUT1="category_a.csv"
FIND_OUT2="category_b.csv"
GEN_OUT1="reorg_cmds_category_a.sql"
GEN_OUT2="reorg_cmds_category_b.sql"
PSQL="psql -X -v ON_ERROR_STOP=1 -At"
# get available databases
DBS="$($PSQL -d postgres -c "SELECT datname FROM pg_database
WHERE datallowconn
ORDER BY 1;")"
[[ -n "$DBS" ]] || { echo "No databases to process."; exit 0; }
# start new output files
echo "dbname|tablename|rootpartition" > "$FIND_OUT1"
echo "dbname|tablename|rootpartition" > "$FIND_OUT2"
echo "" > "$GEN_OUT1"
echo "" > "$GEN_OUT2"
# the core query to find all potentially impacted tables
SUBQUERY="
SELECT ns.nspname, cls.relname,
MIN(apo.blkdirrelid) > 0 AS hasblkdir,
pg_partition_root(cls.oid) AS rootpartition
FROM pg_attribute_encoding enc
JOIN pg_attribute a
ON a.attrelid = enc.attrelid AND a.attnum = enc.attnum
JOIN pg_class cls
ON enc.attrelid = cls.oid
JOIN pg_namespace ns
ON cls.relnamespace = ns.oid
JOIN pg_appendonly apo
ON enc.attrelid = apo.relid
JOIN pg_am am
ON cls.relam = am.oid
WHERE am.amname = 'ao_column'
AND cls.relkind = 'r'
AND a.attisdropped = 'f'
AND a.atthasmissing = 'f'
AND enc.lastrownums is null
GROUP BY ns.nspname,cls.oid,cls.relname
HAVING COUNT(DISTINCT enc.xmin::text) > 1
"
for db in $DBS; do
echo "[*] checking database $db..."
# find tables with block directory (category A)
psql -X -d "$db" -v ON_ERROR_STOP=1 -c \
"COPY (
SELECT current_database() as dbname,
quote_ident(q.nspname) || '.' || quote_ident(q.relname) AS tablename,
rootpartition::text
FROM ($SUBQUERY)q
WHERE hasblkdir
)
TO STDOUT WITH (FORMAT CSV, DELIMITER '|', ENCODING 'UTF8')" >> $FIND_OUT1
# find tables without block directory (category B)
psql -X -d "$db" -v ON_ERROR_STOP=1 -c \
"COPY (
SELECT current_database() as dbname,
quote_ident(q.nspname) || '.' || quote_ident(q.relname) AS tablename,
rootpartition::text
FROM ($SUBQUERY)q
WHERE hasblkdir = 'f'
)
TO STDOUT WITH (FORMAT CSV, DELIMITER '|', ENCODING 'UTF8')" >> $FIND_OUT2
# generate REORGANIZE script for tables in category A
echo "\c $db" >> $GEN_OUT1
psql -X -d "$db" -v ON_ERROR_STOP=1 -At -c \
"WITH q AS ($SUBQUERY)
SELECT 'ALTER TABLE ' || quote_ident(q.nspname) || '.' || quote_ident(q.relname)|| ' SET WITH (REORGANIZE);'
FROM q
LEFT JOIN pg_class c ON q.rootpartition = c.oid
LEFT JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE hasblkdir;" >> $GEN_OUT1
# generate REORGANIZE script for tables in category B
echo "\c $db" >> $GEN_OUT2
psql -X -d "$db" -v ON_ERROR_STOP=1 -At -c \
"WITH q AS ($SUBQUERY)
SELECT 'ALTER TABLE ' || quote_ident(q.nspname) || '.' || quote_ident(q.relname)|| ' SET WITH (REORGANIZE);'
FROM q
LEFT JOIN pg_class c ON q.rootpartition = c.oid
LEFT JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE hasblkdir = 'f';" >> $GEN_OUT2
done
echo "---------------------------------------------"
echo "Done. "
echo "Tables with index appended to: $FIND_OUT1"
echo "Tables without index appended to: $FIND_OUT2"
echo "Fix commands for tables with index appended to: $GEN_OUT1"
echo "Fix commands for tables without index appended to: $GEN_OUT2"
echo "---------------------------------------------"