In an exercise earlier in this lesson, you learned that
DELETE statements will not reduce the space a table occupies on disk. Because of the way that
VACUUM works, even when using
VACUUM, there’s no guarantee that all the deleted tuples will be returned to disk. So far, the only way that we’ve discussed to completely reduce the disk usage of a table to the minimum required space is by calling
VACUUM FULL. However, in some special cases, a
VACUUM FULL can be avoided.
Occasionally, you may need to remove all the rows, but retain the structure of a table. In this situation, it wouldn’t be ideal to drop the table and re-populate it with data, so you might opt for what’s called an unqualified delete — a
DELETE that affects all rows (e.g.
DELETE * FROM table WHERE true;). In large tables, an unqualified delete can be quite expensive, as it still requires scanning all rows.
In these situations, one common solution is to use
TRUNCATE quickly removes all rows from a table. It has the same effect as an unqualified delete, but since PostgreSQL doesn’t scan through the table first,
TRUNCATE runs much faster on large tables. Finally,
TRUNCATE simultaneously reclaims disk space immediately, rather than requiring a subsequent
VACCUM FULL operation.
In the event that you need to clear a table, the following two sets of statements are equivalent, however the
TRUNCATE uses fewer system resources and packs the logic of multiple statements into one. Consider the following example from a table with a 500MB table containing ~12 million randomly generated values.
SELECT pg_size_pretty( pg_total_relation_size('rand') ) as total_size;
The following two queries take about 20 seconds total, ~17.5s to scan the table and perform the
DELETE and another 2.5s to
VACUUM FULL the dead tuples the delete left behind.
DELETE * FROM rand WHERE true; VACUUM FULL rand;
On the other hand, when we use
TRUNCATE rand instead, the query runs for about 150 milliseconds.
SELECT pg_size_pretty(pg_table_size('rand')) as table_size, pg_size_pretty(pg_indexes_size('rand')) as index_size, pg_size_pretty(pg_total_relation_size('rand')) as total_size;
|0 bytes||0 bytes||0 bytes|
We’re left with the same result using both
DELETE, however the
TRUNCATE statement used far fewer system resources and completed far faster than the
SELECT pg_size_pretty( pg_total_relation_size('mock.current_day_logins') ) as total_size;
Write the SQL to
Write a SQL statement that returns the disk utilization of the table’s data, the table’s indexes, and the total relation size in three separate columns. You don’t need to worry about ordering or naming the columns in your query. Are these results what you’d expect?