The performance of a PostgreSQL database can be compromised by dead tuples, since they continue to occupy space and can lead to bloat. We provided an introduction to VACUUM and bloat in an earlier blog post. Now, though, it's time to look at autovacuum for postgres, and the internals you need to know to maintain the high-performance PostgreSQL database that demanding applications require.
What is autovacuum?
Autovacuum is one of the background utility processes that starts automatically when you start PostgreSQL. As you can see in the following log, the postmaster (the parent PostgreSQL process) with pid 2862 has started the autovacuum launcher process with pid 2868. To enable autovacuum, the parameter autovacuum must be set to ON. In fact, you should not set it to OFF on a production system unless you are 100% sure about what you are doing and its implications.
avi@percona:~$ ps -eaf | egrep "/post|autovacuum"
postgres  2862     1  0 Jun17 pts/0    00:00:11 /usr/pgsql-10/bin/postgres -D /var/lib/pgsql/10/data
postgres  2868  2862  0 Jun17 ?        00:00:10 postgres: autovacuum launcher process
postgres 15427  4398  0 18:35 pts/1    00:00:00 grep -E --color=auto /post|autovacuum
Why is autovacuum needed?
We need VACUUM to remove dead tuples, so that the space they occupy can be re-used by the table for future inserts and updates. To know more about dead tuples and bloat, please read our previous blog post. We also need ANALYZE, which updates the table statistics so that the optimizer can choose an optimal execution plan for a SQL statement. It is autovacuum in postgres that is responsible for performing both vacuum and analyze on tables.
There exists another background process in postgres called the stats collector, which tracks usage and activity information. The autovacuum launcher uses the information collected by this process to identify the list of candidate tables for autovacuum. PostgreSQL identifies the tables needing a vacuum or analyze automatically, but only when autovacuum is enabled. This ensures that postgres heals itself and stops the database from developing more bloat/fragmentation.
The parameters needed to enable autovacuum in PostgreSQL are:
autovacuum = on    # ( ON by default )
track_counts = on  # ( ON by default )
track_counts is used by the stats collector. Without it in place, autovacuum cannot identify the candidate tables.
Eventually, you may want to log the tables on which autovacuum spends more time. In that case, set the parameter log_autovacuum_min_duration to a value (interpreted as milliseconds when no unit is given), so that any autovacuum that runs for longer than this value is logged to the PostgreSQL log file. This may help you tune your table-level autovacuum settings appropriately.
# Setting this parameter to 0 logs every autovacuum to the log file.
log_autovacuum_min_duration = '250ms' # Or 1s, 1min, 1h, 1d
Here is an example log of an autovacuum vacuum and analyze:
< 2018-08-06 07:22:35.040 EDT > LOG:  automatic vacuum of table "vactest.scott.employee": index scans: 0
    pages: 0 removed, 1190 remain, 0 skipped due to pins, 0 skipped frozen
    tuples: 110008 removed, 110008 remain, 0 are dead but not yet removable
    buffer usage: 2402 hits, 2 misses, 0 dirtied
    avg read rate: 0.057 MB/s, avg write rate: 0.000 MB/s
    system usage: CPU 0.00s/0.02u sec elapsed 0.27 sec
< 2018-08-06 07:22:35.199 EDT > LOG:  automatic analyze of table "vactest.scott.employee"
    system usage: CPU 0.00s/0.02u sec elapsed 0.15 sec
When does PostgreSQL run autovacuum on a table?
As discussed earlier, autovacuum in postgres refers to both automatic VACUUM and ANALYZE, not just VACUUM. Whether an automatic vacuum or analyze runs on a table depends on the following formulas.
The formula for calculating the effective table-level autovacuum threshold is:
Autovacuum VACUUM threshold for a table = autovacuum_vacuum_scale_factor * number of tuples + autovacuum_vacuum_threshold
The equation above makes it clear that if the actual number of dead tuples in a table (accumulated through updates and deletes) exceeds this effective threshold, the table becomes a candidate for an autovacuum VACUUM.
Autovacuum ANALYZE threshold for a table = autovacuum_analyze_scale_factor * number of tuples + autovacuum_analyze_threshold
The above equation says that any table whose total number of inserts/deletes/updates since the last analyze exceeds this threshold is eligible for an autovacuum ANALYZE.
Let’s understand these parameters in detail.
autovacuum_analyze_scale_factor: the fraction of the table's records that is added to the formula. For example, a value of 0.2 equals 20% of the table's records.
autovacuum_analyze_threshold: the minimum number of obsolete records or DMLs needed to trigger an autovacuum.
Let’s consider a table: percona.employee with 1000 records and the following autovacuum parameters.
autovacuum_vacuum_scale_factor = 0.2
autovacuum_vacuum_threshold = 50
autovacuum_analyze_scale_factor = 0.1
autovacuum_analyze_threshold = 50
Using the mathematical formulas above as a reference:

Table percona.employee becomes a candidate for autovacuum VACUUM when:
Total number of obsolete records = (0.2 * 1000) + 50 = 250

Table percona.employee becomes a candidate for autovacuum ANALYZE when:
Total number of inserts/deletes/updates = (0.1 * 1000) + 50 = 150
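The arithmetic above can be sketched in a few lines of Python. This is purely illustrative: the function names are mine, but the formulas are the ones documented above, with the example settings for percona.employee.

```python
# Illustrative sketch of the autovacuum trigger formulas described above.
# Function names are invented for this example; the math mirrors the
# documented PostgreSQL behaviour.

def vacuum_threshold(n_tuples, scale_factor=0.2, threshold=50):
    """Dead tuples needed before the table is a VACUUM candidate."""
    return scale_factor * n_tuples + threshold

def analyze_threshold(n_tuples, scale_factor=0.1, threshold=50):
    """Inserts/updates/deletes needed before the table is an ANALYZE candidate."""
    return scale_factor * n_tuples + threshold

# percona.employee with 1000 records and the settings above:
print(vacuum_threshold(1000))   # 250.0 obsolete records
print(analyze_threshold(1000))  # 150.0 DMLs
```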
Tuning Autovacuum in PostgreSQL
We need to understand that these are global settings, applicable to all the databases in the instance. This means that, regardless of the table size, once the threshold computed by the formula above is crossed, a table becomes eligible for an autovacuum vacuum or analyze.
Is this a problem?
Consider a table with ten records versus a table with a million records. Even though the table with a million records may be involved in transactions far more often, the frequency at which a vacuum or an analyze runs automatically could be greater for the table with just ten records.
Consequently, PostgreSQL allows you to configure autovacuum settings at the individual table level, and these override the global settings.
ALTER TABLE scott.employee SET (autovacuum_vacuum_scale_factor = 0, autovacuum_vacuum_threshold = 100);
Output Log
----------
avi@percona:~$ psql -d percona
psql (10.4)
Type "help" for help.

percona=# ALTER TABLE scott.employee SET (autovacuum_vacuum_scale_factor = 0, autovacuum_vacuum_threshold = 100);
ALTER TABLE
With the above setting, autovacuum vacuum runs on the table scott.employee only once there are more than 100 obsolete records.
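To see the effect of that override, here is a small Python sketch (illustrative only; the function name is mine) showing that with autovacuum_vacuum_scale_factor = 0 the trigger point no longer depends on the table size:

```python
# With a per-table override of scale_factor = 0 and threshold = 100,
# the table size drops out of the formula: the trigger point is a
# flat 100 obsolete records, whether the table has a thousand rows
# or a million.

def effective_vacuum_threshold(n_tuples, scale_factor, threshold):
    return scale_factor * n_tuples + threshold

for n in (1_000, 1_000_000):
    print(n, effective_vacuum_threshold(n, 0, 100))  # always 100
```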
How do we identify the tables that need their autovacuum settings tuned?
In order to tune autovacuum for tables individually, you must know the number of inserts/deletes/updates on a table over an interval. You can use the postgres catalog view pg_stat_user_tables to get that information.
percona=# SELECT n_tup_ins as "inserts", n_tup_upd as "updates", n_tup_del as "deletes",
          n_live_tup as "live_tuples", n_dead_tup as "dead_tuples"
          FROM pg_stat_user_tables
          WHERE schemaname = 'scott' and relname = 'employee';
 inserts | updates | deletes | live_tuples | dead_tuples
---------+---------+---------+-------------+-------------
      30 |      40 |       9 |          21 |          39
(1 row)
As observed in the above log, taking a snapshot of this data for a certain interval should help you understand the frequency of DMLs on each table. In turn, this should help you with tuning your autovacuum settings for individual tables.
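For example, here is a sketch of that snapshot-delta idea. The counter values are invented for illustration; in practice they would come from two timed queries against pg_stat_user_tables:

```python
# Hypothetical example: two snapshots of pg_stat_user_tables counters for
# one table, taken an hour apart. The deltas give the DML rate for that
# interval, which is what table-level autovacuum tuning needs.
snap_0900 = {"n_tup_ins": 30, "n_tup_upd": 40, "n_tup_del": 9}
snap_1000 = {"n_tup_ins": 530, "n_tup_upd": 1240, "n_tup_del": 210}

deltas = {k: snap_1000[k] - snap_0900[k] for k in snap_0900}
dml_per_hour = sum(deltas.values())
print(deltas)        # per-counter change over the interval
print(dml_per_hour)  # total DMLs in the hour
```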
How many autovacuum processes can run at a time?
There cannot be more than autovacuum_max_workers autovacuum processes running at a time across the instance/cluster, which may contain more than one database. The autovacuum launcher background process starts a worker process for a table that needs a vacuum or an analyze. If there are four databases and autovacuum_max_workers is set to 3, the fourth database has to wait until one of the existing worker processes becomes free.
Before starting the next autovacuum worker, the launcher waits for autovacuum_naptime, which defaults to 1 minute on most versions. If you have three databases, the next autovacuum starts after 60/3 seconds. So the wait time before starting the next autovacuum is always (autovacuum_naptime / N), where N is the total number of databases in the instance.
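The scheduling rule above can be expressed as a one-line calculation (illustrative only; the function name is mine):

```python
# Sketch of the launcher scheduling described above: with N databases in
# the instance, a new worker is considered every autovacuum_naptime / N
# seconds.
autovacuum_naptime = 60  # seconds (the default)

def wakeup_interval(n_databases):
    return autovacuum_naptime / n_databases

print(wakeup_interval(3))  # with three databases: a launch every 20 seconds
```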
Does increasing autovacuum_max_workers alone increase the number of autovacuum processes that can run in parallel ?
No. The reason is explained in the next section.
Is VACUUM IO intensive?
Autovacuum can be considered a cleanup operation. As discussed earlier, there is one worker process per table. Autovacuum reads 8KB (the default block_size) pages of a table from disk and modifies/writes the pages containing dead tuples. This involves both read and write IO, so it can be an IO-intensive operation when autovacuum runs on a huge table with many dead tuples during peak transaction time. To avoid this issue, a few parameters are available to minimize the IO impact of vacuum.
The following are the parameters used to tune autovacuum IO:
- autovacuum_vacuum_cost_limit: the total cost limit autovacuum can accrue, combined across all autovacuum jobs.
- autovacuum_vacuum_cost_delay: the number of milliseconds autovacuum sleeps once a cleanup run has accumulated autovacuum_vacuum_cost_limit worth of cost.
- vacuum_cost_page_hit: the cost of reading a page that is already in shared buffers and doesn't need a disk read.
- vacuum_cost_page_miss: the cost of fetching a page that is not in shared buffers.
- vacuum_cost_page_dirty: the cost of writing to each page when dead tuples are found in it.
Default values for the parameters discussed above
-------------------------------------------------
autovacuum_vacuum_cost_limit = -1   (so it defaults to vacuum_cost_limit, i.e. 200)
autovacuum_vacuum_cost_delay = 20ms
vacuum_cost_page_hit = 1
vacuum_cost_page_miss = 10
vacuum_cost_page_dirty = 20
Consider autovacuum VACUUM running on the table percona.employee.
Let’s imagine what can happen in 1 second. (1 second = 1000 milliseconds)
In a best-case scenario where read latency is 0 milliseconds, autovacuum can wake up and go back to sleep 50 times (1000 milliseconds / 20 ms), because the delay between wake-ups needs to be 20 milliseconds.
1 second = 1000 milliseconds = 50 * autovacuum_vacuum_cost_delay
Since the cost of reading a page that is in shared_buffers is 1, 200 pages can be read in each wake-up, and 50 * 200 pages in 50 wake-ups.
If all the pages with dead tuples are found in shared buffers, then with an autovacuum_vacuum_cost_delay of 20 ms, autovacuum can read ((200 / vacuum_cost_page_hit) * 8) KB in each round, after which it must wait for autovacuum_vacuum_cost_delay.
Thus, at most, an autovacuum can read: 50 * 200 * 8 KB = 78.13 MB per second (if the blocks are already in shared_buffers), considering a block_size of 8192 bytes.
If the blocks are not in shared buffers and need to be fetched from disk, an autovacuum can read: 50 * ((200 / vacuum_cost_page_miss) * 8) KB = 7.81 MB per second.
All the information we have seen above is for read IO.
Now, in order to delete dead tuples from a page/block, the cost of a write operation is vacuum_cost_page_dirty, set to 20 by default.
At most, an autovacuum can write/dirty: 50 * ((200 / vacuum_cost_page_dirty) * 8) KB = 3.9 MB per second.
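All of the throughput figures above follow from the same formula. Here is an illustrative Python sketch that reproduces them from the default cost settings (the function name is mine; the constants are the defaults listed earlier):

```python
# Reproducing the best-case IO throughput arithmetic above, using the
# default cost settings. Each "round" accrues up to cost_limit worth of
# page costs, then sleeps for cost_delay_ms.
block_size_kb = 8
cost_limit = 200    # autovacuum_vacuum_cost_limit (via vacuum_cost_limit)
cost_delay_ms = 20  # autovacuum_vacuum_cost_delay

def mb_per_sec(page_cost):
    rounds_per_sec = 1000 / cost_delay_ms     # 50 wake-ups per second
    pages_per_round = cost_limit / page_cost  # pages affordable per round
    return rounds_per_sec * pages_per_round * block_size_kb / 1024

print(mb_per_sec(1))   # page hits in shared_buffers: ~78.1 MB/s
print(mb_per_sec(10))  # pages fetched from disk:     ~7.8 MB/s
print(mb_per_sec(20))  # pages dirtied (writes):      ~3.9 MB/s
```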
Generally, this cost limit is equally divided among all of the autovacuum_max_workers autovacuum processes running in the instance. So, increasing autovacuum_max_workers may delay autovacuum execution for the currently running autovacuum workers, and increasing autovacuum_vacuum_cost_limit may cause IO bottlenecks. An important point to note is that this behaviour can be overridden by setting the storage parameters of individual tables, which then take precedence over the global settings.
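Since the cost budget is shared, per-worker throughput drops as more workers run concurrently. Here is a hypothetical sketch of that division (assuming the budget is split evenly among active workers, as described above; the function name is mine):

```python
# The cost limit is shared across active autovacuum workers: with more
# workers running, each one gets a smaller slice of the budget, so each
# individual worker dirties pages more slowly.
cost_limit = 200
cost_delay_ms = 20
block_size_kb = 8

def per_worker_dirty_mb_per_sec(active_workers, page_dirty_cost=20):
    per_worker_limit = cost_limit / active_workers
    rounds_per_sec = 1000 / cost_delay_ms
    return rounds_per_sec * (per_worker_limit / page_dirty_cost) * block_size_kb / 1024

print(per_worker_dirty_mb_per_sec(1))  # one worker gets the full budget
print(per_worker_dirty_mb_per_sec(3))  # three workers each get a third
```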
postgres=# alter table percona.employee set (autovacuum_vacuum_cost_limit = 500);
ALTER TABLE
postgres=# alter table percona.employee set (autovacuum_vacuum_cost_delay = 10);
ALTER TABLE
postgres=# \d+ percona.employee
                     Table "percona.employee"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
--------+---------+-----------+----------+---------+---------+--------------+-------------
 id     | integer |           |          |         | plain   |              |
Options: autovacuum_vacuum_threshold=10000, autovacuum_vacuum_cost_limit=500, autovacuum_vacuum_cost_delay=10
Thus, on a busy OLTP database, always have a strategy to run manual VACUUM on the tables that are frequently hit with DMLs, during a low-traffic window. When you run vacuum manually, you can run as many parallel vacuum jobs as you like, after setting the relevant cost parameters. For this reason, a scheduled manual vacuum job is always recommended alongside finely tuned autovacuum settings.
The post Tuning Autovacuum in PostgreSQL and Autovacuum Internals appeared first on Percona Database Performance Blog.