
Fetch First Rows Just Got Faster

Many applications need to paginate rows fetched from the database, or at least retrieve the first N rows. In most cases the data needs to be returned in some kind of order too. If you are an old-school developer, then you are likely to use a variation on this theme:

select * from
( select *
from the_table
order by object_id )
where rownum <= 10;

It’s not pretty, but it is effective. In fact, it is very effective if the column (or columns) in the ORDER BY have a b-tree index. Oracle reads the index entries in order so that it can avoid having to sort the entire result set. This can speed things up very considerably.
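As a rough illustration of the index 'trick', the sketch below creates a b-tree index on the ordering column and displays the plan for the ROWNUM query. The index name is hypothetical, and the sketch assumes object_id is declared NOT NULL (otherwise the index alone cannot return every row in order). The plan you actually get will depend on your data and statistics.

```sql
-- Hypothetical index name; assumes the_table.object_id is NOT NULL.
create index the_table_object_id_ix on the_table (object_id);

explain plan for
select * from
( select *
  from the_table
  order by object_id )
where rownum <= 10;

-- Look for an index scan feeding a COUNT STOPKEY step rather than a large SORT ORDER BY.
select * from table(dbms_xplan.display);
```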

If you are new-school, then this is what you probably use instead:

select *
from the_table
order by object_id
fetch first 10 rows only;

This is much prettier, but I’m afraid it has not always been as effective. The snag has been that the optimizer did not always cost this type of query correctly. This meant that the index access ‘trick’ was sometimes missed, potentially resulting in a large sort.

Now for the good news: a one-off patch (22174392) is available, and it works with OFFSET too.
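For reference, a pagination query using OFFSET might look like the sketch below, which skips the first 10 rows and returns the next 10. The offset and page size are arbitrary values chosen for illustration.

```sql
-- Second "page" of 10 rows, ordered by object_id
select *
from the_table
order by object_id
offset 10 rows
fetch next 10 rows only;
```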

This change can have a dramatic effect on the performance for these types of queries, so it’s well worth a look if you have FETCH FIRST ROWS queries.

I’ve uploaded a demo with before and after examples to GitHub.

Comments welcome!

How to Gather Optimizer Statistics Fast!

There are a number of ways to speed up the process of gathering optimizer statistics, but I’m not sure that it’s common knowledge just how much of an effect some simple changes can make. If you have been asking yourself, “why is stats gathering taking so long and what can I do about it?”, then this post is for you.

If you are already familiar with the different methods of gathering optimizer statistics, you might want to jump to the end of this post where I compare them and make some recommendations.

Overview

The perception that it’s difficult to speed up statistics gathering has sometimes motivated DBAs to manipulate the number of rows sampled on a table-by-table basis using the ESTIMATE_PERCENT parameter (or DBMS_STATS preference). For example, large tables may have ESTIMATE_PERCENT set to 1% and small tables, 100%. Legacy scripts play their part too: some systems are still using procedures established before the performance enhancements available with auto sample size. One of the reasons we recommend ESTIMATE_PERCENT=>AUTO_SAMPLE_SIZE is that it includes number of distinct values (NDV) optimizations that yield high performance and accurate statistics. Gathering statistics using a 1% sample of rows might complete very quickly, but inaccurate statistics are the likely result, along with sub-optimal SQL execution plans.
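If you suspect that legacy scripts have left table-level ESTIMATE_PERCENT overrides behind, something like the following sketch can help to find and remove them. BIG_TABLE is only an example table name, and the queries assume you have access to the DBA views.

```sql
-- Tables with a table-level ESTIMATE_PERCENT preference
select owner, table_name, preference_value
from   dba_tab_stat_prefs
where  preference_name = 'ESTIMATE_PERCENT';

-- Remove the override for one (example) table so it falls back to the global setting
exec dbms_stats.delete_table_prefs(user, 'BIG_TABLE', 'ESTIMATE_PERCENT')

-- Check the global setting
select dbms_stats.get_prefs('ESTIMATE_PERCENT') as global_estimate_percent from dual;
```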

Instead of manipulating ESTIMATE_PERCENT, the time taken to gather statistics can be reduced by using more machine resources. This post compares some before-and-after scenarios to demonstrate how you can do this. Fully worked examples are available in GitHub.

I will concentrate on using automatic optimizer statistics gathering, but the lessons are broadly applicable to manual statistics gathering too (there’s an example at the end of the post). The examples are intended for use on Oracle Database 12c and Oracle Database 18c. The same techniques are applicable to Oracle Database 11g, but note that the resource consumer groups have different names in that release.

Option #1 – Default Statistics Gathering

Consider the following trace of CPU consumption over time: 

[Figure: CPU, simple gather]

It shows my CPU utilization while the automatic statistics gathering job is running and there’s not much else happening on the system. Notice that about 75% of the CPU is not utilized. This fact is easy to understand once you know that the environment has a 4-core CPU with one thread per core. By default, statistics gathering uses a single process (with a single worker-thread) and this will utilize the processing power of a single CPU core. In my case, this equates to a utilization of 25% (one quarter of the 4-core chip). For systems with a higher core count, the single process will utilize an even smaller proportion of the available CPU.

Gathering statistics like this is not necessarily a problem. If stats gathering runs to completion most nights and there’s no urgent need to have fresh statistics by a certain time, then there’s no need to do anything more. Always keep things as simple as possible and only make changes if you need to. If your environment is large and/or volatile, the auto statistics job might regularly fail to run to completion in the batch window. In other words, the window might close before all tables considered stale have fresh statistics. If this is the case, then some tables might remain stale for a long time.
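To get a feel for how many tables are currently considered stale, a query along these lines can be used. It is a minimal sketch: staleness is derived from DML monitoring data that the database flushes periodically, so the result may lag slightly behind recent changes.

```sql
-- Tables currently flagged as having stale statistics, oldest first
select owner, table_name, last_analyzed
from   dba_tab_statistics
where  stale_stats = 'YES'
and    object_type = 'TABLE'
order  by last_analyzed nulls first;
```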

Fortunately, this situation is easy to see. If you view the statistics advisor report available in Oracle Database 12c Release 2, then it will tell you. The data dictionary stores this information too. In the example below, my batch window is 20 minutes long and the auto stats job has sometimes failed to complete (status STOPPED). The JOB_INFO column reveals the reason: auto statistics collection is occasionally taking longer than 20 minutes and terminates when the batch window closes.

[Figure: Task History]
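A query like the one below is one way to pull the task history out of the data dictionary. It is a sketch rather than the exact query behind the figure; the column selection is my own.

```sql
-- Recent runs of the automatic statistics gathering task
select window_name,
       job_status,
       job_start_time,
       job_duration,
       job_info
from   dba_autotask_job_history
where  client_name = 'auto optimizer stats collection'
order  by job_start_time desc;
```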

How can we fix this? We could (a) make the batch window longer and/or (b) speed up statistics gathering. I am going to consider option b (because option a is less interesting).

How do you speed up statistics gathering? If you have spare resources on your database server, then you could dedicate more of them to gathering statistics. You can reduce the elapsed time of gathering statistics at the cost of a more fully utilized database server. It is of course necessary to identify a window of time where there’s spare system resource, so this solution requires that the system is not running at 100% all of the time.

It is worth noting that other techniques are available to reduce the time required to maintain statistics (such as incremental statistics maintenance), but these are out of scope for the purposes of this blog post.

Option #2 – Gathering Statistics in Parallel – AUTO_DEGREE

Gathering statistics with auto sample size initiates full table scans to inspect table data. We can leverage parallel execution to make these scans complete in less time. To do this you can, for example, identify large tables and define a specific degree of parallelism (DOP):

exec dbms_stats.set_table_prefs(user, 'BIG_TABLE', 'DEGREE', 16)

There is an easier, set-and-forget approach where you let Oracle decide on the DOP for you:

exec dbms_stats.set_table_prefs(user, 'BIG_TABLE', 'DEGREE', DBMS_STATS.AUTO_DEGREE)

A clean and simple approach is to set the property at the global level:

exec dbms_stats.set_global_prefs('DEGREE', DBMS_STATS.AUTO_DEGREE)
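To confirm which DEGREE setting is in effect, you can read the preference back. This is a small sketch; BIG_TABLE is the example table used above.

```sql
-- Global preference
select dbms_stats.get_prefs('DEGREE') as global_degree from dual;

-- Effective preference for one table (a table-level setting overrides the global one)
select dbms_stats.get_prefs('DEGREE', user, 'BIG_TABLE') as big_table_degree from dual;
```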

With parallel execution in play, statistics gathering has the potential to consume lots of system resource, so you need to consider how to control this. When the auto stats gathering job executes it (by default) uses the resource management plan DEFAULT_MAINTENANCE_PLAN and a consumer group called ORA$AUTOTASK. This makes it very easy to make some adjustments and control just how much resource you want to dedicate to gathering statistics.

You first need to decide what priority to attribute to auto stats gathering depending on what other processes are likely to be running at the same time. In the following example, the auto stats job has a minimum of 5% CPU if other tasks running in the database are competing for CPU. However, if the system is not busy, we will allow the job to consume up to 80% of the CPU (this will protect processes that must run outside the control of the database). The maximum degree of parallelism an individual session can use is four in this case. It is useful to control the maximum DOP because you will want to make sure that you do not reach the maximum number of parallel server processes allowed for the system (this will become more relevant later in this post).

[Figure: RM Plan]
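The original listing is not reproduced here, but a minimal sketch of this kind of adjustment could look like the following. It changes only the ORA$AUTOTASK directive, using the values described above (5% minimum, 80% utilization limit, DOP limit of four). Treat the parameter choices as my assumption of how to express those values, not as the exact code behind the figure.

```sql
begin
  dbms_resource_manager.clear_pending_area();
  dbms_resource_manager.create_pending_area();
  dbms_resource_manager.update_plan_directive(
    plan                         => 'DEFAULT_MAINTENANCE_PLAN',
    group_or_subplan             => 'ORA$AUTOTASK',
    new_mgmt_p1                  => 5,    -- minimum CPU share when there is competition
    new_utilization_limit        => 80,   -- CPU cap (older releases: new_max_utilization_limit)
    new_parallel_degree_limit_p1 => 4);   -- maximum DOP per session
  dbms_resource_manager.validate_pending_area();
  dbms_resource_manager.submit_pending_area();
end;
/
```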

For completeness, the example above includes all plan directives for the DEFAULT_MAINTENANCE_PLAN, but it is only necessary to specify the plan directives you want to modify. By default, when a maintenance window opens, it will activate the DEFAULT_MAINTENANCE_PLAN. If you prefer, you can create your own resource management plan and associate it with any maintenance windows of your choosing. If you also set the resource_manager_plan initialization parameter, then you can use the same resource management plan whether the batch windows are open or closed. Here’s an example:

[Figure: Bespoke RM Plan]
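As a rough sketch of the bespoke approach, the code below creates a plan, attaches it to one maintenance window and makes it the default plan outside the windows as well. The plan name MY_STATS_PLAN, the directive values and the choice of MONDAY_WINDOW are all hypothetical; the original listing may differ.

```sql
begin
  dbms_resource_manager.clear_pending_area();
  dbms_resource_manager.create_pending_area();
  dbms_resource_manager.create_plan(
    plan    => 'MY_STATS_PLAN',               -- hypothetical plan name
    comment => 'Plan used inside and outside maintenance windows');
  dbms_resource_manager.create_plan_directive(
    plan                     => 'MY_STATS_PLAN',
    group_or_subplan         => 'ORA$AUTOTASK',
    comment                  => 'Automatic statistics gathering',
    mgmt_p1                  => 5,
    utilization_limit        => 80,
    parallel_degree_limit_p1 => 4);
  -- Every plan needs a directive for OTHER_GROUPS
  dbms_resource_manager.create_plan_directive(
    plan             => 'MY_STATS_PLAN',
    group_or_subplan => 'OTHER_GROUPS',
    comment          => 'Everything else',
    mgmt_p1          => 20);
  dbms_resource_manager.validate_pending_area();
  dbms_resource_manager.submit_pending_area();
end;
/

-- Associate the plan with a maintenance window (repeat for other windows as needed)
exec dbms_scheduler.set_attribute('MONDAY_WINDOW', 'RESOURCE_PLAN', 'MY_STATS_PLAN')

-- Use the same plan when no window is open
alter system set resource_manager_plan = 'MY_STATS_PLAN';
```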

When AUTO_DEGREE is used, the resource utilization can look very different. In this example, the tables are all identical so there’s a very regular pattern:

[Figure: CPU auto DOP]

We are now using much more CPU, and consequently the job completes in only 12 minutes and 26 seconds (where, previously, it failed to complete within the 20-minute window):

[Figure: Job status]

Remember that database resource management (DBRM) is in force during the batch window, so it is very easy to adjust CPU utilization even while the job is running. For example, consider what happens when I adjust the utilization limit down from 80% to 40% and then back again:

[Figure: RM CPU]

Let’s look at a more realistic AUTO_DEGREE scenario. In the following example we have a schema containing tables that have a wide variation in size. The CPU profile is now less consistent:

[Figure: Auto DOP]

The DOP changes in response to the size of each individual table. The job runs serially at first (about 25% CPU), then at DOP 2 for a while, then at DOP 3, and then back to serial. We could micro-manage DOP on a table-by-table basis, but it is better to avoid approaches that demand a lot of manual intervention. The global AUTO_DEGREE setting will be good enough in many cases.

Option #3 – Gathering Statistics Concurrently – CONCURRENT

Parallel statistics gathering has enabled us to increase CPU utilization significantly, but what if we have spare machine resources and want to go even faster? In the previous example, the CPU could be more fully utilized. If you want to achieve that, then how do you go about it?

Firstly, disable parallel execution (we will come back to that later):

exec dbms_stats.set_global_prefs('DEGREE', 1)

The CONCURRENT preference allows DBMS_SCHEDULER to initiate multiple statistics gathering jobs at once, so that the database will gather statistics on multiple tables and partitions concurrently. We can choose to enable this behavior for auto stats gathering only:

exec dbms_stats.set_global_prefs('CONCURRENT','AUTOMATIC')

The database will now gather statistics using multiple scheduler jobs. In my case, auto stats initiated 16 job processes and the CPU profile looked like this:

[Figure: CPU Conc]
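While the job is running, you can watch the scheduler jobs it has started with a query such as the one below. This is a sketch: it lists every running scheduler job, so on a busy system you may want to filter on the consumer group or job name.

```sql
-- Scheduler jobs currently running, with their CPU usage and consumer group
select job_name,
       session_id,
       elapsed_time,
       cpu_used,
       resource_consumer_group
from   dba_scheduler_running_jobs
order  by job_name;
```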

I am using max_utilization_limit set to 80%, and the job completes in 11 minutes and 39 seconds:

[Figure: Concurrent Results]

Concurrent statistics gathering works very well if tables are of a similar size, but without parallel execution, serial jobs running on very large tables can take a long time and they might not complete before the batch window closes. If this is a problem for you, you can use a combination of concurrent processing and parallel execution.

Option #4 – Gathering Statistics Concurrently and in Parallel – CONCURRENT and AUTO_DEGREE

Care is required when implementing concurrency and parallel execution because there’s scope to execute a very large number of concurrent parallel execution servers and generate a very high system load. Multiple jobs will start and each has the potential to initiate a number of parallel query servers. As a very general rule of thumb, you want to have no more than about 2*CPUCoreCount to 4*CPUCoreCount parallel servers executing at any one time.
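To put numbers on the rule of thumb, the sketch below derives the range from the cpu_count parameter and shows the two settings that bound it. Note the assumption: cpu_count usually reports CPU threads rather than physical cores, so adjust it if hyper-threading is enabled. On the 4-core test system used here, the range works out at 8 to 16 parallel servers.

```sql
-- Rule-of-thumb range for concurrently executing parallel servers
select to_number(value)     as cpu_count,
       to_number(value) * 2 as px_servers_low,
       to_number(value) * 4 as px_servers_high
from   v$parameter
where  name = 'cpu_count';

-- Settings that bound how many jobs and parallel servers can run
select name, value
from   v$parameter
where  name in ('job_queue_processes', 'parallel_max_servers');
```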

You can mitigate the risk of initiating too many parallel execution servers as follows:

[Figure: Concurrent and Auto DOP]
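The original listing is not reproduced here. As a hedged sketch, the two levers are the number of job queue processes and the DOP limit in the resource management plan. The values below are illustrative only and are not the settings used for the results in this post: 4 concurrent jobs at DOP 4 gives at most 16 parallel servers. Bear in mind that job_queue_processes affects all scheduler jobs, not only statistics gathering.

```sql
-- Illustrative values: 4 concurrent jobs x DOP 4 = at most 16 parallel servers
alter system set job_queue_processes = 4;

begin
  dbms_resource_manager.clear_pending_area();
  dbms_resource_manager.create_pending_area();
  dbms_resource_manager.update_plan_directive(
    plan                         => 'DEFAULT_MAINTENANCE_PLAN',
    group_or_subplan             => 'ORA$AUTOTASK',
    new_parallel_degree_limit_p1 => 4);   -- cap the DOP of each stats gathering job
  dbms_resource_manager.validate_pending_area();
  dbms_resource_manager.submit_pending_area();
end;
/
```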

There is currently no way to cap the number of job queue processes allocated to concurrent stats gathering, so ‘turning down’ the job_queue_processes setting is the only way to do this. I have created an enhancement request with respect to this limitation.

Enable concurrent stats gathering for the automatic statistics gathering job:

exec dbms_stats.set_global_prefs('CONCURRENT','AUTOMATIC')

Set AUTO_DEGREE globally:

exec dbms_stats.set_global_prefs('DEGREE', DBMS_STATS.AUTO_DEGREE)

Or, for individual large tables:

exec dbms_stats.set_table_prefs(user, 'BIG_TABLE', 'DEGREE', DBMS_STATS.AUTO_DEGREE)

On my system, auto stats initiated 16 job processes and the resource management plan I used limited DOP to four. This resulted in 64 parallel execution servers (16 jobs at DOP 4) executing concurrently, so a DOP limit of two might have been a better choice in my case. Here is the new CPU profile:

[Figure: CPU Concurrent and Auto DOP]

The job completed in 13 minutes 16 seconds:

[Figure: Concurrent + Auto DOP results]

In this case, why was there no benefit in run time from using CONCURRENT and AUTO_DEGREE together? It is because CONCURRENT without AUTO_DEGREE had already consumed CPU up to the 80% limit imposed by DBRM. In the more general case, where there is a good mix of table sizes, some very large tables and a faster IO subsystem, AUTO_DEGREE used in combination with CONCURRENT has the potential to yield the shortest stats gathering times.

If you are in a position to be able to decrease job_queue_processes to limit the number of jobs that execute concurrently, then you will be able to increase the DOP limit to a higher value:

[Figure: Higher DOP]

The same is true if there are only a small number of tables where parallelism is used.

Oracle Multitenant

At the time of writing there’s a bug with parallel statistics gathering in a multitenant database when used with CONCURRENT (unpublished bug# 27249531). Parallel execution servers initiated by gather stats are not constrained by max_utilization_limit. This can result in high CPU consumption. Using DEGREE above 1 or AUTO_DEGREE is OK if CONCURRENT is not used.

Until a fix is available for your platform and version, the best solution is to use DEGREE=>1 if you want to use CONCURRENT in multitenant environments.

Manual Statistics Gathering

If you want to initiate stats gathering manually, and still make full use of parallel and concurrent settings, then you can use the following approach:

[Figure: Manual Example]
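The original listing is not reproduced here, so the following is only a sketch of one way to do it. It assumes that concurrency should apply to manual gathering only (CONCURRENT set to MANUAL), that a resource management plan is active (concurrent gathering relies on Resource Manager), and that the user has the scheduler privileges needed to create jobs. DEFAULT_PLAN is used purely as an example; the limits that apply will depend on the consumer group your session maps to.

```sql
begin
  -- Enable concurrency for manually initiated gathering and let Oracle choose the DOP
  dbms_stats.set_global_prefs('CONCURRENT', 'MANUAL');
  dbms_stats.set_global_prefs('DEGREE', dbms_stats.auto_degree);
end;
/

-- Example only: make sure some resource management plan is active
alter system set resource_manager_plan = 'DEFAULT_PLAN';

-- Gather statistics for the current schema
exec dbms_stats.gather_schema_stats(user)
```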

Performance Comparisons

A small test system was used for the examples above, so it will be useful to see what an enterprise-class system looks like (let’s say 72 cores with HT). The Oracle Real World Performance Group ran some tests to check out the different techniques.

The relative performance of the stats gathering methods will be different on every system you try, so treat this as entertainment rather than science. For example, the test tables were all large and all the same size, so this will work in favor of AUTO_DEGREE (used without CONCURRENT) because a high degree of parallelism was used for every table.

A large number of CPU cores will make the default method look exceptionally underpowered. In this case the CPU is only 1.5% utilized; a single core in an enterprise-class system:

[Figure: Performance Test Results]

Here are the results plotted:

[Figure: Plots]

Summary

Remember that you will need spare machine capacity to gain benefit from the techniques outlined in this blog post.

Generally speaking, option #2 is most likely to give you a quick and easy win if there are a number of very large tables. Option #3 is great if you have plenty of spare machine resource and a large number of smaller tables.

Option #4 requires more care to avoid initiating too many parallel execution servers.

Options #3 and #4 are particularly useful if you need to get a one-off stats gathering task done very quickly: perhaps when you are commissioning a new deployment or gathering statistics after an upgrade.

Here’s a high-level summary:

[Figure: Summary]

I’ve uploaded self-contained test scripts to GitHub.

Comments welcome!

How does AUTO_SAMPLE_SIZE work in Oracle Database 12c?

The ESTIMATE_PERCENT parameter in DBMS_STATS.GATHER_*_STATS procedures controls the percentage of rows to sample when gathering optimizer statistics. What percentage of rows should you sample to achieve accurate statistics? 100% will ensure that statistics are accurate, but it could take a long time. A 1% sample will finish much more quickly but it could result in poor statistics. It’s not an easy question to answer, which is why it is best practice to use the default: AUTO_SAMPLE_SIZE.

In this post, I’ll cover how the AUTO_SAMPLE_SIZE algorithm works in Oracle Database 12c and how it affects the accuracy of the statistics being gathered. If you want to learn more of the history prior to Oracle Database 12c, then this post on Oracle Database 11g is a good place to look. I will indicate below where there are differences between Oracle Database 11g and Oracle Database 12c.

It’s not always appreciated that (in general) a large proportion of the time and resource cost required to gather statistics is associated with evaluating the number of distinct values (NDVs) for each column. Calculating NDV using an exact algorithm can be expensive because the database needs to record and sort column values while statistics are being gathered. If the NDV is high, retaining and sorting column values can become resource-intensive, especially if the sort spills to TEMP. Auto sample size instead uses an approximate (but accurate) algorithm to calculate NDV that avoids the need to sort column data or spill to TEMP. In return for this saving, the database can afford to use a full table scan to ensure that the other basic column statistics are accurate.

Similarly, it can be resource-intensive to generate histograms but the Oracle Database mitigates this cost as follows:

  • Frequency and top frequency histograms are created as the database gathers basic column statistics (such as NDV, MIN, MAX) from the full table scan mentioned above. This is new to Oracle Database 12c.
  • If a frequency or top frequency histogram is not feasible, then the database will collect hybrid histograms using a sample of the column data. Top frequency is only feasible when the top 254 values constitute more than 99% of the entire non-null column values and frequency is only feasible if NDV is 254 or less.
  • When the user has specified ‘SIZE AUTO’ in the METHOD_OPT clause for automatic histogram creation, the Oracle Database chooses which columns to consider for histogram creation based on column usage data that’s gathered by the optimizer. Columns that are not used in WHERE-clause predicates or joins are not considered for histograms.
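To see which histogram type the database chose for each column, and what column usage it has recorded, queries like these can be used. This is a sketch that assumes a table called T in the current schema.

```sql
-- Histogram type and bucket count per column
select column_name, num_distinct, num_buckets, histogram
from   user_tab_col_statistics
where  table_name = 'T'
order  by column_name;

-- Column usage recorded by the optimizer (returned as a CLOB)
set long 100000
select dbms_stats.report_col_usage(user, 'T') from dual;
```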

Both Oracle Database 11g and Oracle Database 12c use the following query to gather basic column statistics (it is simplified here for illustrative purposes).

SELECT COUNT(c1), MIN(c1), MAX(c1)
FROM  t;

The query reads the table (T) and scans all rows (rather than using a sample). The database also needs to calculate the number of distinct values (NDV) for each column, but the query does not use COUNT(DISTINCT c1) and so on. Instead, a special statistics gathering row source is injected into the query during execution. The statistics gathering row source uses a one-pass, hash-based distinct algorithm to gather NDV. The algorithm requires a full scan of the data, uses a bounded amount of memory and yields a highly accurate NDV that is nearly identical to a 100 percent sample (a fact that can be proven mathematically). The statistics gathering row source also gathers the number of rows, number of nulls and average column length. Since a full scan is used, the number of rows, average column length, minimum and maximum values are 100% accurate.
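If you want to check the accuracy of the approximate NDV on your own data, one simple approach is to compare the value stored by DBMS_STATS with an exact count. The sketch below assumes a table T with a column C1 in the current schema; the exact count can be expensive on a large table.

```sql
-- Gather statistics with the default sample size
exec dbms_stats.gather_table_stats(user, 'T', estimate_percent => dbms_stats.auto_sample_size)

-- Compare the stored (approximate) NDV with an exact count
select s.num_distinct                      as approx_ndv,
       (select count(distinct c1) from t) as exact_ndv
from   user_tab_col_statistics s
where  s.table_name  = 'T'
and    s.column_name = 'C1';
```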

Effect of auto sample size on histogram gathering

Hybrid histogram gathering is decoupled from basic column statistics gathering and uses a sample of column values. This technique was used in Oracle Database 11g to build height-balanced histograms. More information on this can be found in this blog post. Oracle Database 12c replaced height-balanced histograms with hybrid histograms.

Effect of auto sample size on index stats gathering

AUTO_SAMPLE_SIZE affects how index statistics are gathered. Index statistics gathering is sample-based and it can potentially go through several iterations if the sample contains too few blocks or the sample size is too small to properly gather the number of distinct keys (NDKs). The algorithm has not changed since Oracle Database 11g, so I’ve left it to the previous blog post to go into more detail. There is one other thing to note:

At the time of writing, there are some cases where index sampling can lead to NDV mis-estimates for composite indexes. The best work-around is to create a column group on the relevant columns and use gather_table_stats. Alternatively, there is a one-off fix – 27268249. This patch changes the way NDV is calculated for indexes on large tables (and no column group is required). It is available for 12.2.0.1 at the moment, but note that it cannot be backported. As you might guess, it’s significantly slower than index block sampling, but it’s still very fast. At the time of writing, if you find a case where index NDV is causing an issue with a query plan, then the recommended approach is to add a column group rather than attempting to apply this patch.
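A column group can be created with DBMS_STATS.CREATE_EXTENDED_STATS, as in the sketch below. The table T and the columns C1 and C2 are placeholders for the table and the columns of your composite index.

```sql
-- Create a column group on the indexed columns (returns the generated extension name)
select dbms_stats.create_extended_stats(user, 'T', '(C1, C2)') as column_group_name
from   dual;

-- Re-gather table statistics so that the column group gets its own NDV
exec dbms_stats.gather_table_stats(user, 'T')

-- Confirm that the column group exists
select extension_name, extension
from   user_stat_extensions
where  table_name = 'T';
```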

Summary:

Note that top frequency and hybrid histograms are new to Oracle Database 12c. Oracle Database 11g had frequency and height-balanced histograms only. Hybrid histograms replaced height-balanced histograms.

  1. The auto sample size algorithm uses a full table scan (a 100% sample) to gather basic column statistics.
  2. The cost of a full table scan (versus row sampling) is mitigated by the approximate NDV algorithm, which eliminates the need to sort column data.
  3. The approximate NDV gathered by AUTO_SAMPLE_SIZE is close to the accuracy of a 100% sample.
  4. Other basic column statistics, such as the number of nulls, average column length, minimum and maximum values have an accuracy equivalent to 100% sampling.
  5. Frequency and top frequency histograms are created using a 100%* sample of column values and are created when basic column statistics are gathered. This is different to Oracle Database 11g, which decoupled frequency histogram creation from basic column statistics gathering (and used a sample of column values).
  6. Hybrid histograms are created using a sample of column values. Internally, this step is decoupled from basic column statistics gathering.
  7. Index statistics are gathered using a sample of column values. The sample size is determined automatically.

*There is an exception to case 5, above. Frequency histograms are created using a sample if OPTIONS=>'GATHER AUTO' is used after a bulk load where statistics have been gathered using online statistics gathering.

