Oracle Optimizer

Oracle Database 19c and SQL Plan Management Diagnostics

A popular enhancement request I see is to provide an easier way to diagnose issues with SQL plan baselines; in particular the situation where a SQL plan baseline is ACCEPTED but is not being used for whatever reason. This is rare, but can happen if changes are made to the database such as dropping indexes or changing partitioning schemes. If a SQL plan baseline can’t be used, you will see something like this in Oracle Database 19c:

So why did it fail?

In this example I captured a plan that uses an index and then I made the index invisible. There’s no way the index plan can be used anymore. However, let’s pretend that we don’t know what happened.

There is now a really nice way to help diagnose issues with SQL plan baselines. It relies on hint usage reporting (new in Oracle Database 19c) and a hidden parameter that forces the SQL statement to use the outline in the SQL plan baseline even if the resulting plan doesn’t match the SQL plan baseline (a pretend match).

This is how you do it:

alter session set "_sql_plan_management_control"=4;
explain plan for select /* MYTESTSQL */ sum(num) from mytest1 where id = 10;
select * from table(DBMS_XPLAN.DISPLAY(FORMAT=>'typical'));
alter session set "_sql_plan_management_control"=0;

And then you will see something like this:

How beautiful is that? The hint report tells us that INDEX_RS_ASC is not used – a really strong clue.

There’s a worked example on GitHub if you want to try it yourself. There’s also a spool file if you want to look at the example but don’t have access to a version 19c database at the moment. I’m afraid that LiveSQL doesn’t allow you to use SQL plan management yet.

Hey, what? Forcing the plan baseline plan?

Some of you might jump on the idea that a parameter can be used to “force a SQL plan baseline to be used” (another popular request). This is not the case! As I said, it forces the outline in the SQL plan baseline to be used even if the plan is not the one we want. The parameter is not a magic bullet that will somehow force the optimizer to use the plan in the SQL plan baseline. You can see from this example that it is just not possible: the index is not available for use.

In other words – if the outline in the SQL plan baseline can be used successfully, then it will be used. Telling the optimizer to pretend-match the outline in the SQL plan baseline won’t somehow fix the issue. This is demonstrated above – the outline is applied to the SQL statement but the hints it uses cannot be honored. The FULL plan is still used. 

 

Fetch First Rows Just Got Faster

Many applications need to paginate rows fetched from the database, or at least retrieve the first N rows. In most cases the data needs to be returned in some kind of order too. If you are an old-school developer, then you are likely to use a variation on this theme:

select * from
( select *
from the_table
order by object_id )
where rownum <= 10;

It’s not pretty, but it is effective. In fact, it is very effective if the column (or columns) in the ORDER BY have a b-tree index. Oracle reads the index entries in order so that it can avoid having to sort the entire result set. This can speed things up very considerably.
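
For example, an index like this (the index name is illustrative) gives the optimizer the option to read rows in ORDER BY order and stop early, instead of sorting the whole table:

create index the_table_object_id_ix on the_table (object_id);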

If you are new-school, then this is what you probably use instead:

select *
from the_table
order by object_id
fetch first 10 rows only;

This is much prettier, but I’m afraid it has not always been as effective. The snag has been that the optimizer did not always cost this type of query correctly. This meant that the index access ‘trick’ was sometimes missed, potentially resulting in a large sort.

Now for the good news: a one-off patch (22174392) is available and it works with OFFSET too:
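
For example, the row-limiting clause with an offset looks like this (a sketch using the same illustrative table as above):

select *
from the_table
order by object_id
offset 20 rows fetch next 10 rows only;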

This change can have a dramatic effect on the performance for these types of queries, so it’s well worth a look if you have FETCH FIRST ROWS queries.

I’ve uploaded a demo with before and after examples to GitHub.

Comments welcome!

 

 

How to Gather Optimizer Statistics Fast!

There are a number of ways to speed up the process of gathering optimizer statistics, but I’m not sure that it’s common knowledge just how much of an effect some simple changes can make. If you have been asking yourself, “why is stats gathering taking so long and what can I do about it?”, then this post is for you.

If you are already familiar with the different methods of gathering optimizer statistics, you might want to jump to the end of this post where I compare them and make some recommendations.

Overview

The perception that it’s difficult to speed up statistics gathering has sometimes motivated DBAs to manipulate the number of rows sampled on a table-by-table basis using the ESTIMATE_PERCENT parameter (or DBMS_STATS preference). For example, large tables may have estimate percent set to 1% and small tables, 100%. Legacy scripts play their part too: some systems are still using procedures established before the performance enhancements available with auto sample size. One of the reasons we recommend ESTIMATE_PERCENT=>AUTO_SAMPLE_SIZE is that it includes number of distinct values (NDV) optimizations that yield high performance and accurate statistics. Gathering statistics using a 1% sample of rows might complete very quickly, but inaccurate statistics are the likely result, along with sub-optimal SQL execution plans.
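
If you have inherited table-level preferences that fix the sample size, a sketch of how you might move back to the default is shown below (the table name is illustrative):

-- Remove a fixed ESTIMATE_PERCENT table preference so the default applies
exec dbms_stats.delete_table_prefs(user, 'BIG_TABLE', 'ESTIMATE_PERCENT')

-- Or set the preference back to the default explicitly
exec dbms_stats.set_table_prefs(user, 'BIG_TABLE', 'ESTIMATE_PERCENT', 'DBMS_STATS.AUTO_SAMPLE_SIZE')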

Instead of manipulating ESTIMATE_PERCENT, the time taken to gather statistics can be reduced by using more machine resources. This post compares some before-and-after scenarios to demonstrate how you can do this. Fully worked examples are available in GitHub.

I will concentrate on using automatic optimizer statistics gathering, but the lessons are broadly applicable to manual statistics gathering too (there’s an example at the end of the post). The examples are intended for use on Oracle Database 12c and Oracle Database 18c. The same techniques are applicable to Oracle Database 11g, but note that the resource consumer groups have different names in that release.

Option #1 – Default Statistics Gathering

Consider the following trace of CPU consumption over time: 

CPU, simple gather

It shows my CPU utilization while the automatic statistics gathering job is running and there’s not much else happening on the system. Notice that about 75% of the CPU is not utilized. This fact is easy to understand once you know that the environment has a 4-core CPU with one thread per core. By default, statistics gathering uses a single process (with a single worker-thread) and this will utilize the processing power of a single CPU core. In my case, this equates to a utilization of 25% (one quarter of the 4-core chip). For systems with a higher core count, the single process will utilize an even smaller proportion of the available CPU.

Gathering statistics like this is not necessarily a problem. If stats gathering runs to completion most nights and there’s no urgent need to have fresh statistics by a certain time, then there’s no need to do anything more. Always keep things as simple as possible and only make changes if you need to. If your environment is large and/or volatile, the auto statistics job might regularly fail to run to completion in the batch window. In other words, the window might close before all tables considered stale have fresh statistics. If this is the case, then some tables might remain stale for a long time.

Fortunately, this situation is easy to see. If you view the statistics advisor report available in Oracle Database 12c Release 2, then it will tell you. The data dictionary stores this information too. In the example below, my batch window is 20 minutes long and the auto stats job has sometimes failed to complete (status STOPPED). The JOB_INFO column reveals the reason: auto statistics collection is occasionally taking longer than 20 minutes and terminates when the batch window closes.

Task History
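
If you don't have the statistics advisor report to hand, a query along these lines shows the recent history of the auto stats client (a sketch; adjust the filters and ordering to taste):

select job_name, job_status, job_start_time, job_duration, job_info
from   dba_autotask_job_history
where  client_name = 'auto optimizer stats collection'
order by job_start_time desc;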

How can we fix this? We could (a) make the batch window longer and/or (b) speed up statistics gathering. I am going to consider option b (because option a is less interesting).

How do you speed up statistics gathering? If you have spare resources on your database server, then you can dedicate more of them to gathering statistics. You can reduce the elapsed time of gathering statistics at the cost of a more fully utilized database server. It is of course necessary to identify a window of time where there's spare system resource, so this solution requires that the system is not running at 100% all of the time.

It is worth noting that other techniques are available to reduce the time required to maintain statistics (such as incremental statistics maintenance), but this is out of scope for the purposes of this blog post.

Option #2 – Gathering Statistics in Parallel – AUTO_DEGREE

Gathering statistics with auto sample size initiates full table scans to inspect table data. We can leverage parallel execution to make these scans complete in less time. To do this you can, for example, identify large tables and define a specific degree of parallelism (DOP):

exec dbms_stats.set_table_prefs(user, 'BIG_TABLE', 'DEGREE', 16)

There is an easier, set-and-forget approach where you can let Oracle decide on the DOP for you:

exec dbms_stats.set_table_prefs(user, 'BIG_TABLE', 'DEGREE', DBMS_STATS.AUTO_DEGREE)

A clean and simple approach is to set the property at the global level:

exec dbms_stats.set_global_prefs('DEGREE', DBMS_STATS.AUTO_DEGREE)

With parallel execution in play, statistics gathering has the potential to consume lots of system resource, so you need to consider how to control this. When the auto stats gathering job executes it (by default) uses the resource management plan DEFAULT_MAINTENANCE_PLAN and a consumer group called ORA$AUTOTASK. This makes it very easy to make some adjustments and control just how much resource you want to dedicate to gathering statistics.

You first need to decide what priority to attribute to auto stats gathering depending on what other processes are likely to be running at the same time. In the following example, the auto stats job has a minimum of 5% CPU if other tasks running in the database are competing for CPU. However, if the system is not busy, we will allow the job to consume up to 80% of the CPU (this will protect processes that must run outside the control of the database). The maximum degree of parallelism an individual session can use is four in this case. It is useful to control the maximum DOP because you will want to make sure that you do not reach the maximum number of parallel server processes allowed for the system (this will become more relevant later in this post).

RM Plan
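
As a rough sketch of the kind of adjustment described above (not the full example shown in the screenshot), you could update just the ORA$AUTOTASK directive in DEFAULT_MAINTENANCE_PLAN; the 5%, 80% and DOP 4 values are the illustrative ones used in this post:

begin
  dbms_resource_manager.create_pending_area();
  dbms_resource_manager.update_plan_directive(
    plan                         => 'DEFAULT_MAINTENANCE_PLAN',
    group_or_subplan             => 'ORA$AUTOTASK',
    new_mgmt_p1                  => 5,
    new_utilization_limit        => 80,
    new_parallel_degree_limit_p1 => 4);
  dbms_resource_manager.validate_pending_area();
  dbms_resource_manager.submit_pending_area();
end;
/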

For completeness, the example above includes all plan directives for the DEFAULT_MAINTENANCE_PLAN, but it is only necessary to specify the plan directives you want to modify. By default, when the maintenance window opens, it will activate the DEFAULT_MAINTENANCE_PLAN. If you prefer, you can create your own resource management plan and associate it with any maintenance windows of your choosing. If you also set the resource_manager_plan initialization parameter, then you can use the same resource management plan both when the batch window is open and when it is closed. Here’s an example:

Bespoke RM Plan

When AUTO_DEGREE is used, the resource utilization can look very different. In this example, the tables are all identical so there’s a very regular pattern:

CPU auto DOP

We are now using much more CPU, and consequently the job completes in only 12 minutes and 26 seconds (where, previously, it failed to complete within the 20-minute window):

Job status

Remember that database resource management (DBRM) is in force during the batch window, so it is very easy to adjust CPU utilization even while the job is running. For example – consider what happens when I adjust the utilization limit down from 80% to 40% and then back again:

RM CPU

Let’s look at a more realistic AUTO_DEGREE scenario. In the following example we have a schema containing tables that have a wide variation in size. The CPU profile is now less consistent:

Auto DOP

The DOP is changing in response to the size of each individual table. The job runs serially at first (about 25% CPU), then DOP 2 for a while, then DOP 3 and then back to serial. We could micro-manage DOP on a table-by-table basis, but it is much better to avoid that level of manual intervention. The global AUTO_DEGREE solution will be good enough in many cases, so there is usually no need for anything more granular.

Option #3 – Gathering Statistics Concurrently – CONCURRENT

Parallel statistics gathering has enabled us to increase CPU utilization significantly, but what if we have spare machine resources and want to go even faster? In the previous example, the CPU could be more fully utilized. If you want to achieve that, then how do you go about it?

Firstly, disable parallel execution (we will come back to that later):

exec dbms_stats.set_global_prefs('DEGREE', 1)

The CONCURRENT preference allows DBMS_SCHEDULER to initiate multiple statistics gathering jobs at once, so that the database will gather statistics on multiple tables and partitions concurrently. We can choose to enable this behavior for auto stats gathering only:

exec dbms_stats.set_global_prefs('CONCURRENT', 'AUTOMATIC')

The database will now gather statistics using multiple scheduler jobs. In my case, auto stats initiated 16 job processes and the CPU profile looked like this:

CPU Conc

I am using max_utilization_limit set to 80%, and the job completes in 11 minutes and 39 seconds:

Concurrent Results

Concurrent statistics gathering works very well if tables are of a similar size, but without parallel execution, serial jobs running on very large tables can take a long time and they might not complete before the batch window closes. If this is a problem for you, you can use a combination of concurrent processing and parallel execution.

Option #4 – Gathering Statistics Concurrently and in Parallel – CONCURRENT and AUTO_DEGREE

Care is required when implementing concurrency and parallel execution because there’s scope to execute a very large number of concurrent parallel execution servers and generate a very high system load. Multiple jobs will start and each has the potential to initiate a number of parallel query servers. As a very general rule of thumb, you want to have no more than about 2*CPUCoreCount to 4*CPUCoreCount parallel servers executing at any one time.

You can mitigate the risk of initiating too many parallel execution servers as follows:

Concurrent and Auto DOP

There is currently no way to cap the number of job queue processes allocated to concurrent stats gathering, so 'turning down' the job_queue_processes setting is the only way to do this. I have created an enhancement request with respect to this limitation.
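
For example, something along these lines (a sketch; choose a value appropriate for your core count and DOP limit, and bear in mind that job_queue_processes affects other scheduler jobs too):

alter system set job_queue_processes = 4;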

Enable concurrent stats gathering for the automatic statistics gathering job:

exec dbms_stats.set_global_prefs('CONCURRENT', 'AUTOMATIC')

Set AUTO_DEGREE globally:

exec dbms_stats.set_global_prefs('DEGREE', DBMS_STATS.AUTO_DEGREE)

Or, for individual large tables:

exec dbms_stats.set_table_prefs(user, 'BIG_TABLE', 'DEGREE', DBMS_STATS.AUTO_DEGREE)

On my system, the auto stats initiated 16 job processes and the resource management plan I used limited DOP to four. This resulted in 64 parallel execution servers executing concurrently, so a DOP limited to two might have been a better choice in my case. Here is the new CPU profile:

CPU Concurrent and Auto DOP

The job completed in 13 minutes 16 seconds:

Concurrent + Auto DOP results

In this case, why was there no benefit in run time from using CONCURRENT and AUTO_DEGREE? It is because CONCURRENT without AUTO_DEGREE already consumed CPU up to the 80% limit imposed by DBRM. In the more general case, where there is a good mix of table sizes, some very large tables and a faster I/O subsystem, AUTO_DEGREE used in combination with CONCURRENT has the potential to yield the shortest gather stats times.

If you are in a position to decrease job_queue_processes to limit the number of jobs that execute concurrently, then you will be able to increase the DOP limit to a higher value:

Higher DOP

The same is true if there are only a small number of tables where parallelism is used.

Oracle Multitenant

At the time of writing there’s a bug with parallel statistics gathering in a multitenant database when used with CONCURRENT (unpublished bug# 27249531). Parallel execution servers initiated by gather stats are not constrained by max_utilization_limit. This can result in high CPU consumption. Using DEGREE above 1 or AUTO_DEGREE is OK if CONCURRENT is not used.

Until a fix is available for your platform and version, the best solution is to use DEGREE=>1 if you want to use CONCURRENT in multitenant environments.

Manual Statistics Gathering

If you want to initiate stats gathering manually, and still make full use of parallel and concurrent settings, then you can use the following approach:

Manual Example
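
If you can't see the worked example, here is a sketch of the idea: enable concurrency for manually-initiated gathers, let AUTO_DEGREE choose the DOP, and then gather the schema (the schema name is illustrative):

exec dbms_stats.set_global_prefs('CONCURRENT', 'MANUAL')
exec dbms_stats.set_global_prefs('DEGREE', DBMS_STATS.AUTO_DEGREE)
exec dbms_stats.gather_schema_stats('BIGSCHEMA')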

Performance Comparisons

A small test system was used for the examples above, so it will be useful to see what an enterprise-class system looks like (let’s say 72 cores with HT). The Oracle Real World Performance Group ran some tests to check out the different techniques.

The relative performance of the stats gathering methods will be different on every system you try, so treat this as entertainment rather than science. For example, the test tables were all large and all the same size, so this will work in favor of AUTO_DEGREE (used without CONCURRENT) because a high degree of parallelism was used for every table.

A large number of CPU cores will make the default method look exceptionally underpowered. In this case the CPU is only 1.5% utilized; a single core in an enterprise-class system:

Performance Test Results

Here are the results plotted:

Plots

Summary

Remember that you will need spare machine capacity to gain benefit from the techniques outlined in this blog post.

Generally speaking, option #2 is most likely to give you a quick and easy win if there are a number of very large tables. Option #3 is great if you have plenty of spare machine resource and a large number of smaller tables.

Option #4 requires more care to avoid initiating too many parallel execution servers.

Options #3 and #4 are particularly useful if you need to get a one-off stats gathering task done very quickly: perhaps when you are commissioning a new deployment or gathering statistics after an upgrade.

Here’s a high-level summary:

Summary

I’ve uploaded self-contained test scripts to GitHub.

Comments welcome!

The Oracle Optimizer and ADWC – Hints

This is Part 3 of a series on the Oracle Optimizer in the Oracle Autonomous Data Warehouse Cloud. You can find part 1 here and part 2 here.

It’s time to take a look at optimizer hints. Here’s our test query:

select sum(t1.num), sum(t2.num)
from table1 t1
join table2 t2 on (t1.id = t2.id);

Executing on an ADW database (using the LOW consumer group) yields this plan:

----------------------------------------------------------------------------------------
| Id  | Operation            | Name                       | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                            |       |       |     4 (100)|
|   1 |  RESULT CACHE        | 86m6ud7jmfq443pumuj63z1bmd |       |       |            |
|   2 |   SORT AGGREGATE     |                            |     1 |    52 |            |
|*  3 |    HASH JOIN         |                            |     1 |    52 |     4   (0)|
|   4 |     TABLE ACCESS FULL| TABLE2                     |     1 |    26 |     2   (0)|
|   5 |     TABLE ACCESS FULL| TABLE1                     |  1000 | 26000 |     2   (0)|
----------------------------------------------------------------------------------------

There are of course no indexes on the table so this is the best plan (we get a single row from TABLE2 so it leads the HASH join).

I will now try to make the plan worse using a hint:   🙂

select /*+ LEADING(t1 t2) USE_NL(t2) */
sum(t1.num), sum(t2.num)
from table1 t1
join table2 t2 on (t1.id = t2.id);

This doesn’t work – the plan does not change. Take my word for it for now; there is a link to test scripts at the bottom of this post.

Autonomous Data Warehouse Cloud ignores optimizer hints and PARALLEL hints in SQL statements by default. If your application relies on them you can set OPTIMIZER_IGNORE_HINTS to FALSE at the session or system level using ALTER SESSION or ALTER SYSTEM. You can also enable PARALLEL hints in your SQL statements by setting OPTIMIZER_IGNORE_PARALLEL_HINTS to FALSE at the session or system level.

For this example, I used ALTER SESSION to give me the sub-optimal plan I wanted (TABLE1 is now the leading table and it’s a NESTED LOOPS join):

alter session set optimizer_ignore_hints = false;

select /*+ LEADING(t1 t2) USE_NL(t2) */
sum(t1.num), sum(t2.num)
from table1 t1
join table2 t2 on (t1.id = t2.id);

----------------------------------------------------------------------------------------
| Id  | Operation            | Name                       | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                            |       |       |    73 (100)|
|   1 |  RESULT CACHE        | db11srrdf8ar4d06x4b1j674pp |       |       |            |
|   2 |   SORT AGGREGATE     |                            |     1 |    52 |            |
|   3 |    NESTED LOOPS      |                            |     1 |    52 |    73   (3)|
|   4 |     TABLE ACCESS FULL| TABLE1                     |  1000 | 26000 |     2   (0)|
|*  5 |     TABLE ACCESS FULL| TABLE2                     |     1 |    26 |     0   (0)|
----------------------------------------------------------------------------------------

Why is ADWC set up like this? It’s pretty simple: the Oracle Optimizer’s job is to find good SQL execution plans without manual intervention. It is not the application developer’s or DBA’s job, so hints should be avoided as much as possible. Over time, they can prevent applications from taking advantage of new optimization techniques, so try and leave the heavy-lifting to the database. Think autonomous.

If you looked at  part 1 of this series, then you will know that we are careful with this restriction and allow INSERT /*+ APPEND */ by default.

To try this example for yourself, it’s uploaded to GitHub.

Comments and suggestions welcome!

The Oracle Optimizer and ADWC – Statistics-Based Query Transformation

This is Part 2 of a series on the Oracle Optimizer in the Oracle Autonomous Data Warehouse Cloud. You can find Part 1 here. In this post, I’m going to cover a clever optimization included in ADWC: Statistics-Based Query Transformation. I’m sure that you can guess what it is already, but let’s take it step-by-step.

Consider the following query:

SELECT max(sale_value)
FROM    huge_fact_table;

To find MAX, the Oracle Database would normally need to scan all of the rows in the huge fact table to find the maximum value (I’m assuming that there’s no index on SALE_VALUE). This scan can be very time-consuming, but there is a potential short-cut. When statistics are gathered on a table the database retains information on how many rows there are as well as details of column data such as minimum and maximum values and the number of distinct values (NDV). If the database is sure that statistics are up-to-date, then it is clearly in a position to answer some queries using the information it has already gathered.

Here is an example I executed on ADWC using the HIGH consumer group:

select max(num0),min(num1) from fact1;

----------------------------------------------------------------
| Id  | Operation                 | Name                       |
----------------------------------------------------------------
|   0 | SELECT STATEMENT          |                            |
|   1 |  RESULT CACHE             | cpuu29wy174jfbu3a7zcxzkw12 |
|   2 |   VIEW                    | VW_SQT_65BBF4BE            |
|   3 |    SORT AGGREGATE         |                            |
|   4 |     PX COORDINATOR        |                            |
|   5 |      PX SEND QC (RANDOM)  | :TQ10000                   |
|   6 |       SORT AGGREGATE      |                            |
|   7 |        PX BLOCK ITERATOR  |                            |
|*  8 |         TABLE ACCESS FULL | FACT1                      |
----------------------------------------------------------------

The first thing of note is that we use the server result cache, but I’m going to stick to the topic at hand and concentrate on operation ID 2: VW_SQT_65BBF4BE. This tells us that the statistics-based query transformation has kicked in and, potentially, the values of max(num0) and min(num1) can be retrieved from the statistics stored in the data dictionary. We are in a position to avoid the need to scan FACT1.

The word ‘potentially’ might sound rather vague, but bear in mind that the query transformation is established when the query is parsed, so when it is executed it is possible that the data in FACT1 has been updated in some way. This will prevent us from being able to use the dictionary statistics because the data might no longer match the statistics. In ADWC, the easiest way to see that statistics have actually been used is to observe the much-reduced query response time.

As you might expect, there are restrictions regarding when this transformation can and cannot be used. I’ve included some queries in the GitHub repository that use the transformation and some that don’t. Hopefully, this will be enough to get you started if you want to explore the boundaries. There’s also a spool file if you can’t run the example yourself.

Part 3 of this series is here.

As ever, comments welcome!

 

 

 

The Oracle Optimizer and ADWC – Statistics and Bulk Load

It’s time for some posts on the Oracle Autonomous Data Warehouse Cloud and the enhancements we’ve made to Statistics Management and the Oracle Optimizer. This is Part 1, and it covers statistics maintenance during bulk load. I’ll add links to the other posts when they become available. 

My scripts on the topic of autonomous are stored in GitHub here. The scripts for this post are here.

Statistics and Bulk Loading

Consider an empty fact table called FACT1. Let’s populate it with 10,000 rows using a bulk transform and load operation from FACT1_SOURCE:

SQL> insert /*+ APPEND */ into fact1 select num0,1,txt1 from fact1_source;
10,000 rows inserted.
Elapsed: 00:00:00.519

-- Take a look at stats...

select table_name, num_rows, sample_size, stale_stats
from   user_tab_statistics where table_name = 'FACT1';

TABLE_NAME                       NUM_ROWS SAMPLE_SIZE STALE
------------------------------ ---------- ----------- -----
FACT1                               10000       10000 NO

We can see that there are valid statistics on FACT1, and this will come as no surprise if you are familiar with online statistics gathering.
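
If you want to check where the statistics came from, the NOTES column in USER_TAB_STATISTICS is worth a look; when online statistics gathering has been used you would expect to see STATS_ON_LOAD there (a sketch):

select table_name, num_rows, notes
from   user_tab_statistics
where  table_name = 'FACT1';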

Now let’s bulk load another 10,000 rows…

SQL> insert /*+ APPEND */ into fact1 select num0,2,txt1 from fact1_source;
10,000 rows inserted.
Elapsed: 00:00:00.414

TABLE_NAME                       NUM_ROWS SAMPLE_SIZE STALE
------------------------------ ---------- ----------- -----
FACT1                               20000       20000 NO

This is new behavior: ADWC maintains statistics on each subsequent direct path load. The demo script continues to insert more rows (from FACT1 so that we rapidly increase the row count) until finally…

SQL> insert /*+ APPEND */ into fact1 select num0,16,txt1 from fact1;
81,920,000 rows inserted.
Elapsed: 00:00:49.970

TABLE_NAME                       NUM_ROWS SAMPLE_SIZE STALE
------------------------------ ---------- ----------- -----
FACT1                           163840000   163840000 NO

What about histograms?

SQL> select table_name,column_name,low_value,high_value,sample_size,histogram
  from user_tab_col_statistics
  where table_name = 'FACT1';

TABLE_NAME  COLUMN_NAME  LOW_VALUE  HIGH_VALUE      SAMPLE_SIZE  HISTOGRAM
----------  -----------  ---------  --------------  -----------  ---------
FACT1       NUM0         C102       C302              163840000  HYBRID
FACT1       NUM1         C102       C111              163840000  FREQUENCY
FACT1       TXT1         58585831   58585839393939    163840000  HYBRID

We have histograms too!

Summary

I connected to ADWC using the HIGH consumer group and within two or three minutes I populated FACT1 with over 160 million rows using a series of bulk load operations. Statistics (including histograms) were maintained automatically.

Not bad!

Part 2 of this series is here.

 

 

 

 

 

How to Generate a Useful SQL Execution Plan

Introduction

There is an old joke the unreconstructed comic Dave Allen used to tell, where a traveler asks a passer-by for directions to a particular town and the passer-by simply says, “Well I wouldn’t start from here if I were you.” When it comes to SQL execution plans, if you start from the wrong place, then you probably won’t make it to your destination.

The purpose of this blog post is to take stock for a moment and present what I consider to be the best ‘default’ methods for collecting SQL execution plans. This post is intended for those of you that don’t have an established method already and want to make sure that you capture something that is actually useful. To clarify what I mean by ‘useful’: I mean a plan that will help you to learn how SQL execution plans work (if you don’t know already) and one that is suitable for figuring out if there is a problem that makes the SQL statement take longer to execute than it should.

A SQL execution plan reveals a great deal about how the Oracle Database plans to execute (or has executed) a SQL statement. Do you need to understand SQL execution plans to be an effective Oracle Database expert? No – but most of us like to learn new things, and it’s fun to take a look inside the machine sometimes.

There’s a lot of opinion in the post, so remember that comments are very welcome.

Yet Another Article on SQL Execution Plans?

I know that there are a LOT of articles and chapters in books about generating SQL execution plans. There is no single ‘right way’, but I want to distill things down to a few cases that will be good-to-go in most scenarios. Why? Well, I get sent quite a large number of SQL execution plans, and I often find myself wishing for that vital piece of information that’s missing from the plan I’ve been sent. In addition, there seem to be some blind spots – useful methods that are often mentioned but often missed. Finally, when I wanted to learn to read plans myself, I found it confusing and frustrating until I realized that there’s a lot of incredibly helpful information provided by the Oracle Database, but you won’t see it if you don’t ask for it!

It is perhaps easy to believe that you are the only one to think that SQL execution plans are difficult to understand. Often they are difficult to understand – their sheer size can be daunting. Some are almost impossible to evaluate if certain details are missing. They can be confusing because some query transformations and operations will result in reported numbers (such as Rows) being at odds with what you might expect. This won’t prevent you from understanding how queries are executed, but when you start out, it can give you some tough hurdles to leap.

The examples below generate a lot of information that is useful but potentially overwhelming (and probably unnecessary at first). Nevertheless, the information is broken down into sections (or available through an Enterprise Manager UI), so it is easy to digest piecemeal or simply ignore until you want to consider it.

I have not listed the output of all the examples below because it would take up too much space, so I uploaded some self-contained scripts to GitHub.

Examples

Here are my suggestions …

Example A

If you can run the query stand-alone using (for example) SQL*Plus or SQLcl:

select e.ename,r.rname
from   employees   e
join   roles       r on (r.id = e.role_id)
join   departments d on (d.id = e.dept_id)
where  e.staffno <= 10
and    d.dname in ('Department Name 1','Department Name 2');

SELECT *
FROM table(DBMS_XPLAN.DISPLAY_CURSOR(FORMAT=>'ALL +OUTLINE'));

Or, if you don’t want to execute the query:

explain plan for
select e.ename,r.rname
from   employees   e
join   roles       r on (r.id = e.role_id)
join   departments d on (d.id = e.dept_id)
where  e.staffno <= 10
and    d.dname in ('Department Name 1','Department Name 2');

SELECT *
FROM table(DBMS_XPLAN.DISPLAY(FORMAT=>'ALL +OUTLINE'));

The important feature of this example is that I am using FORMAT=>'ALL +OUTLINE'. Some of you might have come across the undocumented option, FORMAT=>'ADVANCED'. I am not using it here because the content of its output has the potential to be different between releases, but there's no fundamental reason why you can't use it. The 'ALL' format is documented and 'OUTLINE' is mentioned briefly; its basic content is unlikely to change between releases.

Example B

If you cannot run a query stand-alone, you can still get plan information from the cursor cache using a query like this:

SELECT *
FROM table(DBMS_XPLAN.DISPLAY_CURSOR(
  SQL_ID=>'the_SQL_ID',
  CHILD_NUMBER=>the_child_number,
  FORMAT=>'ALL +OUTLINE'));

You will need the SQL_ID and CHILD_NUMBER of the query you want. There are many ways of doing this, but if you have DBA privilege then you can search for the statement in V$SQL:

select /* MY_TEST_QUERY */
       e.ename,r.rname
from   employees   e
join   roles       r on (r.id = e.role_id)
join   departments d on (d.id = e.dept_id)
where  e.staffno <= 10
and    d.dname in ('Department Name 1','Department Name 2');

select sql_id, child_number, sql_text
from   v$sql
where  sql_text like '%MY_TEST_QUERY%'
and    sql_text not like '%v$sql%';

The plans above do not include any runtime information, so you will not see how long each part of the plan took to execute or how many rows were actually processed. For example, ‘Rows’ is an estimate; it does not tell you how many rows were actually processed. If you gather and examine runtime information, it is likely that your level of understanding will be enhanced significantly. How do you go about getting it?

Example C

You can use a hint to gather runtime information:

select /*+ gather_plan_statistics */
       e.ename,r.rname
from   employees   e
join   roles       r on (r.id = e.role_id)
join   departments d on (d.id = e.dept_id)
where  e.staffno <= 10
and    d.dname in ('Department Name 1','Department Name 2');

SELECT *
FROM table(DBMS_XPLAN.DISPLAY_CURSOR(FORMAT=>'ALLSTATS LAST ALL +OUTLINE'));

This will show you statistics such as the actual number of rows processed (A-Rows), rather than just the estimates (E-Rows). It also includes a column called Starts, which tells you how many times each step was executed. A-Rows, E-Rows and Starts are all incredibly useful if you want to understand a plan.

Example D

If you don’t want to change the query text to add the hint, there is a parameter you can set instead:

alter session set statistics_level='ALL';

select e.ename,r.rname
from   employees   e
join   roles       r on (r.id = e.role_id)
join   departments d on (d.id = e.dept_id)
where  e.staffno <= 10
and    d.dname in ('Department Name 1','Department Name 2');

SELECT *
FROM table(DBMS_XPLAN.DISPLAY_CURSOR(FORMAT=>'ALLSTATS LAST ALL +OUTLINE'));

Example E

DBMS_XPLAN ‘ALLSTATS LAST’ does not give you a continuous view of runtime statistics while a query is executing, but SQL Monitor solves this problem. It requires the Oracle Tuning Pack, so always check the licence user guide for your database version. This tool is fantastic for generating plans and monitoring SQL, and it is available via Enterprise Manager in the Performance Hub. Before I cover that, you can use it on the command line too (a fact that is often missed or forgotten for some reason):

select /*+ MONITOR */
       e.ename,r.rname
from   employees   e
join   roles       r on (r.id = e.role_id)
join   departments d on (d.id = e.dept_id)
where  e.staffno <= 10
and    d.dname in ('Department Name 1','Department Name 2');

-- Get the SQL ID of the query we just executed
select prev_sql_id
from   v$session
where  sid = userenv('sid')
and    username is not null
and    prev_hash_value <> 0;

PREV_SQL_ID
————-
an05rsj1up1k5

set linesize 250 pagesize 0 trims on tab off long 1000000
column report format a220

select DBMS_SQL_MONITOR.REPORT_SQL_MONITOR
         (sql_id=>'an05rsj1up1k5', report_level=>'ALL') report
from   dual;

The SQL_ID parameter is optional, but I usually set it explicitly because there might be multiple long-running queries in the system, so the default report will sometimes pick up a different SQL statement to the one I am experimenting with. The database automatically makes long-running queries available to SQL Monitor, but I used a MONITOR hint in this case because the query is very fast and wouldn’t normally show up.

It can be useful to monitor a query while it is executing because you can watch its progress and learn from that. This is where SQL Monitor is really useful because you can watch a query in another session and see its statistics updating continuously. You don’t necessarily have to wait for it to complete to figure out what part of the query is taking a long time, so you can sometimes avoid having to wait for completion. Note that you can get ‘ALL +OUTLINE’ plan details while a query is executing – just use Example B, above.

You can even generate an active HTML report using the command line! This is a great way to capture a SQL execution plan and explore it interactively later on. Just run the report like this:

-- spool output to a file, then...
select DBMS_SQL_MONITOR.REPORT_SQL_MONITOR
         (sql_id       => 'an05rsj1up1k5',
          report_level => 'all',
          type         => 'ACTIVE') report
from   dual;

If you spool the output and open it in a browser, you get an interactive HTML page like this:

Bear in mind that the browser requires Internet access because the HTML report downloads some external assets.

Example F

I know that many of you love the command line (and I am the same) but you should check out using SQL Monitor in the Oracle Enterprise Manager Performance Hub. It’s much easier to access interactive SQL Monitor reports and they will refresh continuously as query execution progresses. In addition, it is easy to save these reports and send them to others. Just use the Save button (circled in red, below).

If you hit the ‘Plan’ tab, it can be enlightening to look at a graphical view if the plan is not too large. I like to select ‘Rotate’ to give me a tree that is oriented vertically. Aha – now I can see what the left side and right side of a join actually mean! Very broadly speaking, you read trees from the bottom left up. I might blog about this later. In the following example, and in common with the examples above, the database reads DEPARTMENTS first, then joins the rows with EMPLOYEES and then joins these rows with ROLES.

Example G

Finally, there is SQL Developer too!

With DBMS_XPLAN:

SQL Developer and DBMS_XPLAN

Summary

If you want to save and share a plan, then…


More

Check out the self-contained test scripts for this post.

If you want more detail and more options for looking at plans, then check out Maria’s blog posts on DBMS_XPLAN and SQL Monitor.

If you want to generate plans and send a runnable test case to someone else, then check out Test Case Builder and the Oracle Support tool SQLT.

Comments and @vldbb welcome!

How does AUTO_SAMPLE_SIZE work in Oracle Database 12c?

The ESTIMATE_PERCENT parameter in DBMS_STATS.GATHER_*_STATS procedures controls the percentage of rows to sample when gathering optimizer statistics. What percentage of rows should you sample to achieve accurate statistics? 100% will ensure that statistics are accurate, but it could take a long time. A 1% sample will finish much more quickly but it could result in poor statistics. It’s not an easy question to answer, which is why it is best practice to use the default: AUTO_SAMPLE_SIZE.

In this post, I’ll cover how the AUTO_SAMPLE_SIZE algorithm works in Oracle Database 12c and how it affects the accuracy of the statistics being gathered. If you want to learn more of the history prior to Oracle Database 12c, then this post on Oracle Database 11g is a good place to look. I will indicate below where there are differences between Oracle Database 11g and Oracle Database 12c.

It’s not always appreciated that (in general) a large proportion of the time and resource cost required to gather statistics is associated with evaluating the number of distinct values (NDVs) for each column. Calculating NDV using an exact algorithm can be expensive because the database needs to record and sort column values while statistics are being gathered. If the NDV is high, retaining and sorting column values can become resource-intensive, especially if the sort spills to TEMP. Auto sample size instead uses an approximate (but accurate) algorithm to calculate NDV that avoids the need to sort column data or spill to TEMP. In return for this saving, the database can afford to use a full table scan to ensure that the other basic column statistics are accurate.
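
AUTO_SAMPLE_SIZE is the default, so you don't normally need to specify ESTIMATE_PERCENT at all, but for clarity this is what an explicit call looks like (the table name is illustrative):

begin
  dbms_stats.gather_table_stats(
    ownname          => user,
    tabname          => 'BIG_TABLE',
    estimate_percent => dbms_stats.auto_sample_size);
end;
/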

Similarly, it can be resource-intensive to generate histograms but the Oracle Database mitigates this cost as follows:

  • Frequency and top frequency histograms are created as the database gathers basic column statistics (such as NDV, MIN, MAX) from the full table scan mentioned above. This is new to Oracle Database 12c.
  • If a frequency or top frequency histogram is not feasible, then the database will collect hybrid histograms using a sample of the column data. Top frequency is only feasible when the top 254 values constitute more than 99% of the entire non null column values and frequency is only feasible if NDV is 254 or less.
  • When the user has specified 'SIZE AUTO' in the METHOD_OPT clause for automatic histogram creation, the Oracle Database chooses which columns to consider for histogram creation based on column usage data that's gathered by the optimizer. Columns that are not used in WHERE-clause predicates or joins are not considered for histograms.

Both Oracle Database 11g and Oracle Database 12c use the following query to gather basic column statistics (it is simplified here for illustrative purposes).

SELECT COUNT(c1), MIN(c1), MAX(c1)
FROM  t;

The query reads the table (T) and scans all rows (rather than using a sample). The database also needs to calculate the number of distinct values (NDV) for each column, but the query does not use COUNT(DISTINCT c1) and so on; instead, during execution, a special statistics gathering row source is injected into the query. The statistics gathering row source uses a one-pass, hash-based distinct algorithm to gather NDV. The algorithm requires a full scan of the data, uses a bounded amount of memory and yields a highly accurate NDV that is nearly identical to that of 100 percent sampling (a fact that can be proven mathematically). The statistics gathering row source also gathers the number of rows, number of nulls and average column length. Since a full scan is used, the number of rows, average column length, minimum and maximum values are 100% accurate.

Effect of auto sample size on histogram gathering

Hybrid histogram gathering is decoupled from basic column statistics gathering and uses a sample of column values. This technique was used in Oracle Database 11g to build height-balanced histograms. More information on this can be found in this blog post. Oracle Database 12c replaced height-balanced histograms with hybrid histograms.

Effect of auto sample size on index stats gathering

AUTO_SAMPLE_SIZE affects how index statistics are gathered. Index statistics gathering is sample-based and it can potentially go through several iterations if the sample contains too few blocks or the sample size was too small to properly gather the number of distinct keys (NDKs). The algorithm has not changed since Oracle Database 11g, so I'll leave it to the previous blog post to go into more detail. There is one other thing to note:

At the time of writing, there are some cases where index sampling can lead to NDV mis-estimates for composite indexes. The best work-around is to create a column group on the relevant columns and use gather_table_stats. Alternatively, there is a one-off fix – 27268249. This patch changes the way NDV is calculated for indexes on large tables (and no column group is required). It is available for 12.2.0.1 at the moment, but note that it cannot be backported. As you might guess, it’s significantly slower than index block sampling, but it’s still very fast. At the time of writing, if you find a case where index NDV is causing an issue with a query plan, then the recommended approach is to add a column group rather than attempting to apply this patch.
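
Here is a rough sketch of the column group workaround (the table and column names are illustrative); the column group gives the optimizer an accurate combined NDV without relying on index sampling:

-- Create a column group matching the columns of the composite index
select dbms_stats.create_extended_stats(user, 'SALES', '(ORDER_ID, LINE_ID)') from dual;

-- Re-gather table statistics so the column group statistics are populated
exec dbms_stats.gather_table_stats(user, 'SALES')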

Summary:

Note that top frequency and hybrid histograms are new to Oracle Database 12c. Oracle Database 11g had frequency and height-balanced histograms only. Hybrid histograms replaced height-balanced histograms.

  1. The auto sample size algorithm uses a full table scan (a 100% sample) to gather basic column statistics.
  2. The cost of a full table scan (versus row sampling) is mitigated by the approximate NDV algorithm, which eliminates the need to sort column data.
  3. The approximate NDV gathered by AUTO_SAMPLE_SIZE is close to the accuracy of a 100% sample.
  4. Other basic column statistics, such as the number of nulls, average column length, minimum and maximum values, have an accuracy equivalent to 100% sampling.
  5. Frequency and top frequency histograms are created using a 100%* sample of column values and are created when basic column statistics are gathered. This is different to Oracle Database 11g, which decoupled frequency histogram creation from basic column statistics gathering (and used a sample of column values).
  6. Hybrid histograms are created using a sample of column values. Internally, this step is decoupled from basic column statistics gathering.
  7. Index statistics are gathered using a sample of column values. The sample size is determined automatically.

*There is an exception to case 5, above. Frequency histograms are created using a sample if OPTIONS=>’GATHER AUTO’ is used after a bulk load where statistics have been gathered using online statistics gathering.

Efficient Statistics Maintenance for Partitioned Tables Using Incremental Statistics – Part 2

Introduction

This post covers how you can manage optimizer statistics efficiently when you use partition exchange load (PEL). This technique is used when large volumes of data must be loaded and maximum performance is paramount.  It’s common to see it used in decision support systems and large operational data stores.

Make sure you’ve taken a look at Part 1, or you are at least familiar with the concept of incremental statistics so that you know what a synopsis is in the context of a partitioned table.


Partition Exchange Load

Most of you will be familiar with partition exchange load, but I’ll summarize it briefly to introduce you to the terminology I’ll be using here.

The graphic below represents the process. Firstly, the LOAD table is filled with new data and then exchanged with a partition in the “live” application table (SALES). SALES has partitions for quarter 1 and quarter 2 (Q1 and Q2) and LOAD is exchanged with the empty Q2 partition. The effect of the exchange is to incorporate all of the data in LOAD into SALES by swapping the “identity” of LOAD with Q2. The exchange is a logical operation: a change is made in the Oracle data dictionary and no data is moved. The data in LOAD is published to SALES “at the flick of a switch”.

Partition Exchange Load

Typically, the exchange step looks like this:

alter table SALES exchange partition Q2 with table LOAD
including indexes without validation;

Operationally, this approach is more complex than inserting data directly into SALES, but it offers some advantages. For example, new data can be inserted into LOAD before any indexes have been created on this table. If the volume of data is large, creating indexes at the end of the load is very efficient and avoids the need to bear the higher cost of index maintenance during the load. The performance benefit is especially impressive if data is loaded at very high rates in parallel.

The exact steps you need to execute for a partition exchange load will vary depending on the type of partitioning you use, whether there are local or global indexes on the table and what constraints are being used. For the purposes of this blog post I’m going to stick to how you manage statistics, but you can find details on how to deal with indexes and constraints in the Database VLDB and Partitioning Guide.

When new data is loaded into a table, optimizer statistics must be updated to take this new data into account. In the example above, the global-level statistics for SALES must be refreshed to reflect the data incorporated into the table when LOAD is exchanged with Q2. To make this step as efficient as possible SALES must use incremental statistics maintenance. I expect you’ll have guessed from the title of this post that I’m going to assume that from now on! I’m also going to assume that the statistics on SALES are up-to-date prior to the partition exchange load.
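
For reference, enabling incremental statistics on SALES is a one-off step along these lines (a sketch; the preference persists once it has been set):

exec dbms_stats.set_table_prefs(null, 'SALES', 'INCREMENTAL', 'TRUE')
exec dbms_stats.gather_table_stats(null, 'SALES')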

Oracle Database 11g

The moment after LOAD has been exchanged with Q2 there will be no synopsis on Q2; it won’t have been created yet. Incremental statistics requires synopses to update the global-level statistics for SALES efficiently so a synopsis for Q2 will be created automatically when statistics on SALES are gathered. For example:

EXEC dbms_stats.gather_table_stats(ownname=>null, tabname=>'SALES')

Statistics will be gathered on Q2 and the synopsis will be created so that the global-level statistics for SALES will be updated. Once the exchange has taken place, Q2 will need fresh statistics and a synopsis and it might also need extended statistics and histograms (if SALES has them). For example, if SALES has a column group, “(COL1, COL2)” then Q2 will need these statistics too. The database takes care of this automatically, so there’s no requirement to create histograms and extended column statistics on LOAD prior to the exchange because they are created for you when statistics are gathered on SALES.

There is nevertheless a scenario where you might want to gather statistics on LOAD prior to the exchange. For example, if it’s likely that Q2 will be queried before statistics have been gathered on SALES then you might want to be sure that statistics are available on Q2 as soon as the exchange completes. This is easy to do because any statistics gathered on LOAD will be associated with Q2 after the exchange. However, bear in mind that this will ultimately mean that statistics for the new data will be gathered twice: once before the exchange (on the LOAD table) and once again after the exchange (for the Q2 partition when SALES statistics are re-gathered). Oracle Database 12c gives you an alternative option, so I’ll cover that below. 

If you want to know more about extended statistics and column usage then check out this post. It covers how you can seed column usage to identify where there’s a benefit in using extended statistics. Note that some column usage information is always captured to help identify columns that can benefit from histograms. This happens even if you don’t choose to seed column usage.

Oracle Database 12c

Oracle Database 12c includes an enhancement that allows you to create a synopsis on LOAD prior to the exchange. This means that a synopsis will be ready to be used as soon as the exchange has taken place without requiring statistics to be gathered on Q2 post-exchange. The result of this is that the global-level statistics for SALES can be refreshed faster in Oracle Database 12c than they can be in Oracle Database 11g. This is how to prepare the LOAD table before the exchange:

begin
   dbms_stats.set_table_prefs (null, 'load', 'INCREMENTAL', 'TRUE');
   dbms_stats.set_table_prefs (null, 'load', 'INCREMENTAL_LEVEL', 'TABLE');
   dbms_stats.gather_table_stats (null, 'load');
end;
/

Q2 will have fresh statistics and a synopsis as soon as the exchange completes. This isn’t quite the end of the story though. Statistics on Q2 will be gathered again after the exchange (when statistics are gathered on SALES) unless you have created appropriate histograms and extended statistics on LOAD before the exchange. The list_s.sql script in GitHub displays extended statistics and histograms for a particular table if you want to take a look at what you have. If you are using METHOD_OPT to specify exactly what histograms to create on SALES, then you can use the same METHOD_OPT for gathering statistics on LOAD. For example:

Table preference...

begin
  dbms_stats.set_table_prefs(
    ownname    => null,
    tabname    => 'SALES',
    method_opt => 'for all columns size 1 for columns sales_area size 254');
end;
/

Then...

exec dbms_stats.set_table_prefs (null, 'load', 'INCREMENTAL', 'TRUE')
exec dbms_stats.set_table_prefs (null, 'load', 'INCREMENTAL_LEVEL', 'TABLE')

select dbms_stats.create_extended_stats(null, 'load', '(col1,col2)') from dual;

begin
  dbms_stats.gather_table_stats(
    ownname    => null,
    tabname    => 'LOAD',
    method_opt => 'for all columns size 1 for columns sales_area size 254');
end;
/

Alternatively, if you are using the default ‘FOR ALL COLUMNS SIZE AUTO’ to gather statistics on SALES, then it’s usually best to preserve automation and exchange without creating histograms on LOAD. This allows stats gathering on SALES to figure out what histograms are needed for Q2 post-exchange. Statistics on Q2 will be gathered post-exchange if SALES has column usage information indicating that there are columns in Q2 that don’t have a histogram but might benefit from having one. Also, as mentioned above, extended statistics will be maintained automatically too.

Summary of Steps

If you are using Oracle Database 12c then you can minimize the statistics gathering time for SALES (post-exchange) if you create a synopsis on LOAD along with appropriate histograms and extended statistics. For Oracle Database 11g, statistics will always be gathered on Q2 once the exchange has completed. Here are the steps (bearing in mind I’m sticking to statistics maintenance and not including steps to manage indexes and constraints etc):

  1. Create LOAD table and insert new data (or CREATE TABLE load AS SELECT…)
  2. Create a new (empty) partition for SALES (Q2)
  3. Populate LOAD with data
  4. Optionally (Oracle Database 12c) – follow these steps if you want Q2 to have valid statistics and a synopsis immediately after the exchange:
      • Set INCREMENTAL to 'TRUE' and INCREMENTAL_LEVEL to 'TABLE' for the LOAD table
      • Create extended statistics on LOAD to match SALES
      • Gather statistics on LOAD using METHOD_OPT parameter to match histograms with SALES
  5. Optionally (Oracle Database 11g) – follow these steps if you want Q2 to have valid statistics immediately after the exchange:
      • Create extended statistics on LOAD to match SALES
      • Gather statistics on LOAD using METHOD_OPT parameter to match histograms with SALES
  6. Exchange LOAD with Q2 (this will exchange basic column statistics, histograms and, in Oracle Database 12c, synopses)
  7. Gather statistics for SALES. Oracle Database 12c will complete this step more quickly if you implemented step 4, above.

If, in the past, you have used partition exchange load and gathered statistics in an ad-hoc manner, then you should probably check that the histograms you have match your expectations when comparing table-level histograms with histograms on partitions and sub-partitions. I’ve included a script in GitHub to help you do that.

Composite Partitioned Tables

If you are using a composite partitioned table, partition exchange load works in the same way as described above. If you would like to experiment with a complete example, I’ve created a script called example.sql here.
