Author: Nigel Bayliss

What’s New in the Oracle Optimizer for Oracle Database 19c?

For those of you who want to keep up to speed with the latest changes, this post includes a quick summary of the headline new features in Oracle Database 19c, with links to the documentation. In addition, most of the new optimizer white papers have been published. You will find links to them on the new Oracle Query Optimization page (and there will be further updates to that page soon).

Automatic Indexing

This is a complex and sophisticated piece of functionality but, ironically, it requires only a tiny API because it has been designed to be very easy to use and require virtually zero manual intervention. I’m sure that you can expect a lot to be written about this feature, but you can find material in the SQL Tuning Guide and the DBMS_AUTO_INDEX API.
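To give a flavor of how small the API is, here is a minimal sketch (the mode setting and the report call are the basics; check the DBMS_AUTO_INDEX reference for the full set of configuration options):

-- Switch automatic indexing on so that candidate indexes are created and used
exec dbms_auto_index.configure('AUTO_INDEX_MODE', 'IMPLEMENT')

-- Review what the feature has been doing recently
select dbms_auto_index.report_activity() from dual;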

Real-Time Statistics

Statistics can now be collected ‘on-the-fly’ during conventional DML operations. This feature requires no manual intervention, but you can find details in the SQL Tuning Guide’s Optimizer Statistics Concepts chapter.
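If you want to see where real-time statistics have been used, the statistics dictionary views are annotated. Here is a sketch (the table name is just an illustration, and the NOTES value is the one the documentation associates with statistics gathered during conventional DML):

select table_name, column_name, last_analyzed, notes
from   user_tab_col_statistics
where  table_name = 'MYTAB'
and    notes = 'STATS_ON_CONVENTIONAL_DML';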

High-Frequency Statistics Gathering

Real-time statistics are a complement to conventional gathered statistics, so there is still a need for both. To help keep gathered statistics up-to-date, we added a new task to do it more frequently. It is again documented in the SQL Tuning Guide.
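As a sketch, the high-frequency task is controlled through DBMS_STATS preferences along these lines (the run-time and interval values are illustrative and expressed in seconds):

exec dbms_stats.set_global_prefs('AUTO_TASK_STATUS','ON')
exec dbms_stats.set_global_prefs('AUTO_TASK_MAX_RUN_TIME','600')
exec dbms_stats.set_global_prefs('AUTO_TASK_INTERVAL','900')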

Automatic SQL Plan Management

There is a lot to say about this feature, so my intention is to post a blog in the next few weeks. We have included more automation in the SQL evolve advisor task, so see the new features guide and the section on managing SQL plan baselines.

I noticed that there are a couple of documentation bugs in the DBMS_SPM API documentation. I’m getting that fixed, but in the meantime note that the following SQL plan management task parameters have new default values in Oracle Database 19c: 

PARAMETER_NAME                   DEFAULT
-------------------------------- ----------
ALTERNATE_PLAN_BASELINE          AUTO
ALTERNATE_PLAN_LIMIT             UNLIMITED
ALTERNATE_PLAN_SOURCE            AUTO

If you currently use SQL plan management, especially if you use it to micro-manage individual SQL execution plans, then you need to be aware of the new default behavior when upgrading to Oracle Database 19c. You do have a choice: you can use the new defaults immediately or, if you prefer, you can set these parameters back to the pre-19c defaults very easily.
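If you do choose to revert, a sketch using DBMS_SPM.SET_EVOLVE_TASK_PARAMETER is shown below. The pre-19c values used here are from memory, so confirm them against a pre-upgrade system or the documentation before applying them:

begin
  -- Values shown are assumed pre-19c defaults; verify before use
  dbms_spm.set_evolve_task_parameter('SYS_AUTO_SPM_EVOLVE_TASK',
                                     'ALTERNATE_PLAN_BASELINE', 'EXISTING');
  dbms_spm.set_evolve_task_parameter('SYS_AUTO_SPM_EVOLVE_TASK',
                                     'ALTERNATE_PLAN_SOURCE', 'CURSOR_CACHE');
  dbms_spm.set_evolve_task_parameter('SYS_AUTO_SPM_EVOLVE_TASK',
                                     'ALTERNATE_PLAN_LIMIT', '10');
end;
/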

An additional piece of SQL plan management functionality was added late in the day, so that will make its way through soon. I will post an update here when it’s documented.

Hint Usage Reporting

Diagnosing SQL statement hint issues can be difficult. More information on how we have made it easier can be found in the documentation and in this blog post. Using this feature to diagnose issues with SQL plan baselines is presented in this blog post.

SQL Plan Comparison

We’ve made it much easier to compare a ‘reference’ SQL execution plan with multiple plans from a variety of different sources using a single API call. It’s a great tool for diagnosis and you can find details in the section on comparing execution plans in the documentation. The API is DBMS_XPLAN.COMPARE_PLANS.
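As a sketch, comparing two plans from the cursor cache looks roughly like this (the SQL IDs are placeholders; SQL plan baselines, SQL tuning sets, plan tables and SQL profiles can be used as plan sources too):

var report clob
begin
  -- SQL IDs below are placeholders for your own statements
  :report := dbms_xplan.compare_plans(
               reference_plan    => cursor_cache_object('0h2p3s6m7d8aq', null),
               compare_plan_list => plan_object_list(cursor_cache_object('3x5k9t2w1r4bn', null)));
end;
/
print report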

SQL Quarantine

This feature allows you to prevent runaway SQL statements from consuming all of your system resources. It is covered in detail in the Database Administrator’s Guide chapter on Diagnosing and Resolving Problems.
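As a sketch of the API, quarantining a known runaway statement looks something like this (the SQL ID is a placeholder and the CPU threshold is illustrative; see the DBMS_SQLQ documentation for the full list of thresholds):

declare
  qname varchar2(128);
begin
  -- SQL ID is a placeholder; threshold value is in seconds of CPU
  qname := dbms_sqlq.create_quarantine_by_sql_id(sql_id => '4kw8x2s1m9aqz');
  dbms_sqlq.alter_quarantine(quarantine_name => qname,
                             parameter_name  => 'CPU_TIME',
                             parameter_value => '5');
end;
/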

Automatic Diagnosis and Repair

The process of diagnosing and repairing problems has been enhanced to make it easier to treat issues with the DBMS_SQLDIAG API and the addition of the V$SQL_TESTCASES view.

There’s more in the Database Administrator’s Guide.

Licensing

Not all of these features will be available on all database platforms. Consult the Database Licensing Information User Manual for details.

 

Oracle Database 19c and SQL Plan Management Diagnostics

A popular enhancement request I see is to provide an easier way to diagnose issues with SQL plan baselines; in particular the situation where a SQL plan baseline is ACCEPTED but is not being used for whatever reason. This is rare, but can happen if changes are made to the database such as dropping indexes or changing partitioning schemes. If a SQL plan baseline can’t be used, you will see something like this in Oracle Database 19c:

So why did it fail?

In this example I captured a plan that uses an index and then I made the index invisible. There’s no way the index plan can be used anymore. However, let’s pretend that we don’t know what happened.

There is now a really nice way to help diagnose issues with SQL plan baselines. It relies on Oracle Database 19c (hint usage reporting) and a hidden parameter to force the SQL statement to use the outline in the SQL plan baseline even if the resulting plan doesn’t match the SQL plan baseline (a pretend match).

This is how you do it:

alter session set "_sql_plan_management_control"=4;
explain plan for select /* MYTESTSQL */ sum(num) from mytest1 where id = 10;
select * from table(DBMS_XPLAN.DISPLAY(FORMAT=>'typical'));
alter session set "_sql_plan_management_control"=0;

And then you will see something like this:

How beautiful is that? The hint report tells us that INDEX_RS_ASC is not used – a really strong clue.

There’s a worked example on GitHub if you want to try it yourself. There’s also a spool file if you want to look at the example but don’t have access to a version 19c database at the moment. I’m afraid that LiveSQL doesn’t allow you to use SQL plan management yet.

Hey, what? Forcing the plan baseline plan?

Some of you might jump on the idea that a parameter can be used to “force a SQL plan baseline to be used” (another popular request). This is not the case! As I said, it forces the outline in the SQL plan baseline to be used even if the plan is not the one we want. The parameter is not a magic bullet that will somehow force the optimizer to use the plan in the SQL plan baseline. You can see from this example that it is just not possible: the index is not available for use.

In other words – if the outline in the SQL plan baseline can be used successfully, then it will be used. Telling the optimizer to pretend-match the outline in the SQL plan baseline won’t somehow fix the issue. This is demonstrated above – the outline is applied to the SQL statement but the hints it uses cannot be honored. The FULL plan is still used. 

 

LiveSQL Now Live on Oracle Database 19c

Since LiveSQL is running on Oracle Database 19c, now is the time to be one of the first to take a look at the new Hint Usage Reporting feature.

Trying to figure out why a hint is not being used is not on my top-100-fun-things-to-do list. Luckily, there’s now a way to do it much more easily than before.

Here is a very simple example – but it’s just the tip of the iceberg…

create table mytab (a number(10) primary key);
insert into mytab values (1);
commit;
exec dbms_stats.gather_table_stats(null,'mytab');

explain plan for select /*+ FULLL(t) */ * from mytab t;

SELECT * FROM table(DBMS_XPLAN.DISPLAY(FORMAT=>'TYPICAL'));

SELECT * FROM table(DBMS_XPLAN.DISPLAY(FORMAT=>'BASIC +HINT_REPORT'));

Here’s some example output. Hint – take a look at the bottom of the report. 🙂

--------------------------------------------------------------
| Id  | Operation                     | Name                 |
--------------------------------------------------------------
|   0 | SELECT STATEMENT              |                      |
|   1 |  INDEX STORAGE FAST FULL SCAN | SYS_C0010006659      |
--------------------------------------------------------------

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (E - Syntax error (1))
---------------------------------------------------------------------------

   1 -  SEL$1
         E -  FULLL

Plan Stability in Oracle Database 18c Standard Edition

Oracle Database 18c Standard Edition includes a new way to control SQL execution plans.

Stored outlines were deprecated in Oracle Database 11g Release 1 and SQL plan management (SPM) was only available in Oracle Database Enterprise Edition. This meant that many DBAs continued to use stored outlines in Standard Edition and they continue to work to this day.

At some point in the future we will begin to remove support for stored outlines so, to avoid leaving a hole in SE, we have made a subset of SQL plan management available in Standard Edition from Oracle Database 18c onwards.

The license guide remains the single source of truth, so you should always refer to this for full details, but the idea is to give you functionality that’s similar to stored outlines using SQL plan baselines. In SE, you can now store a single SQL plan baseline per SQL statement and you have the option to migrate stored outlines directly to SQL plan baselines.
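For illustration, a migration sketch looks something like this (the category filter is an example; DBMS_SPM.MIGRATE_STORED_OUTLINE also accepts other attribute filters, such as the outline name):

var report clob
begin
  -- Migrate stored outlines in the DEFAULT category to SQL plan baselines
  :report := dbms_spm.migrate_stored_outline(attribute_name  => 'CATEGORY',
                                             attribute_value => 'DEFAULT');
end;
/
print report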

I’ve provided some example SPM scripts in GitHub for Enterprise Edition and Standard Edition. The SE approach is very similar to the EE approach (as outlined in this post).

SPM in SE

If you have any questions or feedback regarding these scripts, then let me know by leaving a comment below.

 

 

Our Optimizer Sessions at Oracle OpenWorld 2018

It’s nearly time for Oracle OpenWorld so I thought I’d share the details of a couple of Optimizer-related sessions. You will find full details in the session catalog, but here’s a summary of what we have for you.

The following session is a rundown of the features available in the latest Oracle Optimizer. There’s a lot to cover, so it’s going to be a flash-card introduction, but don’t worry, the intention is to give you an overview and you can always find out more from speaking to the developers in the demo booths. 

One of the major new features for the optimizer is Automatic Indexing – so I’m sure you will want to know about it in detail. In the following session, Maria Colgan and the Development Manager for the Oracle Optimizer will be explaining how it works in the Autonomous Database Cloud:

If you’re coming to OOW but can’t make the sessions, or you simply want to know more, then come and speak to the Oracle developers themselves at the demo booths in Moscone South.

See you there!

Fetch First Rows Just Got Faster

Many applications need to paginate rows fetched from the database, or at least retrieve the first N rows. In most cases the data needs to be returned in some kind of order too. If you are an old-school developer, then you are likely to use a variation on this theme:

select *
from   ( select *
         from   the_table
         order by object_id )
where  rownum <= 10;

It’s not pretty, but it is effective. In fact, it is very effective if the column (or columns) in the ORDER BY have a b-tree index. Oracle reads the index entries in order so that it can avoid having to sort the entire result set. This can speed things up very considerably.

If you are new-school, then this is what you probably use instead:

select *
from the_table
order by object_id
fetch first 10 rows only;

This is much prettier, but I’m afraid it has not always been as effective. The snag has been that the optimizer did not always cost this type of query correctly. This meant that the index access ‘trick’ was sometimes missed, potentially resulting in a large sort.

Now for the good news: a one-off patch (22174392) is available and it works with OFFSET too:

This change can have a dramatic effect on the performance for these types of queries, so it’s well worth a look if you have FETCH FIRST ROWS queries.

I’ve uploaded a demo with before and after examples to GitHub.

Comments welcome!

 

 

How to Gather Optimizer Statistics Fast!

There are a number of ways to speed up the process of gathering optimizer statistics, but I’m not sure that it’s common knowledge just how much of an effect some simple changes can make. If you have been asking yourself, “why is stats gathering taking so long and what can I do about it?”, then this post is for you.

If you are already familiar with the different methods of gathering optimizer statistics, you might want to jump to the end of this post where I compare them and make some recommendations.

Overview

The perception that it’s difficult to speed up statistics gathering has sometimes motivated DBAs to manipulate the number of rows sampled on a table-by-table basis using the ESTIMATE_PERCENT parameter (or DBMS_STATS preference). For example, large tables may have estimate percent set to 1% and small tables, 100%. Legacy scripts play their part too: some systems are still using procedures established before the performance enhancements available with auto sample size. One of the reasons we recommend ESTIMATE_PERCENT=>AUTO_SAMPLE_SIZE is that it includes number of distinct value (NDV) optimizations that yield high performance and accurate statistics. Gathering statistics using a 1% sample of rows might complete very quickly, but inaccurate statistics are the likely result, along with sub-optimal SQL execution plans.
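If you find legacy settings like that, a sketch of putting things back to the recommended default looks like this (remember that table-level preferences set with SET_TABLE_PREFS will still override the global value until they are removed):

select dbms_stats.get_prefs('ESTIMATE_PERCENT') from dual;
exec dbms_stats.set_global_prefs('ESTIMATE_PERCENT','DBMS_STATS.AUTO_SAMPLE_SIZE')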

Instead of manipulating ESTIMATE_PERCENT, the time taken to gather statistics can be reduced by using more machine resources. This post compares some before-and-after scenarios to demonstrate how you can do this. Fully worked examples are available in GitHub.

I will concentrate on using automatic optimizer statistics gathering, but the lessons are broadly applicable to manual statistics gathering too (there’s an example at the end of the post). The examples are intended for use on Oracle Database 12c and Oracle Database 18c. The same techniques are applicable to Oracle Database 11g, but note that the resource consumer groups have different names in that release.

Option #1 – Default Statistics Gathering

Consider the following trace of CPU consumption over time: 

CPU, simple gather

It shows my CPU utilization while the automatic statistics gathering job is running and there’s not much else happening on the system. Notice that about 75% of the CPU is not utilized. This fact is easy to understand once you know that the environment has a 4-core CPU with one thread per core. By default, statistics gathering uses a single process (with a single worker-thread) and this will utilize the processing power of a single CPU core. In my case, this equates to a utilization of 25% (one quarter of the 4-core chip). For systems with a higher core count, the single process will utilize an even smaller proportion of the available CPU.

Gathering statistics like this is not necessarily a problem. If stats gathering runs to completion most nights and there’s no urgent need to have fresh statistics by a certain time, then there’s no need to do anything more. Always keep things as simple as possible and only make changes if you need to. If your environment is large and/or volatile, the auto statistics job might regularly fail to run to completion in the batch window. In other words, the window might close before all tables considered stale have fresh statistics. If this is the case, then some tables might remain stale for a long time.

Fortunately, this situation is easy to see. If you view the statistics advisor report available in Oracle Database 12c Release 2, then it will tell you. The data dictionary stores this information too. In the example below, my batch window is 20 minutes long and the auto stats job has sometimes failed to complete (status STOPPED). The JOB_INFO column reveals the reason: auto statistics collection is occasionally taking longer than 20 minutes and terminates when the batch window closes.

Task History
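If the screenshot is not visible, a query along these lines shows the same information (I am using DBA_AUTOTASK_JOB_HISTORY here; check the column names in your release):

select job_start_time, job_status, job_info
from   dba_autotask_job_history
where  client_name = 'auto optimizer stats collection'
order  by job_start_time desc;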

How can we fix this? We could (a) make the batch window longer and/or (b) speed up statistics gathering. I am going to consider option b (because option a is less interesting).

How do you speed up statistics gathering? If you have spare resources on your database server, then you can dedicate more of them to gathering statistics. You can reduce the elapsed time of gathering statistics at the cost of a more fully utilized database server. It is of course necessary to identify a window of time where there’s spare system resource, so this solution requires that the system is not running at 100% all of the time.

It is worth noting that other techniques are available to reduce the time required to maintain statistics (such as incremental statistics maintenance), but this is out of scope for the purposes of this blog post.

Option #2 – Gathering Statistics in Parallel – AUTO_DEGREE

Gathering statistics with auto sample size initiates full table scans to inspect table data. We can leverage parallel execution to make these scans complete in less time. To do this you can, for example, identify large tables and define a specific degree of parallelism (DOP):

exec dbms_stats.set_table_prefs(user, 'BIG_TABLE', 'DEGREE', 16)

There is an easier, set-and-forget approach where you can let Oracle decide on the DOP for you:

exec dbms_stats.set_table_prefs(user, 'BIG_TABLE', 'DEGREE', DBMS_STATS.AUTO_DEGREE)

A clean and simple approach is to set the preference at the global level:

exec dbms_stats.set_global_prefs('DEGREE', DBMS_STATS.AUTO_DEGREE)

With parallel execution in play, statistics gathering has the potential to consume lots of system resource, so you need to consider how to control this. When the auto stats gathering job executes it (by default) uses the resource management plan DEFAULT_MAINTENANCE_PLAN and a consumer group called ORA$AUTOTASK. This makes it very easy to make some adjustments and control just how much resource you want to dedicate to gathering statistics.

You first need to decide what priority to attribute to auto stats gathering depending on what other processes are likely to be running at the same time. In the following example, the auto stats job has a minimum of 5% CPU if other tasks running in the database are competing for CPU. However, if the system is not busy, we will allow the job to consume up to 80% of the CPU (this will protect processes that must run outside the control of the database). The maximum degree of parallelism an individual session can use is four in this case. It is useful to control the maximum DOP because you will want to make sure that you do not reach the maximum number of parallel server processes allowed for the system (this will become more relevant later in this post).

RM Plan

For completeness, the example above includes all plan directives for the DEFAULT_MAINTENANCE_PLAN, but it is only necessary to specify the plan directives you want to modify. By default, when the maintenance window opens, it will activate the DEFAULT_MAINTENANCE_PLAN. If you prefer, you can create your own resource management plan and associate it with any maintenance windows of your choosing. If you also set the resource_manager_plan initialization parameter, then you can use the same resource management plan whether the batch window is open or closed. Here’s an example:

Bespoke RM Plan
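In case the screenshots are not visible, here is a minimal sketch of adjusting the ORA$AUTOTASK directive in the default maintenance plan to match the settings described above (5% guaranteed CPU, an 80% utilization limit and a per-session DOP limit of four); treat it as an outline rather than a drop-in script:

begin
  dbms_resource_manager.create_pending_area();
  -- Adjust only the ORA$AUTOTASK directive of the default maintenance plan
  dbms_resource_manager.update_plan_directive(
    plan                         => 'DEFAULT_MAINTENANCE_PLAN',
    group_or_subplan             => 'ORA$AUTOTASK',
    new_mgmt_p1                  => 5,
    new_utilization_limit        => 80,
    new_parallel_degree_limit_p1 => 4);
  dbms_resource_manager.submit_pending_area();
end;
/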

When AUTO_DEGREE is used the resource utilization can look very different. In this example, the tables are all identical so there’s a very regular pattern:

CPU auto DOP

We are now using much more CPU, and consequently the job completes in only 12 minutes and 26 seconds (where, previously, it failed to complete within the 20-minute window):

Job status

Remember that database resource management (DBRM) is in force during the batch window, so it is very easy to adjust CPU utilization even while the job is running. For example – consider what happens when I adjust the utilization limit down from 80% to 40% and then back again:

RM CPU

Let’s look at a more realistic AUTO_DEGREE scenario. In the following example we have a schema containing tables that have a wide variation in size. The CPU profile is now less consistent:

Auto DOP

The DOP is changing in response to the size of each individual table. The job runs serially at first (about 25% CPU), then DOP 2 for a while, then DOP 3 and then back to serial. We could micro-manage DOP on a table-by-table basis, but it is much better to avoid approaches like this because we should always aim to avoid too much manual intervention. The global AUTO_DEGREE solution will be good enough in many cases, so there will be no need for any further manual intervention.  

Option #3 – Gathering Statistics Concurrently – CONCURRENT

Parallel statistics gathering has enabled us to increase CPU utilization significantly, but what if we have spare machine resources and want to go even faster? In the previous example, the CPU could be more fully utilized. If you want to achieve that, then how do you go about it?

Firstly, disable parallel execution (we will come back to that later):

exec dbms_stats.set_global_prefs('DEGREE', 1)

The CONCURRENT preference allows DBMS_SCHEDULER to initiate multiple statistics gathering jobs at once, so that the database will gather statistics on multiple tables and partitions concurrently. We can choose to enable this behavior for auto stats gathering only:

exec dbms_stats.set_global_prefs('CONCURRENT','AUTOMATIC')

The database will now gather statistics using multiple scheduler jobs. In my case, auto stats initiated 16 job processes and the CPU profile looked like this:

CPU Conc

I am using max_utilization_limit set to 80%, and the job completes in 11 minutes and 39 seconds:

Concurrent Results

Concurrent statistics gathering works very well if tables are of a similar size, but without parallel execution, serial jobs running on very large tables can take a long time and they might not complete before the batch window closes. If this is a problem for you, you can use a combination of concurrent processing and parallel execution.

Option #4 – Gathering Statistics Concurrently and in Parallel – CONCURRENT and AUTO_DEGREE

Care is required when implementing concurrency and parallel execution because there’s scope to execute a very large number of concurrent parallel execution servers and generate a very high system load. Multiple jobs will start and each has the potential to initiate a number of parallel query servers. As a very general rule of thumb, you want to have no more than about 2*CPUCoreCount to 4*CPUCoreCount parallel servers executing at any one time.

You can mitigate the risk of initiating too many parallel execution servers as follows:

Concurrent and Auto DOP

There is currently no way to cap the number of job queue processes allocated to concurrent stats gathering, so ‘turning down’ the job_queue_processes setting is the only way to do this. I have created an enhancement request with respect to this limitation.
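For example, to cap the number of concurrent statistics jobs at four (the value is purely illustrative):

alter system set job_queue_processes = 4;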

Enable concurrent stats gathering for the automatic statistics gathering job:

exec dbms_stats.set_global_prefs('CONCURRENT','AUTOMATIC')

Set AUTO_DEGREE globally:

exec dbms_stats.set_global_prefs('DEGREE', DBMS_STATS.AUTO_DEGREE)

Or, for individual large tables:

exec dbms_stats.set_table_prefs(user, 'BIG_TABLE', 'DEGREE', DBMS_STATS.AUTO_DEGREE)

On my system, the auto stats initiated 16 job processes and the resource management plan I used limited DOP to four. This resulted in 64 parallel execution servers executing concurrently, so a DOP limited to two might have been a better choice in my case. Here is the new CPU profile:

CPU Concurrent and Auto DOP

The job completed in 13 minutes 16 seconds:

Concurrent + Auto DOP results

In this case, why was there no benefit in run time using CONCURRENT and AUTO_DEGREE? It is because CONCURRENT without AUTO_DEGREE consumed CPU up to the 80% limit imposed by DBRM. In the more general case where there is a good mix of table sizes, some very large tables and a faster IO subsystem, then AUTO_DEGREE used in combination with CONCURRENT has the potential to yield the shortest gather stats times.

If you are in a position to be able to decrease job_queue_processes to limit the number of jobs that execute concurrently, then you will be able to increase the DOP limit to a higher value:

Higher DOP

The same is true if there are only a small number of tables where parallelism is used.

Oracle Multitenant

At the time of writing there’s a bug with parallel statistics gathering in a multitenant database when used with CONCURRENT (unpublished bug# 27249531). Parallel execution servers initiated by gather stats are not constrained by max_utilization_limit. This can result in high CPU consumption. Using DEGREE above 1 or AUTO_DEGREE is OK if CONCURRENT is not used.

Until a fix is available for your platform and version, the best solution is to use DEGREE=>1 if you want to use CONCURRENT in multitenant environments.

Manual Statistics Gathering

If you want to initiate stats gathering manually, and still make full use of parallel and concurrent settings, then you can use the following approach:

Manual Example
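In case the screenshot is not visible, the approach is roughly this (the schema name is a placeholder; CONCURRENT is set to 'ALL' so that manually initiated gathers can use concurrency too):

exec dbms_stats.set_global_prefs('CONCURRENT','ALL')
exec dbms_stats.set_global_prefs('DEGREE', DBMS_STATS.AUTO_DEGREE)
exec dbms_stats.gather_schema_stats('MYSCHEMA')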

Performance Comparisons

A small test system was used for the examples above, so it will be useful to see what an enterprise-class system looks like (let’s say 72 cores with HT). The Oracle Real World Performance Group ran some tests to check out the different techniques.

The relative performance of the stats gathering methods will be different on every system you try, so treat this as entertainment rather than science. For example, the test tables were all large and all the same size, so this will work in favor of AUTO_DEGREE (used without CONCURRENT) because a high degree of parallelism was used for every table.

A large number of CPU cores will make the default method look exceptionally underpowered. In this case the CPU is only 1.5% utilized; a single core in an enterprise-class system:

Performance Test Results

Here are the results plotted:

Plots

Summary

Remember that you will need spare machine capacity to gain benefit from the techniques outlined in this blog post.

Generally speaking, option #2 is most likely to give you a quick and easy win if there are a number of very large tables. Option #3 is great if you have plenty of spare machine resource and a large number of smaller tables.

Option #4 requires more care to avoid initiating too many parallel execution servers.

Options #3 and #4 are particularly useful if you need to get a one-off stats gathering task done very quickly: perhaps when you are commissioning a new deployment or gathering statistics after an upgrade.

Here’s a high-level summary:

Summary

I’ve uploaded self-contained test scripts to GitHub.

Comments welcome!

Fine-Grained Cursor Invalidation

This feature made its debut in Oracle Database 12c Release 2 and enhancements in Oracle Database 18c allow it to be used in a broader range of scenarios. SQLMaria mentioned it in a blog post so I realized that I am overdue in producing a blog post for this cool feature.

What is Fine-Grained Cursor Invalidation?

You might be familiar with the concept of rolling or deferred cursor invalidation because DBMS_STATS uses it if you gather statistics with NO_INVALIDATE=>AUTO_INVALIDATE (this is the default). Fine-grained cursor invalidation is similar except that it is relevant when DDL commands are used to change schema objects; adding indexes or changing partitions and so on.

Consider a SALES table that has a number of SQL statements in the shared pool that access it. These statements are referred to as dependent cursors. If we make any changes to SALES (such as adding an index or refreshing statistics) then we should invalidate and re-parse dependent cursors to take these changes into account.

If SALES is a ‘hot’ application table and it has a large number of dependent cursors then our DDL might cause a sudden surge in hard parses. An increase in hard parse rate on a busy system can have a detrimental effect on database performance, but if we spread re-parse activity over a period of time then performance is much more likely to remain steady.

Prior to Oracle Database 12c Release 2, DDL invalidated dependent cursors immediately, but with fine-grained cursor invalidation the database can often defer invalidation and spread the cost of re-parsing dependent cursors.

What is Rolling Invalidation?

A cursor marked for rolling invalidation will become eligible for re-parse after a random amount of time (up to a maximum). A newly generated random time is assigned to each dependent cursor. Randomization therefore spreads the overhead of re-parsing dependent cursors over time.

How does Fine-Grained Cursor Invalidation Work?

During cursor compilation the database annotates cursors with fine-grained dependency information (from Oracle Database 12c Release 2 onwards). For example, “this cursor reads index SALES_IDX” or “this cursor does static partition pruning and only accesses partitions in the range [P1, P10]” or “this cursor does a partition related optimization that depends on the partition scheme of the tables involved, e.g. partition-wise join”.

Commonly-used DDLs define a set of rules that consult each cursor’s fine-grained dependencies to decide how the cursor should be processed.

There are four actions that we can take for a cursor:

  1. Cursor remains valid — this is used when we know that the cursor is safe to execute and that the cursor’s plan is still optimal. For example, modifying a table partition to read only.
  2. Cursor is marked for rolling invalidation — this is used when we know that the cursor is safe to execute but its plan may be sub-optimal. For example, if we create a visible index, then the cursor is safe to execute but we want to recompile it sometime in the future so the new index can be considered for the plan.
  3. Cursor is marked for rolling invalidation with metadata refresh — this is used when the cursor is safe to execute after refreshing some of its metadata at runtime. For example, if we rebuild an index that is used by the cursor, then the cursor can be executed if we refresh the metadata for the index so the cursor uses the new index segment.
  4. Cursor is invalidated — this is used if the cursor is no longer safe to execute. For example, if we drop an index that is used by the cursor, we must invalidate the cursor so it is recompiled to get a new plan that does not use the index.

Note that actions 1, 2, and 3 correspond to DDL_NO_INVALIDATE, IS_ROLLING_INVALID, and IS_ROLLING_REFRESH_INVALID columns in V$SQL. Action 4 is simply the existing invalidation action that was used prior to the fine-grained invalidation feature.

Action 1 is enabled by default; i.e. we will use action 1 when applicable without any syntax or parameter required.

Actions 2 and 3 are only used if DEFERRED INVALIDATION syntax is used, or if CURSOR_INVALIDATION parameter is set to DEFERRED (see below).
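For example (the table and index names are illustrative):

alter session set cursor_invalidation = DEFERRED;

create index sales_idx on sales (sale_date) deferred invalidation;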

Summary

Fine-grained cursor invalidation reduces the risk of performance degradation associated with DDL operations on objects in your application schema.

More

As mentioned above, the feature is controlled using DDL statements that have an additional DEFERRED INVALIDATION clause (for example, ALTER TABLE). There is also a CURSOR_INVALIDATION initialization parameter.

More detail is available in the SQL Tuning Guide. Note that we have not documented the multitude of factors that affect where this feature will or will not kick in. This is where the differences between Oracle Database 12c and Oracle Database 18c are, so the changes between these two releases will not be obvious from the documentation. Nevertheless, if you want to experiment, I have linked to some scripts below that will give you a good place to start. You will see that DML cursors are treated differently to queries because queries are generally less sensitive to change than DML. A SQL statement’s eligibility for deferred invalidation is also dependent on access method (e.g. whether its SQL execution plan uses a modified index) and also whether partitions are statically pruned or accessed via KEY.

Big thanks to George Eadon for putting together the ‘how does it work?’ section.

If you want to try this feature out, I’ve uploaded examples to GitHub. The tests are in a very raw state, but I explain that in the README file.

The Oracle Optimizer and ADWC – Hints

This is Part 3 of a series on the Oracle Optimizer in the Oracle Autonomous Data Warehouse Cloud. You can find part 1 here and part 2 here.

It’s time to take a look at optimizer hints. Here’s our test query:

select sum(t1.num), sum(t2.num)
from table1 t1
join table2 t2 on (t1.id = t2.id);

Executing on an ADW database (using the LOW consumer group) yields this plan:

----------------------------------------------------------------------------------------
| Id  | Operation            | Name                       | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                            |       |       |     4 (100)|
|   1 |  RESULT CACHE        | 86m6ud7jmfq443pumuj63z1bmd |       |       |            |
|   2 |   SORT AGGREGATE     |                            |     1 |    52 |            |
|*  3 |    HASH JOIN         |                            |     1 |    52 |     4   (0)|
|   4 |     TABLE ACCESS FULL| TABLE2                     |     1 |    26 |     2   (0)|
|   5 |     TABLE ACCESS FULL| TABLE1                     |  1000 | 26000 |     2   (0)|
----------------------------------------------------------------------------------------

There are of course no indexes on the tables, so this is the best plan (we get a single row from TABLE2, so it leads the HASH join).

I will now try to make the plan worse using a hint:   🙂

select /*+ LEADING(t1 t2) USE_NL(t2) */
sum(t1.num), sum(t2.num)
from table1 t1
join table2 t2 on (t1.id = t2.id);

This doesn’t work – the plan does not change. Take my word for it for now; there is a link to test scripts at the bottom of this post.

Autonomous Data Warehouse Cloud ignores optimizer hints and PARALLEL hints in SQL statements by default. If your application relies on them you can set OPTIMIZER_IGNORE_HINTS to FALSE at the session or system level using ALTER SESSION or ALTER SYSTEM. You can also enable PARALLEL hints in your SQL statements by setting OPTIMIZER_IGNORE_PARALLEL_HINTS to FALSE at the session or system level.

For this example, I used ALTER SESSION to give me the sub-optimal plan I wanted (TABLE1 is now the leading table and it’s a NESTED LOOPS join):

alter session set optimizer_ignore_hints = false;

select /*+ LEADING(t1 t2) USE_NL(t2) */
sum(t1.num), sum(t2.num)
from table1 t1
join table2 t2 on (t1.id = t2.id);

----------------------------------------------------------------------------------------
| Id  | Operation            | Name                       | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                            |       |       |    73 (100)|
|   1 |  RESULT CACHE        | db11srrdf8ar4d06x4b1j674pp |       |       |            |
|   2 |   SORT AGGREGATE     |                            |     1 |    52 |            |
|   3 |    NESTED LOOPS      |                            |     1 |    52 |    73   (3)|
|   4 |     TABLE ACCESS FULL| TABLE1                     |  1000 | 26000 |     2   (0)|
|*  5 |     TABLE ACCESS FULL| TABLE2                     |     1 |    26 |     0   (0)|
----------------------------------------------------------------------------------------

Why is ADWC set up like this? It’s pretty simple: the Oracle Optimizer’s job is to find good SQL execution plans without manual intervention. It is not the application developer’s or DBA’s job, so hints should be avoided as much as possible. Over time, they can prevent applications from taking advantage of new optimization techniques, so try and leave the heavy-lifting to the database. Think autonomous.

If you looked at part 1 of this series, then you will know that we are careful with this restriction and allow INSERT /*+ APPEND */ by default.

To try this example for yourself, it’s uploaded to GitHub.

Comments and suggestions welcome!

The Oracle Optimizer and ADWC – Statistics-Based Query Transformation

This is Part 2 of a series on the Oracle Optimizer in the Oracle Autonomous Data Warehouse Cloud. You can find Part 1 here. In this post, I’m going to cover a clever optimization included in ADWC: Statistics-Based Query Transformation. I’m sure that you can guess what it is already, but let’s take it step-by-step.

Consider the following query:

SELECT max(sale_value)
FROM    huge_fact_table;

To find MAX, the Oracle Database would normally need to scan all of the rows in the huge fact table to find the maximum value (I’m assuming that there’s no index on SALE_VALUE). This scan can be very time-consuming, but there is a potential short-cut. When statistics are gathered on a table the database retains information on how many rows there are as well as details of column data such as minimum and maximum values and the number of distinct values (NDV). If the database is sure that statistics are up-to-date, then it is clearly in a position to answer some queries using the information it has already gathered.

Here is an example I executed on ADWC using the HIGH consumer group:

select max(num0),min(num1) from fact1;

----------------------------------------------------------------
| Id  | Operation                 | Name                       |
----------------------------------------------------------------
|   0 | SELECT STATEMENT          |                            |
|   1 |  RESULT CACHE             | cpuu29wy174jfbu3a7zcxzkw12 |
|   2 |   VIEW                    | VW_SQT_65BBF4BE            |
|   3 |    SORT AGGREGATE         |                            |
|   4 |     PX COORDINATOR        |                            |
|   5 |      PX SEND QC (RANDOM)  | :TQ10000                   |
|   6 |       SORT AGGREGATE      |                            |
|   7 |        PX BLOCK ITERATOR  |                            |
|*  8 |         TABLE ACCESS FULL | FACT1                      |
----------------------------------------------------------------

The first thing of note is that we use the server result cache, but I’m going to stick to the topic at hand and concentrate on operation ID 2: VW_SQT_65BBF4BE. This tells us that the statistics-based query transformation has kicked in and, potentially, the values of max(num0) and min(num1) can be retrieved from the statistics stored in the data dictionary. We are in a position to avoid the need to scan FACT1.

The word ‘potentially’ might sound rather vague, but bear in mind that the query transformation is established when the query is parsed, so when it is executed it is possible that the data in FACT1 has been updated in some way. This will prevent us from being able to use the dictionary statistics because the data might no longer match the statistics. In ADWC, the easiest way to see that statistics have actually been used is to observe the much-reduced query response time.

As you might expect, there are restrictions regarding when this transformation can and cannot be used. I’ve included some queries in the GitHub repository that use the transformation and some that don’t. Hopefully, this will be enough to get you started if you want to explore the boundaries. There’s also a spool file if you can’t run the example yourself.

Part 3 of this series is here.

As ever, comments welcome!

 

 

 
