The Oracle Optimizer and ADWC – Statistics-Based Query Transformation

This is Part 2 of a series on the Oracle Optimizer in the Oracle Autonomous Data Warehouse Cloud. You can find Part 1 here. In this post, I'm going to cover a clever optimization included in ADWC: Statistics-Based Query Transformation. I'm sure that you can guess what it is already, but let's take it step-by-step.

Consider the following query:

SELECT max(sale_value) FROM    huge_fact_table;

To find MAX, the Oracle Database would normally need to scan all of the rows in the huge fact table to find the maximum value (I'm assuming that there's no index on SALE_VALUE). This scan can be very time-consuming, but there is a potential short-cut. When statistics are gathered on a table the database retains information on how may rows there are as well as details of column data such minimum and maximum values and the number of distinct values (NDV). If the database is sure that statistics are up-to-date, then it is clearly in a position to answer some queries using the information it has already gathered.

Here is an example I executed on ADWC using the HIGH consumer group:

select max(num0),min(num1) from fact1; --------------------------------------------------------------- | Id | Operation | Name | --------------------------------------------------------------- | 0 | SELECT STATEMENT | | | 1 | RESULT CACHE | cpuu29wy174jfbu3a7zcxzkw12 | | 2 | VIEW | VW_SQT_65BBF4BE | | 3 | SORT AGGREGATE | | | 4 | PX COORDINATOR | | | 5 | PX SEND QC (RANDOM) | :TQ10000 | | 6 | SORT AGGREGATE | | | 7 | PX BLOCK ITERATOR | | |* 8 | TABLE ACCESS FULL| FACT1 | ---------------------------------------------------------------

The first thing of note is that we use the server result cache, but I'm going to stick to the topic at hand an concentrate on operation ID 2: VW_SQT_65BBF4BE. This tells us that the statistics-based query transformation has kicked in and, potentially, the values of max(num0) and max(num1) can be retrieved from the statistics stored in the data dictionary. We are in a position to avoid the need to scan FACT1.

The word 'potentially' might sound rather vague, but bear in mind that the query transformation is established when the query is parsed, so when it is executed it is possible that the data in FACT1 has been updated in some way. This will prevent us from being able to use the dictionary statistics because the data might no longer match the statistics. In ADWC, the easiest way to see that statistics have actually been used is to observe the much-reduced query response time.

As you might expect, there are restrictions regarding when this transformation can and cannot be used. I've included some queries in the GitHub repository that use the transformation and some that don't. Hopefully, this will be enough to get you started if you want to explore the boundaries. There's also a spool file if you can't run the example yourself.

Part 3 of this series is here.

As ever, comments welcome!






TEL/電話+86 13764045638
QQ 47079569