Month: December 2010

What does ISDEFAULT mean in v$parameter?

There is a common misunderstanding among DBAs about the column ISDEFAULT in the view v$parameter. Some think that when this column is TRUE it means the current value of the parameter is the default value. This leads to wrong conclusions and sometimes w…

What does ISDEFAULT mean in v$parameter?

There is a common misunderstanding among DBAs about the column ISDEFAULT in the view v$parameter. Some think that when this column is TRUE it means the current value of the parameter is the default value. This leads to wrong conclusions and sometimes w…

Exp. with Oracle VM

  It’s more than a year I have been using Oracle VM 2.2.0 for my personal use on intel x86. It’s a perfect solution for testing lab as well. Before using Oracle VM I had to setup two Boxes with Lacie Storage connected with 2 Firewire cards just for testing the 2 node RAC Setup. […]

Exp. with Oracle VM

  It’s more than a year I have been using Oracle VM 2.2.0 for my personal use on intel x86. It’s a perfect solution for testing lab as well. Before using Oracle VM I had to setup two Boxes with Lacie Storage connected with 2 Firewire cards just for testing the 2 node RAC Setup. […]

Wget option to download patches from Metalink

Recently I pathced Grid control to 10.2.0.5, while downloading patch from metalink I noticed WGET option to download patch directly on server. It gives option to download a script or copy to clipboard. After minor modifications related to password (location/proxy if required ) script is ready to execute. Started in nohup Not a bad option […]

Wget option to download patches from Metalink

Recently I pathced Grid control to 10.2.0.5, while downloading patch from metalink I noticed WGET option to download patch directly on server. It gives option to download a script or copy to clipboard. After minor modifications related to password (location/proxy if required ) script is ready to execute. Started in nohup Not a bad option […]

Oracle GoldenGate Trail File Size

When it comes to estimating how big your trail files will be, Oracle documentation suggests to use the following formula:

[log volume in one hour] x [number of hours downtime] x .4 = trail disk space

It is also described as being a conservative estimate so you are likely to archive a better mileage. What I found is that there are some corner cases which can produce some “anomaly” results.

Test Case

I’ll start with a specially crafted schema and table names as well as data:

create user uuuuuuuuuuuuuuuuuuuuuuuuuuuuuu identified by "u";

User created.

SQL> grant resource to uuuuuuuuuuuuuuuuuuuuuuuuuuuuuu;

Grant succeeded.

SQL> create table uuuuuuuuuuuuuuuuuuuuuuuuuuuuuu.tttttttttttttttttttttttttttttt
2 (
3 n number primary key
4 );

Table created.

SQL> alter table uuuuuuuuuuuuuuuuuuuuuuuuuuuuuu.tttttttttttttttttttttttttttttt
2 add supplemental log data (primary key) columns;

Table altered.

I’ve just created a new trail file which is currently 974 bytes in size:

[oracle@gg1 dirdat]$ ls -l aa000000
-rw-rw-rw- 1 oracle oinstall 974 Dec 19 11:50 aa000000

What I’m going to do is insert some data into a table and then measure how much redo as well as trail data were generated:

SQL> set autot traceonly stat
SQL> insert into uuuuuuuuuuuuuuuuuuuuuuuuuuuuuu.tttttttttttttttttttttttttttttt
select power(10,10)+(level-1)*power(10,10)
from dual
connect by level
10000 rows created.


Statistics
----------------------------------------------------------
0 recursive calls
329 db block gets
31 consistent gets
0 physical reads
364612 redo size
821 bytes sent via SQL*Net to client
917 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
10000 rows processed

SQL> commit;

Commit complete.

Let’s take a look at the trail file size:

[oracle@gg1 dirdat]$ ls -l aa000000
-rw-rw-rw- 1 oracle oinstall 1619890 Dec 19 13:40 aa000000

That is roughly 4.4 times bigger than our redo size and 11 times bigger than Oracle’s “conservative” estimate. Your storage provisioning might be in for a surprise.

Trail File

To understand the key factors which resulted in our trail file to be of such outrageous size let’s use logdump and take a look inside:

Logdump 11 >open ./dirdat/aa000000
Current LogTrail is /u01/app/oracle/ggs/dirdat/aa000000
Logdump 12 >ghdr on
Logdump 13 >next 2

2010/12/19 13:39:16.632.818 FileHeader Len 966 RBA 0
Name: *FileHeader*
3000 01b6 3000 0008 4747 0d0a 544c 0a0d 3100 0002 | 0...0...GG..TL..1...
0002 3200 0004 2000 0000 3300 0008 02f1 bdfb 2d3e | ..2... ...3.......->
74f2 3400 0028 0026 7572 693a 6767 313a 7175 6164 | t.4..(.&uri:gg1:quad
726f 3a63 6f6d 3a3a 7530 313a 6170 703a 6f72 6163 | ro:com::u01:app:orac
6c65 3a67 6773 3600 0025 0023 2f75 3031 2f61 7070 | le:ggs6..%.#/u01/app
2f6f 7261 636c 652f 6767 732f 6469 7264 6174 2f61 | /oracle/ggs/dirdat/a
6130 3030 3030 3037 0000 0101 3800 0004 0000 0000 | a0000007....8.......

___________________________________________________________________
Hdr-Ind : E (x45) Partition : . (x04)
UndoFlag : . (x00) BeforeAfter: A (x41)
RecLength : 19 (x0013) IO Time : 2010/12/19 13:40:32.000.000
IOType : 5 (x05) OrigNode : 255 (xff)
TransInd : . (x00) FormatType : R (x52)
SyskeyLen : 0 (x00) Incomplete : . (x00)
AuditRBA : 45 AuditPos : 37458516
Continued : N (x00) RecCount : 1 (x01)

2010/12/19 13:40:32.000.000 Insert Len 19 RBA 974
Name: UUUUUUUUUUUUUUUUUUUUUUUUUUUUUU.TTTTTTTTTTTTTTTTTTTTTTTTTTTTTT
After Image: Partition 4 G b
0000 000f 0000 000b 3130 3030 3030 3030 3030 30 | ........10000000000

The first record is a standard trail file header and is of little interest to us. We’re going to take look at the second record. First of all, as you can see, we’ve got a fully qualified table name stored in there. We can see the inserted value as well. Now if all that stuff got stored in plain text that could explain it… let’s check it out:

[oracle@gg1 dirdat]$ grep --binary-files=text -o -i \
> uuuuuuuuuuuuuuuuuuuuuuuuuuuuuu.tttttttttttttttttttttttttttttt \
> aa000000 \
> | wc -l
10000

Indeed, we’ve got a fully qualified table name appearing 10 thousand times on our trail file! The longer your schema and/or your table names are the bigger your trail file will be. The main reason Oracle Streams are using object identifiers (as well as Oracle’s redo) is to avoid getting into exactly this kind of trouble. This makes GoldenGate configuration easier as it doesn’t have to store the mapping information on the target database but you also have to pay the price with every row modification which gets captured.

The story doesn’t end there, however. Let’s take a look at the inserted data. How do you think that got stored?

[oracle@gg1 dirdat]$ grep --binary-files=text -o -i \
> 0000000000 \
> aa000000 \
> | wc -l
10000

That’s right – all numbers got stored in plain text. The fundamental problem here is that trail file does not store data types. Everything just gets converted to strings. This helps dealing with heterogeneous data sources but also makes the storage for certain data types to be very inefficient. Again, you have to pay the price somewhere. If you gzip the above trail file you’ll get a whopping 25x compression ratio.

You may ask yourself how column names are being stored? The answer is there are no column names in the trail file, just column positions. That’s why GoldenGate requires column order to match between source and target databases, otherwise you’ll get your data mapped to a wrong column. Add a lack of data types and you might be in for a very interesting results. If column order doesn’t match then you’ll have to use defgen to get your data across properly.

In a nutshell, watch out for tables/schemas with long names and lots of numeric data.

Oracle GoldenGate Trail File Size

When it comes to estimating how big your trail files will be, Oracle documentation suggests to use the following formula:
[log volume in one hour] x [number of hours downtime] x .4 = trail disk spaceIt is also described as being a conservative estimate…

TEL/電話+86 13764045638
Email service@parnassusdata.com
QQ 47079569