Physical Implemetation of the Tables - PPI

Database
Enthusiast

Physical Implemetation of the Tables - PPI

Earlier our database were having the follwing table with the partition expression

PARTITION BY RANGE_N(PI_COLUMN between DATE '2000-01-01' AND '2100-01-01' EACH INTERVAL '1' MONTH )

The queries running on this were taking lot of time

But i have done a testing on my Test DB i copied the structure and modified the partitioning expression
and i got tremendous improvement .
PARTITION BY RANGE_N(PI_COLUMN between DATE '2000-01-01' AND '2001-01-01' EACH INTERVAL '1' MONTH )
DATE '2001-01-02' AND '2002-01-01' EACH INTERVAL '1' MONTH ,
DATE '2002-01-02' AND '2003-01-01' EACH INTERVAL '1' MONTH ,
DATE '2003-01-02' AND '2004-01-01' EACH INTERVAL '1' MONTH ,
DATE '2004-01-02' AND '2005-01-01' EACH INTERVAL '1' MONTH ,
DATE '2005-01-02' AND '2006-01-01' EACH INTERVAL '1' MONTH ,
DATE '2006-01-02' AND '2007-01-01' EACH INTERVAL '1' MONTH ,
DATE '2007-01-02' AND '2008-01-01' EACH INTERVAL '1' MONTH ,
DATE '2008-01-02' AND '2009-01-01' EACH INTERVAL '1' MONTH ,
DATE '2009-01-02' AND '2010-01-01' EACH INTERVAL '1' MONTH , NO RANGE OR UNKNOWN

I am planning to apply the same to my other tables

But i have certain questions

Suppose i have kep future partitions will affect my database performance?
And suppose PPI table is undergoing monthly load whether will affect the same?
what are the maintenance cost ?

19 REPLIES
Enthusiast

Re: Physical Implemetation of the Tables - PPI

A few things to consider would be.

Are there statistics collected on PARTITION (not the paritioning column, but the PARTITION itself) ?

Does your query plan talk about DPE ? (in prod / test ).

Empty partitions are not "maintained" any where, so I would presume it shouldn't add up to maintenance cost.

You always have the option of adding ranges to the end of the partition as part of the load job. which is what I've seen in common practice.

Too many (how many is debatable) empty partitions tend to sometimes confuse the optimizer. (There's no way optimizer can make sure the partition is empty. A row could have gone in between the collection of stats and generation of query plan, so it kind of tends to make a "broader" query plan).

it would worth while to compare the query plan in both environments. (with and without the statistics collected on partition). that might shed some light on what's happening different.
Enthusiast

Re: Physical Implemetation of the Tables - PPI

Thanks

for reply ,

I have check even though statistics collect on this column which has single partition it was taking tremendous time.

But in the multiple partitioning i havenot yet collected any stats and the results are very much faster.

During the load operation supoose my dt column having multiple partitions can accept the null values.?

Sushant
Enthusiast

Re: Physical Implemetation of the Tables - PPI

In your original DDL, you have about 1200 partitions (12 months * 100 yrs)
where as in your modified DDL you have only about 121 partitions (12 months * 10 yrs + 1)

that in itself is a big difference. TD tend to not to opt for DPE if the number of partitions are huge.

The right thing to do is look into your explains to see if DPE is happening.

Also you need to collect stats on the partition itself, not on the partitioning column.

I didn't quite get your last Q, if you meant whether you can load data into multiple partitions, yes that shouldn't be an issue.
Enthusiast

Re: Physical Implemetation of the Tables - PPI

In te old scenario i was having only 1 partition having the all the rows

and in the new ddl i am having different partitions holding date

so in the earlier DDl where i was having the single partition it was scanning only partition 1

so i have used

The below select executed on OLD DDL
SELECT Partition,Col_Name,Count(*) from PARTITION_TABLE_NAME group by 1,2

Partition Col_Name Count(*)
1 Col 100000
1 Col 100000
1 col 100000
...
...
So On

The below select executed on NEW DDL
SELECT Partition,Col_Name,Count(*) from PARTITION_TABLE_NAME group by 1,2

Partition Col_Name Count(*)
2 col 100000
3 col 50000
4 col 20000
...
..
SO ON

So as partition boundaries startare well defined the optimizer is performing the search very quick
and as the number of rows are less within the partition is searches less blocks and gives the result faster .

Well u are asking me to check the explain in both OLD and NEW DDL

Both explains were showing me the Single partition phrase.

Enthusiast

Re: Physical Implemetation of the Tables - PPI

Sorry Joe , I forgot to explan about the last question along with the earlier thread

Does the PPI column can be null?

Also can you please put more insight on collection of statistics at partition level
Enthusiast

Re: Physical Implemetation of the Tables - PPI

I believe your original DDL was (as you first posted)

PARTITION BY RANGE_N(PI_COLUMN between DATE '2000-01-01' AND '2100-01-01' EACH INTERVAL '1' MONTH )

That's 1201 partitions ... I am not sure why you call it a single partition :o. Basically you are stating that PI_Column is a date between 2000-01-01 and 2100-01-01 and each month is a partition ( so it's like 1201 partitions because there are that many months between those two dates.)

Your query to compute records per partition is not the correct one. it should be

SELECT PARTITION, COUNT(*)
FROM tablename
GROUP BY 1
ORDER BY1;

And yes you can have nulls on PPI column as long as you have included the UNKOWN partition in your partitioning expression.

to collect stats on your partition this is the syntax.

COLLECT STATISTICS ON mytable COLUMN(PARTITION);

just replace mytable with the actual table name ... do not change the PARTITION word to PPI column, it's a keyword which states that you want the statistics to be collected on the partition itself.
Enthusiast

Re: Physical Implemetation of the Tables - PPI

But then why there is tremendous increase in the performance of the query in the NEW PPI structure and why the queries are taking time in the earlier structure?

Junior Contributor

Re: Physical Implemetation of the Tables - PPI

Hi Sushant,
there's no reason to rewrite that to a single range to multiple ranges, because the internal partitioning doesn't change.
If there's really only a single partition for the first example, then you made a mistake.
Could you post the real DDL (PI plus Partitioning) for both tables.

Dieter

Enthusiast

Re: Physical Implemetation of the Tables - PPI

Hi Dieter,

I am attaching the new and old DDL . Please help me in finding out the best possible solution. I tried certain test for accessing the data . I am getting the good performance in the Newly created table. If from your point of view it is the mistake then the TD docs are mentioning about the multiple raneges partitions which give better performance. And my queries on the particular table also getting executed faster.

CREATE MULTISET TABLE Dummy_NEW ,NO FALLBACK
(
COLUMN_OCD_J INTEGER COMPRESS (43 ,44 ),
COLUMN_ORGTG_OCN_J INTEGER COMPRESS (9257 ,2202 ),
COLUMN_DSTNN_OCN_J INTEGER COMPRESS (9257 ,355 ),
COLUMN_DOMCL_OCN_J INTEGER COMPRESS (8709 ,9257 ,815 ,9521 ,12602 ,355 ,8700 ),
COLUMN_OTT_J INTEGER COMPRESS (3370 ,3131 ,3148 ,3175 ,1898 ,1912 ,2965 ,2975 ,2998 ,3018 ,3025 ,3287 ,1261 ,1271 ,1275 ),
COLUMN_ORGTG_OFI_J INTEGER COMPRESS (17 ,18 ,30 ,31 ,176 ,200 ,205 ),
COLUMN_DSTNN_OFI_J INTEGER COMPRESS (17 ,18 ,30 ,31 ,200 ,203 ,205 ),
COLUMN_OPD_J INTEGER COMPRESS (587 ,132 ,1969 ),
COLUMN_OMS_J INTEGER
COLUMN_OCU_J INTEGER
COLUMN_OTH_J INTEGER
COLUMN_OPT_J INTEGER
COLUMN_OEW_J INTEGER
COLUMN_ODV_J INTEGER
COLUMN_ODT_J INTEGER,
COLUMN_OEB_J INTEGER COMPRESS (84 ,85 ,86 ,87 ,88 ,93 ,149 ),
COLUMN_A DECIMAL(15,2)
COLUMN_ACTL_ESCST_A DECIMAL(15,2) COMPRESS (0.00 ,5.15 ,0.04 ,2.88 ,0.34 ,3.45 ,1.11 ,11.35 ,4.01 ,9.37 ,6.92 ,1.99 ,12.54 ,2.41 ),
COLUMN_BUDGT_ESCST_A DECIMAL(15,2) COMPRESS 0.00 ,
COLUMN_ACTL_PTREV_A DECIMAL(15,2) COMPRESS 0.00 ,
COLUMN_BUDGT_PTREV_A DECIMAL(15,2) COMPRESS 0.00 ,
COLUMN_ORGTG_ACNT_N CHAR(16) CHARACTER SET LATIN NOT CASESPECIFIC COMPRESS '0000000000000000',
COLUMN_DSTNN_ACNT_N CHAR(16) CHARACTER SET LATIN NOT CASESPECIFIC COMPRESS '0000000000000000',
COLUMN_ORGTG_CARD_N CHAR(19) CHARACTER SET LATIN NOT CASESPECIFIC COMPRESS '0000000000000000000',
COLUMN_OSC_J INTEGER COMPRESS 539 ,
COLUMN_OCV_J INTEGER COMPRESS (0 ,1 ,2 ,3 ,4 ,5 ,6 ),

COLUMN_OAA_J INTEGER)
PRIMARY INDEX ( COLUMN_CRETN_ODT_J ,COLUMN_SEQ_N ,COLUMN_PARTN_KEY_1_X )
PARTITION BY RANGE_N(COLUMN_PARTN_KEY_1_X BETWEEN DATE '2000-01-01' AND DATE '2002-01-01' EACH INTERVAL '1' MONTH ,
DATE '2002-01-02' AND DATE '2004-01-01' EACH INTERVAL '1' MONTH ,
DATE '2004-01-02' AND DATE '2006-01-01' EACH INTERVAL '1' MONTH ,
DATE '2006-01-02' AND DATE '2008-01-01' EACH INTERVAL '1' MONTH ,
DATE '2008-01-02' AND DATE '2010-01-01' EACH INTERVAL '1' MONTH ,
DATE '2010-01-02' AND DATE '2012-01-01' EACH INTERVAL '1' MONTH ,
DATE '2014-01-02' AND DATE '2016-01-01' EACH INTERVAL '1' MONTH ,
DATE '2016-01-02' AND DATE '2018-01-01' EACH INTERVAL '1' MONTH ,
DATE '2018-01-02' AND DATE '2020-01-01' EACH INTERVAL '1' MONTH ,
NO RANGE OR UNKNOWN);

CREATE MULTISET TABLE prododss.O_DUMMY_ORGINAL ,NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
FREESPACE = 7 PERCENT,
CHECKSUM = DEFAULT
(
COLUMN_O_J INTEGER COMPRESS (43 ,44 ),
COLUMN_O_OCN_J INTEGER COMPRESS (9257 ,2202 ),
COLUMN_D_OCN_J INTEGER COMPRESS (9257 ,355 ),
COLUMN_D_OCN_J INTEGER COMPRESS (8709 ,9257 ,815 ,9521 ,12602 ,355 ,8700 ),
COLUMN_OJ INTEGER COMPRESS (3370 ,3131 ,3148 ,3175 ,1898 ,1912 ,2965 ,2975 ,2998 ,3018 ,3025 ,3287 ,1261 ,1271 ,1275 ),
COLUMN_O_OFI_J INTEGER COMPRESS (17 ,18 ,30 ,31 ,176 ,200 ,205 ),
COLUMN_D_OFI_J INTEGER COMPRESS (17 ,18 ,30 ,31 ,200 ,203 ,205 ),
COLUMN_O_J INTEGER COMPRESS (587 ,132 ,1969 ),
COLUMN_O_J INTEGER COMPRESS (1 ,3 ,5 ,8 ,14 ,37 ,45 ),
COLUMN_O_J INTEGER COMPRESS 12545332 ,
COLUMN_O_J INTEGER COMPRESS 1 ,
COLUMN_O_J INTEGER COMPRESS (0 ,1 ,2 ,3 ,4 ,6 ,7 ),
COLUMN_O_J INTEGER COMPRESS (0 ,3083 ,3877 ,3883 ,3278 ,3280 ,3063 ),
COLUMN_O_J INTEGER COMPRESS 3585 ,
COLUMN_O_J INTEGER,
COLUMN_O_J INTEGER COMPRESS (84 ,85 ,86 ,87 ,88 ,93 ,149 ),
COLUMN_A DECIMAL(15,2) COMPRESS (0.00 ,100.00 ,200.00 ,300.00 ,500.00 ,50.00 ,1000.00 ),
COLUMN_A DECIMAL(15,2) COMPRESS (0.00 ,5.15 ,0.04 ,2.88 ,0.34 ,3.45 ,1.11 ,11.35 ,4.01 ,9.37 ,6.92 ,1.99 ,12.54 ,2.41 ),
COLUMN_B_A DECIMAL(15,2) COMPRESS 0.00 ,
COLUMN_A_A DECIMAL(15,2) COMPRESS 0.00 ,
COLUMN_B_A DECIMAL(15,2) COMPRESS 0.00 ,
COLUMN_O_ACNT_N CHAR(16) CHARACTER SET LATIN NOT CASESPECIFIC COMPRESS '0000000000000000',
COLUMN_D_N CHAR(16) CHARACTER SET LATIN NOT CASESPECIFIC COMPRESS '0000000000000000',

COLUMN_OAA_J INTEGER)
PRIMARY INDEX ( COLUMN_o_J ,COLUMN_SE_N ,COLUMN_PARTN_KEY_1_X )
PARTITION BY RANGE_N(COLUMN_PARTN_KEY_1_X BETWEEN DATE '2000-01-01' AND DATE '2100-01-01' EACH INTERVAL '1' MONTH ,
NO RANGE OR UNKNOWN);