I observed '*' as partition end point in PPI definition of few tables (please refer below in red colour). Can you please explain me what does it mean. Also, Is it a good practice to define '*' as end of partition. What are the implications of defining this. A quick response will be helpful. Many Thanks.
PARTITION BY RANGE_N(Trx_dt BETWEEN DATE '2010-01-01' AND * EACH INTERVAL '1' DAY )
In terms of what it means, this is directly from the manual: Use an asterisk ( * ) for the ending boundary of the last range in the list to indicate the highest possible value
(all values and NULL are less than an ending boundary specified as an asterisk).
So that definition allows for all valid dates recognised by Teradata including and after 1st January 2010 (i.e. everything up to and including 31 December 9999).
Personally I've never used that notation (or seen it used by anyone else) but it is hard to find a problem with it. I think it really comes down to how good your data is and what 'control' you want over it.
- There are no performance issues with empty partitions, so having partitions that allow for dates after most of us have retired is not an issue.
- If your data is of good quality (and assuming that you're loading 'current' date values) then you won't be getting spurious data values (e.g. a date in 100 years time) suddenly populating partitions.
- If your data quality is 'not so good' then having a coded end date and including a NO RANGE partition may allow you to identify such spurious date values easier.
Note that with the definition shown below explain plans may show access to 'a lot' of partitions (selecting "where Trx_Dt >= today", shows a retrieve "from 2915015 partitions") but remember that the dbms code knows if a partition is empty and will not attempt to to physically access any data in it.
So, is this 'good practise'? Hard to say. IMHO, it is certainly unusual.
Thanks for the quick response Dave.
I think in that case (partition end point 31st Dec 9999) there will be a small overhead of such a big partition range from the space point of view. 8 bytes will be used in ROW header instead of 2 bytes. Typically if I know my data I tried to keep the partition below 65635 (2 bytes overhead) whereever possible. For big table it matters (6 billion plus record).
Many Thanks again.
I've never seen it with INTERVAL, just "DATE '.....' AND *" or "* AND DATE '....'" to split NO RANGE in two partitions.
And ASingh is right, it wastes space.
Btw Dave, I like this euphemistic "allow for dates after most of us have retired", don't forget to turn off the light then :-)