The following DDL fails with: Failed [5714 : HY000] Invalid partitioning expression for PARTITION BY.
This table will only hold data <=180 days old.
CREATE SET TABLE test.test ( created_at TIMESTAMP(6) NOT NULL, cid VARCHAR(64)
) PRIMARY INDEX ( cid ) PARTITION BY RANGE_N(((cast(created_at as date) - DATE '1970-01-01') mod 180) between 0 and 179 each 1, unknown);
-- also tried: PARTITION BY (cast(created_at as date) - DATE '1970-01-01') mod 180;
This seems to meet all of the criteria here: http://www.info.teradata.com/HTMLPubs/DB_TTU_13_10/index.html#page/General_Reference/B035_1096_109A/...
except possibly the "Certain uses of EXTRACT, CAST..." part.
Is there a better way to partition for queries on days compared with created_at expressions?
Solved! Go to Solution.
Try the following:
PARTITION BY RANGE_N ( created_at BETWEEN TIMESTAMP '2001-01-01 00:00:00.000000' AND TIMESTAMP '2020-12-31 23:59:59.000000' EACH INTERVAL '1' DAY );
Remember that there is no performance penalty for having empty partitions, so defining a large range of dates is not an issue IF the partitions for anything other than your 180 days of data are really empty. the optimiser will know they are empty and will not search those partitions.
Your existing ETL code will need to purge old data to keep the 180 day rule - but it sounds like you're already doing that.
With the above in place queries such as the following will use partitioning.
SELECT * FROM t1 WHERE created_at BETWEEN DATE '2006-04-03' AND DATE '2007-01-01';
SELECT * FROM t1 WHERE created_at BETWEEN TIMESTAMP '2006-04-03 00:00:00' AND TIMESTAMP '2007-01-01 12:13:14';
Does that work for you?
Given what you said about no performance penalities for empty partitions, this solution will work. Thanks!
I'd still like to understand what the problem is with my PARTITION BY statements for my own TD education. My understanding that they need only deterministically return INTEGER.