I am pretty new to teradata. I am creating a simple data cube based on an existing transac table in TD12 and i have been told to collect the stats on a few tables i have created.
So i am using the below query to look for the stats
HELP STATISTICS D_TEST ;, but it is coming up as NO STATS DEFINED FOR THIS TABLE.
Not sure if have to define the statics once i create any table,if so how do i go about doing it, please shed some light.
You need to select which columns would be used frequently in JOINs and then you need to collect stats using below command:
COLLECT STATISTICS ON <Database Name> COLUMN (<Column Name>);
Also .... in case you have some queries handy .... which will be frequently executed .... you can execute following command in a session:
DIAGNOSTIC HELPSTATS ON FOR SESSION;
And then get the EXPLAIN (F6 key) for getting the explain of the query and it will suggest you some stats to collect. You do need to execute those statements to be able to be in affect.
Also note, this command only gives you a hint of stats to be collected. Best case may not be as same as the suggestions.
Thanks for clarification in the previous posts.
I have used the belwo SQL to get the stats
COLLECT STATISTICS ON TEST COLUMN (TEST1);
and 2 rows have been processed, how do i go about looking at the stats data, where will they be stored?
You can use HELP STATISTICS <TableName> or from Teradata Administrator you can right-click on a table and click on 'Statistics' to get the stats that are collected.
It will list down columns on which stats have been collected and the unique values that are present in that column.
Same information is stored in DBC.StatsTbl, but you should prefer using the above 2 methods.
It is working, but only one column at a time. Is there a way we can set stats on a whole table at a single shot. I have tried something like
COLLECT STATISTICS ON <DB.Table Name>, but it is not working, i had to set up one column at a time and it is really time consuming.Please suggest!
You don't want/need to collect stats on each and every column because they must be re-collected from time to time (which is resource-intensive).
That's why you need to know/decide which columns need stats.
Collecting stats is a step-by-step process, and there is no 100% correct way of doing it. There can be a scenario which works best for you today .... but may not work the same way after some time [due to changed data dynamics].
Hence its a continous process and that is why it is on DBAs [usually] to track and add/remove COLLECT STATISTIC commands.
Collecting stats on all columns will not ensure your query will now run in a most optimized manner .... rather only collecting correct stats can ensure that.