I want to write TD-12 certification Exams.
Could you please share with me the details about Exam, preparation Material, Question Paper pattern, time, etc....
please share the certification material if you have any, my e-mail: email@example.com
Thanks & Regards
collect stats is very very import concept in teradata. with collect stats the optimizer desides the path for the data retrival(i.e low cost plan) and sends it to the amps.
not only collecting stats once . it should be refresh frequently when ever the data is insert into particular table. it is better to collect stats on columns.
if you collect stats on index level if once the indexs are deleted the stats also will be dropped. then the reterival of the data is slow.
syntax for collect stats is
collect stats on emp index(empid);
There is a mention of "least cost" in this discussion.
How do we calculate the cost of a Query? could it be calculated in numbers? then what is the unit of measurement?
Some doubts in statistics collection using Teradata Statistics Wizard:-
1. Stale stats are out-dated stats .... e.g stats are not collected for a long period of time while data has been changing in the table .... so we have stale stats that needs to be refreshed or recollected.
2. Ideally, for all tables which are part of workload.
3. No, stats collected for a particular query can't be stale for some queries as there 'stale-ness' depends on the data in the table, not the query.
4. Not exactly .... it is advisable to collect stats for all tables that are loaded frequently .... i.e. the tables which have changes in data frequently.
Thank you Adeel for your response...
Few more doubts:-
For example, I have two queries involving complex joins. Query 1 is dependent on tables A, B, C and D whereas query 2 is dependent on tables B, C and E.
Tables B and C both are common. But the columns of joins being used (of tables B and C) in query 1 are different from columns of joins being used (of tables B and C) in query 2.
If we get recommendations using Teradata Statistics Wizard for Query 1, it may be different for query 2. So on executing the recommendations for each query, we end up with statistics being collected on tables B and C on columns which are present in join conditions of Query 1 but not for query 2. In other words staistics collected for query 1 won't be useful for query 2 or vice-versa. Going forward we may have statistics collected on almost every column in the table.
Should we collect statistics for column involved in joins?
Would same affect the performance of the query?
Or we can just collect statistics on all tables in producton that are being loaded frequently, irresepective of the queries being fired in database. Because in one or other way the queries would fetch data from the tables only which would statistics collected on them?
Collection of unnessasary stats does have a downside, but considering your scenario, it shouldnt be an issue.
Moverover, it is always an evolving thing to come up with the best possible stats to collect. So, its better to analyze for a time-being and then tweak, add or remove stats.