Please need your help in understanding when to collect stats.
I know if the no. of count in a table is increased by 10%, we need to collect stats.
How to validate the count has been increased or there's no change in the table.
The best practice is to collect the stats once your data load is done.You do not have to calculate the % of data changed.
in fact YEs.. collect stats shd be taken when a certain % of data changes.
if you know its very few you donot have to take it. But make it a best practice to take stats in a specific interval for effictive plan generation by the parser.
I would say, better have a practice of collecting stats after the table load has done.
Count shd be validated by b4 and after.But if the change is by updating the key columns this wont be suitable.So keep taking stats on a regular interval(weekly or monthly) depending on the table load.
It is a complicated problem, and there are different strategies based on data type, frequency and usage.
Some of the stats are need to be collected right after load, some of them is enough daily (in offpeak hours), and some of them can be recollected even less frequently.
You can waste a lot of resource on collecting unnecessary stats, so be careful!
I would recommend PRISE StatMan to automate statistics collection, you can get a full functional free trial here: https://www.prisetools.com/productrequest
I have a table of around 5M rows and in that one column is used frequently in where clause.
But that one column has only 2 possible values. It can either be null or 'Alpha'. Is it recommended to collect stats on this column or not?
I have a few recomnedations:
Statistics is highly recommended if frequently used in where clauses.
The neccessity of re-collecting the statistics is basically determined by 2 factors:
* Does the table size change significantly? (Teradata suggests +-10% record count change as a trigger)
* Does the Null / 'Alpha' ratio change significantly?
As this statistics will lay on low cardinality column its cost is not too high, you can collect regularly (eg. weekly). Concerning Full / Sample question I would use sample statistics, however in earlier versions (V12) I experienced strange results in case of low cardinality cases.
If you use management tool hopefull it will contain those heuristics that determines if statistics actually has to be refreshed or not.