Collecting Stats

General
Enthusiast

Collecting Stats

Hi Guys,

Please need your help in understanding when to collect stats.

I know if the no. of count in a table is increased by 10%, we need to collect stats.

How to validate the count has been increased or there's no change in the table.

Thank you.

10 REPLIES
Teradata Employee

Re: Collecting Stats

The best practice is to collect the stats once your data load is done.You do not have to calculate the % of data changed.

_aravind

Teradata Employee

Re: Collecting Stats

in fact YEs.. collect stats shd be taken when a certain % of data changes.

if you know its very few you donot have to take it. But make it a best practice to take stats in a specific interval for effictive plan generation by the parser.

Enthusiast

Re: Collecting Stats

Thanks for the reply aravind.

How to validate the count has been increased or there's no change in the table.

Thanks,

Teradata Employee

Re: Collecting Stats

I would say, better have a practice of collecting stats after the table load has done.

Count shd be validated by b4 and after.But if the change is by updating the key columns this wont be suitable.So keep taking stats on a regular interval(weekly or monthly) depending on the table load.

-Aravind

Enthusiast

Re: Collecting Stats

It is a complicated problem, and there are different strategies based on data type, frequency and usage.

Some of the stats are need to be collected right after load, some of them is enough daily (in offpeak hours), and some of them can be recollected even less frequently.

You can waste a lot of resource on collecting unnecessary stats, so be careful!

I would recommend PRISE StatMan to automate statistics collection, you can get a full functional free trial here: https://www.prisetools.com/productrequest

Enthusiast

Re: Collecting Stats

I have a table of around 5M rows and in that one column is used frequently in where clause.

But that one column has only 2 possible values. It can either be null or 'Alpha'. Is it recommended to collect stats on this column or not?

Enthusiast

Re: Collecting Stats

Generally speaking; Yes you should collect stats on the column, so that optimiser is aware of the low cardinality of the column

You can also compare Explain plan in both cases -- In specific situations, it may not be required, in case there is no difference in join stratergy, redistribution etc




Enthusiast

Re: Collecting Stats

Fahad,

I have a few recomnedations:

  1. Try to collect stats always, as it helps the optimizer to choose best plans for query execution.
  2. As the column has only two possible values, you can use compression on the column and for 5 million rows it will save space and improve processing.
  3. as the column is being use in where clause frequantly but the values are not unique, you can crate a NUSI to improve access.
Khurram
Fan

Re: Collecting Stats

Statistics is highly recommended if frequently used in where clauses.

The neccessity of re-collecting the statistics is basically determined by 2 factors:

* Does the table size change significantly? (Teradata suggests +-10% record count change as a trigger)

* Does the Null / 'Alpha' ratio change significantly?

As this statistics will lay on low cardinality column its cost is not too high, you can collect regularly (eg. weekly). Concerning Full / Sample question I would use sample statistics, however in earlier versions (V12) I experienced strange results in case of low cardinality cases.

If you use management tool hopefull it will contain those heuristics that determines if statistics actually has to be refreshed or not.

Ákos