Collect Stats

Database
Enthusiast

Collect Stats

Hi all,

1: Once a collect stats is done on the table(on index or column) where is this information stored so that the optimizer can refer this?

2: How often collect stats has to be made for a table that is frequently updated?

3: Once a collect stats has been done on the table how can i be sure that the optimiser is considering this before execution ? i.e; untill the next collect stats has been done will the optimiser refer this?

4: How can i know the tables for which the collect stats has been done?

5: To what extent will there be performance issues when a collect stats is not done?Can a performance issue be related only due to collect stats? Probably a HOT AMP could be the reason for lack of spool space which is leading to performance degradation !!!

6: Also let me know what can lead to lack of spool space apart from HOT AMP !!!

Answers awaited!!!
Thanks,
abc

12 REPLIES
Enthusiast

Re: Collect Stats

Here is my answers, hope that helps:

1: Once a collect stats is done on the table(on index or column) where is this information stored so that the optimizer can refer this?

Ans: Collected statistics are stored in DBC.TVFields or DBC.Indexes. However, you cannot query these two tables.

2: How often collect stats has to be made for a table that is frequently updated?

Answer: You need to refresh stats when 5 to 10% of table's rows have changed. Collect stats could be pretty resource consuming for large tables. So it is always advisable to schedule the job at off peak period and normally after approximately 10% of data changes.

3: Once a collect stats has been done on the table how can i be sure that the optimiser is considering this before execution ? i.e; untill the next collect stats has been done will the optimiser refer this?

Ans: Yes, optimizer will use stats data for query execution plan if available. That's why stale stats is dangerous as that may mislead the optimizer.

4: How can i know the tables for which the collect stats has been done?

Ans: You run Help Stats command on that table. e.g HELP STATIISTICS TABLE_NAME ; this will give you Date and time when stats were last collected. You will also see stats for the columns ( for which stats were defined) for the table. You can use Teradata Manager too.

5: To what extent will there be performance issues when a collect stats is not done?Can a performance issue be related only due to collect stats? Probably a HOT AMP could be the reason for lack of spool space which is leading to performance degradation !!!

As: 1stpart: Teradata uses a cost based optimizer and cost estimates are done based on statistics. So if you donot have statistics collected then optimizer will use a Dynamic AMP Sampling method to get the stats. If your table is big and data was unevenly distributed then dynamic sampling may not get right information and your performance will suffer.

2nd Part: No, performance could be related to bad selection of indexes ( most importantly PI) and the access path of a particular query.

6: Also let me know what can lead to lack of spool space apart from HOT AMP !!!

Ans: One reason comes to my mind, a product join on two big data sets may lead to the lack of spool space.
Enthusiast

Re: Collect Stats

Thanks Somesh,
Things were clear.
can you please tell how to check the Collect stats using Teradata Manager???

Regards,
abc
Enthusiast

Re: Collect Stats

Also,
Let me know...
once a collect stats is done are all those records stored somewhere? i.e If a collect stats on a table is resulted with 10 uinique values...are these unique values stored somwhere?

What is stored in DBC.tvfields/dbc.indexes is just a count of the values right ? If all this is stored then its using up lots of space ....?

Thanks,
abc
Enthusiast

Re: Collect Stats

Hi,

Can a collect stats be done at the database level?
If so how....?
Which table will have an entry of the same?
Or atleast can a DBA do that?
Any previlage required?

Thanks,
abc
Enthusiast

Re: Collect Stats


I think, we can’t do a collect statistics on database level, It can only be done on indexs and columns in a table…

Regards,
Balamurugan
Enthusiast

Re: Collect Stats

See my answers... hope that helps

Question:
Also,
Let me know...
once a collect stats is done are all those records stored somewhere? i.e If a collect stats on a table is resulted with 10 unique values...are these unique values stored somwhere?

Answer: No. data is still in the base table. Collect stats just derives the data demographics of the table. It collects the information like: total row counts of the table, how many distinct values are there in the column, how many rows per value, is the column indexed, if so unique or non unique etc.

Question:
What is stored in DBC.tvfields/dbc.indexes is just a count of the values right ? If all this is stored then its using up lots of space ....?

Answer: Right. Space requirement is dependent on how many columns, indexes you have defined for collect stats. I donot think this space is a matter of concern.

Question:
Can a collect stats be done at the database level?
If so how....?
Which table will have an entry of the same?
Or atleast can a DBA do that?
Any previlage required?

Ans: No. You cannot collect stats at database level. See Balamurugan's answer. Forget at database level, for bigger tables you got to be careful about number of columns and number of Indexes you really want to collect stats on. Because of time and resources this task takes, people sometime resorts to collect stats on a smaller sample of table rows instead of the all the rows in the table.
Enthusiast

Re: Collect Stats

Hi all,

Thanks for your answers.

Is there any way by which we can estimate the time taken to do the collect stats ..

Is it that we can use the collect stats wizard to do the same?
Are both of them (collect stats sql statement and collect stats wizard) generating the same result? and consume the same time?

Thanks,
abc

Re: Collect Stats

Is there a way in Teradata to only collect stale statistics or stats that are out of date? In Oracle, using dbms_stats package you can tell the optimizer to only collect stale stats...
Teradata Employee

Re: Collect Stats

Hi,

The time can be estimated over data-volume, data-type of column and count of columns. But off-course they can't be very accurate.

I believe, there isn't any way to do 'collect stale stats' as yet, perhaps because of Teradata's DWH nature and not OLTP.

Regards,

MAC