system-determined change threshold for Stats Collection

Database
Enthusiast

system-determined change threshold for Stats Collection

We are collecting stats on a table , however the stats collection is getting skipped.

This is confirmed from the StatsSkipCount flag in DBC.StatsV for the table.

On going for explain for the stats collect statement, we find something like:

3) We SKIP collecting STATISTICS for ('mth_id,col1,col2'),

     because the estimated data change of 0.90 % does not exceed the

     system-determined change threshold of 12.84 %.

  4) We SKIP collecting STATISTICS for ('mth_id,user_id,purchase_id '),

     because the estimated data change of 0.90 % does not exceed the

     system-determined change threshold of 5.08 %.

  5) We SKIP collecting STATISTICS for ('PARTITION,mth_id,user_grp '),

     because the estimated data change of 0.81 % does not exceed the

     system-determined change threshold of 12.83 %.

I believe the "system-determined change threshold" is defined in DBS control parameter.

Just wanted to know , how "system-determined change threshold" is different for different columns ?

2 REPLIES

Re: system-determined change threshold for Stats Collection

Hi,

I am facing the same issue.

Did you get an answer for it?

Enthusiast

Re: system-determined change threshold for Stats Collection

Hi Kishore/Chirag,

Cut from one of the Carrie's blog. Check this link for detailed Statistics Threshold Functionality on TD 14.10.

https://developer.teradata.com/blog/carrie/2014/02/statistics-threshold-functionality-101

All 14.10 systems have the system threshold functionality turned on by default.  But by itself, that is not enough.   USECOUNT logging for the database must also be enabled.   If USECOUNT DBQL logging is turned on, then each COLLECT STATISTICS statement will be evaluated to see if it will run or be skipped.    

During this evaluation, an appropriate change threshold for the statistic is established by the optimizer.  The degree of change to the table since the last collection is compared against the current state of the table, based on USECOUNT tracking of inserts, deletes and updates performed.   If the change threshold has not been reached, and enough history has been collected for this statistics (usually four of five full collections) so that the optimizer can perceive a pattern in the data such that extrapolations can be confidently performed, then this statistics collection will be skipped.  

Even if the percent of change threshold has not been reached (indicating that statistics can be skipped), if there are insufficient history records, the statistics will be recollected.  And even with 10 or 20 history records, if there is no regular pattern of change that the optimizer can rely on to make reasonable extrapolations, statistics will be recollected.

There is a DBS Control record parameter called SysChangeThresholdOption which the behavior of the system threshold functionality.  This parameter is set at zero by default.  Zero means that as long as USECOUNT logging in DBQL is enabled for the database that the table belongs to, then all statistics collection statements will undergo a percent of change threshold evaluation, as described above.

If you want to maintain the legacy behavior, threshold logic can be turned off completely at the system level by disabling the SysChangeThresholdOption setting in DBS Control (set it to 3).  This field, along with parameters to set DBA-defined global parameters, can be found in the new Optimizer Statistics Fields in DBS Control.  

It is important to re-emphasize that the DBQL USECOUNT logging must be enabled for all databases that you want to take advantage of the system threshold functionality.  In addition, all other lower-level threshold settings must remain off (as they are by default) in order for the system threshold to be in effect.

 

Thanks,

Dinesh