When is the right time to refresh statistics? - Part II

Blog
The best minds from Teradata, our partners, and customers blog about whatever takes their fancy.

Continuing from Part I where we discussed the factors involved in determining when to refresh statistics, now we'll look at a process that I designed/created and have been using for a quite long time. This process may not fit into all environments but it will provide some ideas as to how to develop your own process. I call the process “Smart Stats.”

The Smart Stats process contains two stored procedures defined as follows:

  1. SP_SMART_STATS: Stored procedure to collect statistics on any table or join index
  2. SP_INDEX_STATS: Stored procedure to generate collect statistics statements at the index/column/multi-column level for any table or join index

The SP_SMART_STATS contains the following steps:

Let me provide more details of the SP_SMART_STATS steps. The SP_SMART_STATS logic was developed for a 32-bit system but I will provide a 32-bit and 64-bit versions. The 64-bit version needs to be tested because it has never been used before. The 32-bit version was used on V2R6.2 and TD12.

Disclaimer: This process uses the binary statistical field from DBC.IndexStats, DBC.ColumnStats, and  DBC.MultiColumnStats views and Teradata usually changes the disposition of the statistical information in the statitistical field from one release to another. This means that this process needs to be reviewed/changed after every Teradata software upgrade.

1. Global Temporary Table?

  • This is to be able to run the correct collect statistics statement for global temporary tables.
  • To determine if it is a global temporary table check if CommitOpt column is equal to ‘D’ or ‘P’ on DBC.TVM

(32-bit & 64-bit versions)

 SET WS_GT_FLAG = 'N';
LOCK ROW FOR ACCESS
SELECT 'Y'
INTO WS_GT_FLAG
FROM DBC.DBASE,
DBC.TVM
WHERE DBASE.DATABASEID = TVM.DATABASEID
AND DBASE.DATABASENAME = :dbname
AND TVM.TVMNAMEI = :tblname
AND TVM.TABLEKIND = 'T'
AND TVM.COMMITOPT in ('D', 'P');
IF SQLSTATE = '00000' AND WS_GT_FLAG = 'Y' THEN
LOCK ROW FOR ACCESS
SELECT 'COLLECT STATISTICS ON TEMPORARY '||TRIM(:DATABASENAME)||'.'||TRIM(:TABLENAME)||';'
INTO WS_STATS_TEXT
FROM DBC.TABLES
WHERE databasename = :dbname
AND tablename = :tblname;
END IF;

 2. Zero Stats?

  • This is to cover the case where a new table is created and statistics are defined but the statistics where never refreshed after the first load or it was partially refreshed.
  • To determine zero stats parse the statistic field (binary field) to extract the number of unique values as follows:
    • Parse the FieldStatistics column on DBC.ColumnStats to extract the NumValues of each column where FieldStatistics is not null.
    • Parse the ColumnsStatistics column on DBC.MultiColumnStats to extract the NumValues of each multi-column where ColumnsStatistics is not null.
    • Parse the IndexStatistics column on DBC.IndexStats to extract the NumValues of each index where IndexStatistics is not null.
    • if any NumValues is equal to zero, then collect statistics.

(32-bit version)

SET WS_COLSTS=0;
LOCK ROW FOR ACCESS
SELECT MIN(CASE WHEN SUBSTR(STATS, 49+Offset+0, 8) = '00'xb THEN 0 ELSE (-1**(NumValuesw1 / 32768)) * (2**((NumValuesw1/16 MOD 2048) - 1023)) * (1 + ((NumValuesw1 MOD 16) * 2**-4) + (NumValuesw2 * 2**-20) + (NumValuesw3 * 2**-36) + (NumValuesw4 * 2**-52)) END) (DECIMAL(18,0)) AS NumValues
INTO WS_COLSTS
FROM (SELECT CASE WHEN SUBSTR(STATS, 23, 1) = '00'XB THEN 16 ELSE 0 END + V3 AS Offset
,CASE WHEN StatsVersion = 3 THEN 34 ELSE 0 END AS V3
,HASHBUCKET (SUBSTR(STATS, 9+1, 1) || SUBSTR(STATS, 9+0, 1) (BYTE(4))) / TD12 AS StatsVersion
,((HASHBUCKET()+1)/65536) AS TD12
,HASHBUCKET(SUBSTR(STATS, 49+Offset+7, 1) || SUBSTR(STATS, 49+Offset+6, 1) (BYTE(4))) / TD12 AS NumValuesw1
,HASHBUCKET(SUBSTR(STATS, 49+Offset+5, 1) || SUBSTR(STATS, 49+Offset+4, 1) (BYTE(4))) / TD12 AS NumValuesw2
,HASHBUCKET(SUBSTR(STATS, 49+Offset+3, 1) || SUBSTR(STATS, 49+Offset+2, 1) (BYTE(4))) / TD12 AS NumValuesw3
,HASHBUCKET(SUBSTR(STATS, 49+Offset+1, 1) || SUBSTR(STATS, 49+Offset+0, 1) (BYTE(4))) / TD12 AS NumValuesw4
,SUBSTR(FIELDSTATISTICS, 1, 120) AS Stats
FROM DBC.COLUMNSTATS
WHERE DATABASENAME = :dbname
AND TABLENAME = :tblname
AND FIELDSTATISTICS IS NOT NULL ) S2;

SET WS_MCOLSTS=0;
LOCK ROW FOR ACCESS
SELECT MIN(CASE WHEN SUBSTR(STATS, 49+Offset+0, 8) = '00'xb THEN 0 ELSE (-1**(NumValuesw1 / 32768)) * (2**((NumValuesw1/16 MOD 2048) - 1023)) * (1 + ((NumValuesw1 MOD 16) * 2**-4) + (NumValuesw2 * 2**-20) + (NumValuesw3 * 2**-36) + (NumValuesw4 * 2**-52)) END) (DECIMAL(18,0)) AS NumValues
INTO WS_MCOLSTS
FROM (SELECT CASE WHEN SUBSTR(STATS, 23, 1) = '00'XB THEN 16 ELSE 0 END + V3 AS Offset
,CASE WHEN StatsVersion = 3 THEN 34 ELSE 0 END AS V3
,HASHBUCKET (SUBSTR(STATS, 9+1, 1) || SUBSTR(STATS, 9+0, 1) (BYTE(4))) / TD12 AS StatsVersion
,((HASHBUCKET()+1)/65536) AS TD12
,HASHBUCKET(SUBSTR(STATS, 49+Offset+7, 1) || SUBSTR(STATS, 49+Offset+6, 1) (BYTE(4))) / TD12 AS NumValuesw1
,HASHBUCKET(SUBSTR(STATS, 49+Offset+5, 1) || SUBSTR(STATS, 49+Offset+4, 1) (BYTE(4))) / TD12 AS NumValuesw2
,HASHBUCKET(SUBSTR(STATS, 49+Offset+3, 1) || SUBSTR(STATS, 49+Offset+2, 1) (BYTE(4))) / TD12 AS NumValuesw3
,HASHBUCKET(SUBSTR(STATS, 49+Offset+1, 1) || SUBSTR(STATS, 49+Offset+0, 1) (BYTE(4))) / TD12 AS NumValuesw4
,SUBSTR(ColumnsStatistics, 1, 120) AS Stats
FROM DBC.MultiColumnStats
WHERE DATABASENAME = :dbname
AND TABLENAME = :tblname
AND ColumnsStatistics is not null ) S2 ;

SET WS_IDXSTS=0;
LOCK ROW FOR ACCESS
SELECT MIN(CASE WHEN SUBSTR(STATS, 49+Offset+0, 8) = '00'xb THEN 0 ELSE (-1**(NumValuesw1 / 32768)) * (2**((NumValuesw1/16 MOD 2048) - 1023)) * (1 + ((NumValuesw1 MOD 16) * 2**-4) + (NumValuesw2 * 2**-20) + (NumValuesw3 * 2**-36) + (NumValuesw4 * 2**-52)) END) (DECIMAL(18,0)) AS NumValues
INTO WS_IDXSTS
FROM (SELECT CASE WHEN SUBSTR(STATS, 23, 1) = '00'XB THEN 16 ELSE 0 END + V3 AS Offset
,CASE WHEN StatsVersion = 3 THEN 34 ELSE 0 END AS V3
,HASHBUCKET (SUBSTR(STATS, 9+1, 1) || SUBSTR(STATS, 9+0, 1) (BYTE(4))) / TD12 AS StatsVersion
,((HASHBUCKET()+1)/65536) AS TD12
,HASHBUCKET(SUBSTR(STATS, 49+Offset+7, 1) || SUBSTR(STATS, 49+Offset+6, 1) (BYTE(4))) / TD12 AS NumValuesw1
,HASHBUCKET(SUBSTR(STATS, 49+Offset+5, 1) || SUBSTR(STATS, 49+Offset+4, 1) (BYTE(4))) / TD12 AS NumValuesw2
,HASHBUCKET(SUBSTR(STATS, 49+Offset+3, 1) || SUBSTR(STATS, 49+Offset+2, 1) (BYTE(4))) / TD12 AS NumValuesw3
,HASHBUCKET(SUBSTR(STATS, 49+Offset+1, 1) || SUBSTR(STATS, 49+Offset+0, 1) (BYTE(4))) / TD12 AS NumValuesw4
,SUBSTR(INDEXSTATISTICS, 1, 120) AS Stats
FROM DBC.INDEXSTATS
WHERE DATABASENAME = :dbname
AND TABLENAME = :tblname
AND INDEXSTATISTICS IS NOT NULL ) S2;

IF (WS_COLSTS = 0 AND WS_COLSTS IS NOT NULL) OR
(WS_MCOLSTS = 0 AND WS_MCOLSTS IS NOT NULL) OR
(WS_IDXSTS = 0 AND WS_IDXSTS IS NOT NULL) THEN
CALL DW.SP_INDEX_STATS (:DATABASENAME, :TABLENAME,:Sqlmsg, :sqlerrcde, :sqlstte) ;

(64-bit version)

SET WS_COLSTS=0;
LOCK ROW FOR ACCESS
SELECT MIN(CASE WHEN SUBSTR(STATS, 49+Offset+0, 8) = '00'xb THEN 0 ELSE (-1**(NumValuesw1 / 32768)) * (2**((NumValuesw1/16 MOD 2048) - 1023)) * (1 + ((NumValuesw1 MOD 16) * 2**-4) + (NumValuesw2 * 2**-20) + (NumValuesw3 * 2**-36) + (NumValuesw4 * 2**-52)) END) (DECIMAL(18,0)) AS NumValues
INTO WS_COLSTS
FROM (SELECT CASE WHEN SUBSTR(STATS, 27, 1) = '00'XB THEN 16 ELSE 0 END + V3 AS Offset
,CASE WHEN StatsVersion = 3 THEN 34+14 ELSE 0 END AS V3
,HASHBUCKET (SUBSTR(STATS, 9+1, 1) || SUBSTR(STATS, 9+0, 1) (BYTE(4))) / TD12 AS StatsVersion
,((HASHBUCKET()+1)/65536) AS TD12
,HASHBUCKET(SUBSTR(STATS, 49+Offset+7, 1) || SUBSTR(STATS, 49+Offset+6, 1) (BYTE(4))) / TD12 AS NumValuesw1
,HASHBUCKET(SUBSTR(STATS, 49+Offset+5, 1) || SUBSTR(STATS, 49+Offset+4, 1) (BYTE(4))) / TD12 AS NumValuesw2
,HASHBUCKET(SUBSTR(STATS, 49+Offset+3, 1) || SUBSTR(STATS, 49+Offset+2, 1) (BYTE(4))) / TD12 AS NumValuesw3
,HASHBUCKET(SUBSTR(STATS, 49+Offset+1, 1) || SUBSTR(STATS, 49+Offset+0, 1) (BYTE(4))) / TD12 AS NumValuesw4
,SUBSTR(FIELDSTATISTICS, 1, 130) AS Stats
FROM DBC.COLUMNSTATS
WHERE DATABASENAME = :dbname
AND TABLENAME = :tblname
AND FIELDSTATISTICS IS NOT NULL ) S2;

SET WS_MCOLSTS=0;
LOCK ROW FOR ACCESS
SELECT MIN(CASE WHEN SUBSTR(STATS, 49+Offset+0, 8) = '00'xb THEN 0 ELSE (-1**(NumValuesw1 / 32768)) * (2**((NumValuesw1/16 MOD 2048) - 1023)) * (1 + ((NumValuesw1 MOD 16) * 2**-4) + (NumValuesw2 * 2**-20) + (NumValuesw3 * 2**-36) + (NumValuesw4 * 2**-52)) END) (DECIMAL(18,0)) AS NumValues
INTO WS_MCOLSTS
FROM (SELECT CASE WHEN SUBSTR(STATS, 27, 1) = '00'XB THEN 16 ELSE 0 END + V3 AS Offset
,CASE WHEN StatsVersion = 3 THEN 34+14 ELSE 0 END AS V3
,HASHBUCKET (SUBSTR(STATS, 9+1, 1) || SUBSTR(STATS, 9+0, 1) (BYTE(4))) / TD12 AS StatsVersion
,((HASHBUCKET()+1)/65536) AS TD12
,HASHBUCKET(SUBSTR(STATS, 49+Offset+7, 1) || SUBSTR(STATS, 49+Offset+6, 1) (BYTE(4))) / TD12 AS NumValuesw1
,HASHBUCKET(SUBSTR(STATS, 49+Offset+5, 1) || SUBSTR(STATS, 49+Offset+4, 1) (BYTE(4))) / TD12 AS NumValuesw2
,HASHBUCKET(SUBSTR(STATS, 49+Offset+3, 1) || SUBSTR(STATS, 49+Offset+2, 1) (BYTE(4))) / TD12 AS NumValuesw3
,HASHBUCKET(SUBSTR(STATS, 49+Offset+1, 1) || SUBSTR(STATS, 49+Offset+0, 1) (BYTE(4))) / TD12 AS NumValuesw4
,SUBSTR(ColumnsStatistics, 1, 130) AS Stats
FROM DBC.MultiColumnStats
WHERE DATABASENAME = :dbname
AND TABLENAME = :tblname
AND ColumnsStatistics is not null ) S2 ;

SET WS_IDXSTS=0;
LOCK ROW FOR ACCESS
SELECT MIN(CASE WHEN SUBSTR(STATS, 49+Offset+0, 8) = '00'xb THEN 0 ELSE (-1**(NumValuesw1 / 32768)) * (2**((NumValuesw1/16 MOD 2048) - 1023)) * (1 + ((NumValuesw1 MOD 16) * 2**-4) + (NumValuesw2 * 2**-20) + (NumValuesw3 * 2**-36) + (NumValuesw4 * 2**-52)) END) (DECIMAL(18,0)) AS NumValues
INTO WS_IDXSTS
FROM (SELECT CASE WHEN SUBSTR(STATS, 27, 1) = '00'XB THEN 16 ELSE 0 END + V3 AS Offset
,CASE WHEN StatsVersion = 3 THEN 34+14 ELSE 0 END AS V3
,HASHBUCKET (SUBSTR(STATS, 9+1, 1) || SUBSTR(STATS, 9+0, 1) (BYTE(4))) / TD12 AS StatsVersion
,((HASHBUCKET()+1)/65536) AS TD12
,HASHBUCKET(SUBSTR(STATS, 49+Offset+7, 1) || SUBSTR(STATS, 49+Offset+6, 1) (BYTE(4))) / TD12 AS NumValuesw1
,HASHBUCKET(SUBSTR(STATS, 49+Offset+5, 1) || SUBSTR(STATS, 49+Offset+4, 1) (BYTE(4))) / TD12 AS NumValuesw2
,HASHBUCKET(SUBSTR(STATS, 49+Offset+3, 1) || SUBSTR(STATS, 49+Offset+2, 1) (BYTE(4))) / TD12 AS NumValuesw3
,HASHBUCKET(SUBSTR(STATS, 49+Offset+1, 1) || SUBSTR(STATS, 49+Offset+0, 1) (BYTE(4))) / TD12 AS NumValuesw4
,SUBSTR(INDEXSTATISTICS, 1, 130) AS Stats
FROM DBC.INDEXSTATS
WHERE DATABASENAME = :dbname
AND TABLENAME = :tblname
AND INDEXSTATISTICS IS NOT NULL ) S2;

IF (WS_COLSTS = 0 AND WS_COLSTS IS NOT NULL) OR
(WS_MCOLSTS = 0 AND WS_MCOLSTS IS NOT NULL) OR
(WS_IDXSTS = 0 AND WS_IDXSTS IS NOT NULL) THEN
CALL DW.SP_INDEX_STATS (:DATABASENAME, :TABLENAME,:Sqlmsg, :sqlerrcde, :sqlstte) ;

3. Stats 90 Days Older?

  • This case is to cover rolling tables usually very large PPI tables where new partitions are added while old partitions are deleted and because of that it never triggers a refresh of the statistics based on 10% change in size. This case also covers almost static tables.
  • To determine if stats are 90 days older parse the statistic field (binary field) to extract the collect date as follows:
    • Parse the FieldStatistics column on DBC.ColumnStats to extract the CollectDate of each column where FieldStatistics is not null.
    • Parse the ColumnsStatistics column on DBC.MultiColumnStats to extract the CollectDate of each multi-column where ColumnsStatistics is not null.
    • Parse the IndexStatistics column on DBC.IndexStats to extract the CollectDate of each index where IndexStatistics is not null.
    • if the minimum CollectDate is less than Date – 90, then collect statistics.

(32-bit & 64-bit versions)

LOCK ROW FOR ACCESS
SELECT MIN(( (HASHBUCKET(SUBSTR(FIELDSTATISTICS, 2, 1) || SUBSTR(FIELDSTATISTICS, 1, 1) (BYTE(4)) )
/ ( (HASHBUCKET()+1)/65536) - 1900 ) * 10000
+ (HASHBUCKET('00'XB || SUBSTR(FIELDSTATISTICS, 3, 1) (BYTE(4)) ) )
/ ( (HASHBUCKET()+1)/65536) * 100
+ (HASHBUCKET('00'XB || SUBSTR(FIELDSTATISTICS, 4, 1) (BYTE(4)) ) )
/ ( (HASHBUCKET()+1)/65536) )) (DATE, FORMAT 'YYYY-MM-DD')
INTO WS_STATS_COLLECT_DATE
FROM (SELECT FIELDSTATISTICS
FROM DBC.COLUMNSTATS
WHERE FIELDSTATISTICS IS NOT NULL
AND DATABASENAME = :dbname
AND TABLENAME = :tblname
UNION
SELECT INDEXSTATISTICS
FROM DBC.INDEXSTATS
WHERE INDEXSTATISTICS IS NOT NULL
AND DATABASENAME = :dbname
AND TABLENAME = :tblname
UNION
SELECT COLUMNSSTATISTICS
FROM DBC.MULTICOLUMNSTATS
WHERE COLUMNSSTATISTICS IS NOT NULL
AND DATABASENAME = :dbname
AND TABLENAME = :tblname ) S1 ;

IF WS_STATS_COLLECT_DATE < Date - 90 THEN
CALL DW.SP_INDEX_STATS (:DATABASENAME, :TABLENAME,:Sqlmsg, :sqlerrcde, :sqlstte) ;

4. ETL Working Table?

  • This case is to cover staging tables and working tables associated with the ELT/ETL processes.
  • A TableSize History (PPI) table will need to be created and a daily process will need to load the data from DBC.TableSize into this historical table.
    • If there is no entry on TableSize_Hist table on the date that the statistics was last collected for the specific staging or working table, then collect statistics.
    • If there is a 10% change in size between CurrentPerm of the specific staging or working table from DBC.TableSize and PeakPerm from TableSize_Hist, then collect statistics.
    • Please notice that the logic is considering an increase or a decrease of 10% in size and it is checking the current perm space of DBC.TableSize  against the peak perm space of TableSize_Hist because the staging or working tables are usually empty when the process loads the information into TableSize_Hist table.

(32-bit & 64-bit versions)

IF dbname = 'WORKDB' THEN
SET WS_FLAG = 0;
LOCK ROW FOR ACCESS
SELECT 1
INTO WS_FLAG
FROM DBA_DB.TABLESIZE_HIST
WHERE DATABASENAME = :dbname
AND TABLENAME = :tblname
AND COLLECTDATE = :WS_STATS_COLLECT_DATE
END IF;
IF SQLSTATE = '02000' AND WS_FLAG = 0 THEN
CALL DW.SP_INDEX_STATS (:DATABASENAME, :TABLENAME,:Sqlmsg, :sqlerrcde, :sqlstte) ;
.......

IF dbname = 'WORKDB' THEN
SET WS_FLAG = 0;
LOCK ROW FOR ACCESS
SELECT 1
INTO WS_FLAG
FROM
(SELECT sum(currentperm) as CURRENTSPACE
FROM DBC.TABLESIZE
WHERE DATABASENAME = :dbname
AND TABLENAME = :tblname
) C,
(SELECT Peakperm as STATSSPACE
FROM DBA_DB.TABLESIZE_HIST
WHERE DATABASENAME = :dbname
AND TABLENAME = :tblname
AND COLLECTDATE = :ws_stats_collect_date
) P
WHERE ABS(CURRENTSPACE - STATSSPACE) >= STATSSPACE * 0.1 ;
END IF;
IF SQLSTATE = '00000' AND WS_FLAG = 1 THEN
CALL DW.SP_INDEX_STATS (:DATABASENAME, :TABLENAME,:Sqlmsg, :sqlerrcde, :sqlstte) ;

5. 10% Change?

  • This case is to cover any table or join index that had a 10% size change.
  • A TableSize History (PPI) table will need to be created and a daily process will need to load the data from DBC.TableSize into this historical table.
    • If there is a 10% change in size between CurrentPerm from DBC.TableSize and CurrentPerm from TableSize_Hist of the specific table or join index, then collect statistics.
    • Please notice that the logic is considering an increase or a decrease of 10% in size.

(32-bit & 64-bit versions)

LOCK ROW FOR ACCESS
SELECT 1
INTO WS_FLAG
FROM (SELECT sum(currentperm) as CURRENTSPACE
FROM DBC.TABLESIZE
WHERE DATABASENAME = :dbname
AND TABLENAME = :tblname
) C,
(SELECT Currentperm as STATSSPACE
FROM DBA_DB.TABLESIZE_HIST
WHERE DATABASENAME = :dbname
AND TABLENAME = :tblname
AND COLLECTDATE = :ws_stats_collect_date
) P
WHERE ABS(CURRENTSPACE - STATSSPACE) >= STATSSPACE * 0.1 ;
IF SQLSTATE = '00000' AND WS_FLAG = 1 THEN
CALL DW.SP_INDEX_STATS (:DATABASENAME, :TABLENAME,:Sqlmsg, :sqlerrcde, :sqlstte) ;

The SP_INDEX_STATS contains the following steps:

 

Let me provide more details on the SP_INDEX_STATS steps. The SP_INDEX_STATS logic was developed for a 32-bit system but I will provide a 32-bit and 64-bit versions. The 64-bit version needs to be tested because it was never used before. The 32-bit version was used on V2R6.2 and TD12.

  1. Generate Collect Statistics Statements from DBC.IndexStats view:
  • This view contains multi-column index statistics.
  • One row per index column for a multi-column index statistic. Therefore, the process needs to loop to concatenate all the columns of an index before generating the collect statistics statement.
  • Row with ColumnPosition = 1 has the IndexStatistics populated with the statistical information associated with an index.

     
  1. Generate Collect Statistics Statements from DBC.MultiColumnStats view:
  • This view contains multi-column and column PARTITION statistics.
  • One row per column for a multi-column statistic and one row for column PARTITION statistic. Therefore, the process needs to loop to concatenate all the columns of the multi-column statistic before generating the collect statistics statement.
  • Row with ColumnPosition = 1 has the ColumnsStatistics populated with the statistical information associated with a multi-column or a column PARTITION statistics.
  • The code to generate collect statsistics statements from DBC.IndexStats and DBC.MultiColumnStats is the same except for the column name that contains the statistical information IndexStatistics and ColumnsStatistics.  Therefore, just one code example will be presented.
  • Please notice the code is extracting collect date, collect time and unique values from the IndexStatistics and ColumnsStatistics fields.

(32-bit version)

LOCK ROW FOR ACCESS
SELECT 'COLLECT STATISTICS ON '||TRIM(DATABASENAME)||'.'||TRIM(TABLENAME)||' INDEX ( '||:colstat||');',
(HASHBUCKET(SUBSTR(STATS, 2, 1) || SUBSTR(STATS, 1, 1) (BYTE(4)) ) / TD12 - 1900 ) * 10000
+ (HASHBUCKET('00'XB || SUBSTR(STATS, 3, 1) (BYTE(4)) ) ) / TD12 * 100
+ (HASHBUCKET('00'XB || SUBSTR(STATS, 4, 1) (BYTE(4)) ) ) / TD12 (DATE, FORMAT 'YYYY-MM-DD') AS CollectDate,
(HASHBUCKET(CAST('00'xb || SUBSTR(STATS, 5, 1) AS BYTE(4))) / TD12 (FORMAT '99:') ) ||
(HASHBUCKET(CAST('00'xb || SUBSTR(STATS, 6, 1) AS BYTE(4))) / TD12 (FORMAT '99:') ) ||
(HASHBUCKET(CAST('00'xb || SUBSTR(STATS, 7, 1) AS BYTE(4))) / TD12 (FORMAT '99.') ) ||
(HASHBUCKET(CAST('00'xb || SUBSTR(STATS, 8, 1) AS BYTE(4))) / TD12 (FORMAT '99')) (TIME(2), FORMAT 'hh:mi:ss.s(2)') AS CollectTime,
CASE WHEN SUBSTR(STATS, 49+Offset+0, 8) = '00'xb THEN 0
ELSE (-1**(NumValuesw1 / 32768))
* (2**((NumValuesw1/16 MOD 2048) - 1023))
* (1 + ((NumValuesw1 MOD 16) * 2**-4)
+ (NumValuesw2 * 2**-20) + (NumValuesw3 * 2**-36) + (NumValuesw4 * 2**-52))
END (DECIMAL(18,0)) AS NumValues,
INTO statmnt, collectdt, collecttm, numvalues
FROM
(SELECT DATABASENAME, TABLENAME
,CASE WHEN SUBSTR(STATS, 23, 1) = '00'XB THEN 16 ELSE 0 END + V3 AS Offset
,CASE WHEN StatsVersion = 3 THEN 34 ELSE 0 END AS V3
,HASHBUCKET (SUBSTR(STATS, 9+1, 1) || SUBSTR(STATS, 9+0, 1) (BYTE(4))) / TD12 AS StatsVersion
,((HASHBUCKET()+1)/65536) AS TD12
,HASHBUCKET(SUBSTR(STATS, 49+Offset+7, 1) || SUBSTR(STATS, 49+Offset+6, 1) (BYTE(4))) / TD12 AS NumValuesw1
,HASHBUCKET(SUBSTR(STATS, 49+Offset+5, 1) || SUBSTR(STATS, 49+Offset+4, 1) (BYTE(4))) / TD12 AS NumValuesw2
,HASHBUCKET(SUBSTR(STATS, 49+Offset+3, 1) || SUBSTR(STATS, 49+Offset+2, 1) (BYTE(4))) / TD12 AS NumValuesw3
,HASHBUCKET(SUBSTR(STATS, 49+Offset+1, 1) || SUBSTR(STATS, 49+Offset+0, 1) (BYTE(4))) / TD12 AS NumValuesw4
,SUBSTR(INDEXSTATISTICS, 1, 120) AS Stats
FROM DBC.IndexStats
WHERE DatabaseName = :dbname
AND TableName = :tblname
AND IndexType IN ( 'S' , 'V' )
AND IndexNumber = indxnum
AND ColumnPosition = 1
AND IndexStatistics IS NOT NULL ) S2
GROUP BY 1,2,3,4,5;
IF SQLSTATE = '00000' THEN
INSERT INTO DW.GT_INDEX_STATS
VALUES ( :dbname, :tblname, :indxnum, 'CS', UPPER(:statmnt), :collectdt, :collecttm, :numvalues);
END IF;

(64-bit version)

LOCK ROW FOR ACCESS
SELECT 'COLLECT STATISTICS ON '||TRIM(DATABASENAME)||'.'||TRIM(TABLENAME)||' INDEX ( '||:colstat||');',
(HASHBUCKET(SUBSTR(STATS, 2, 1) || SUBSTR(STATS, 1, 1) (BYTE(4)) ) / TD12 - 1900 ) * 10000
+ (HASHBUCKET('00'XB || SUBSTR(STATS, 3, 1) (BYTE(4)) ) ) / TD12 * 100
+ (HASHBUCKET('00'XB || SUBSTR(STATS, 4, 1) (BYTE(4)) ) ) / TD12 (DATE, FORMAT 'YYYY-MM-DD') AS CollectDate,
(HASHBUCKET(CAST('00'xb || SUBSTR(STATS, 5, 1) AS BYTE(4))) / TD12 (FORMAT '99:') ) ||
(HASHBUCKET(CAST('00'xb || SUBSTR(STATS, 6, 1) AS BYTE(4))) / TD12 (FORMAT '99:') ) ||
(HASHBUCKET(CAST('00'xb || SUBSTR(STATS, 7, 1) AS BYTE(4))) / TD12 (FORMAT '99.') ) ||
(HASHBUCKET(CAST('00'xb || SUBSTR(STATS, 8, 1) AS BYTE(4))) / TD12 (FORMAT '99')) (TIME(2), FORMAT 'hh:mi:ss.s(2)') AS CollectTime,
CASE WHEN SUBSTR(STATS, 49+Offset+0, 8) = '00'xb THEN 0
ELSE (-1**(NumValuesw1 / 32768))
* (2**((NumValuesw1/16 MOD 2048) - 1023))
* (1 + ((NumValuesw1 MOD 16) * 2**-4)
+ (NumValuesw2 * 2**-20) + (NumValuesw3 * 2**-36) + (NumValuesw4 * 2**-52))
END (DECIMAL(18,0)) AS NumValues,
INTO statmnt, collectdt, collecttm, numvalues
FROM
(SELECT DATABASENAME, TABLENAME
,CASE WHEN SUBSTR(STATS, 27, 1) = '00'XB THEN 16 ELSE 0 END + V3 AS Offset
,CASE WHEN StatsVersion = 3 THEN 34+14 ELSE 0 END AS V3
,HASHBUCKET (SUBSTR(STATS, 9+1, 1) || SUBSTR(STATS, 9+0, 1) (BYTE(4))) / TD12 AS StatsVersion
,((HASHBUCKET()+1)/65536) AS TD12
,HASHBUCKET(SUBSTR(STATS, 49+Offset+7, 1) || SUBSTR(STATS, 49+Offset+6, 1) (BYTE(4))) / TD12 AS NumValuesw1
,HASHBUCKET(SUBSTR(STATS, 49+Offset+5, 1) || SUBSTR(STATS, 49+Offset+4, 1) (BYTE(4))) / TD12 AS NumValuesw2
,HASHBUCKET(SUBSTR(STATS, 49+Offset+3, 1) || SUBSTR(STATS, 49+Offset+2, 1) (BYTE(4))) / TD12 AS NumValuesw3
,HASHBUCKET(SUBSTR(STATS, 49+Offset+1, 1) || SUBSTR(STATS, 49+Offset+0, 1) (BYTE(4))) / TD12 AS NumValuesw4
,SUBSTR(INDEXSTATISTICS, 1, 130) AS Stats
FROM DBC.IndexStats
WHERE DatabaseName = :dbname
AND TableName = :tblname
AND IndexType IN ( 'S' , 'V' )
AND IndexNumber = indxnum
AND ColumnPosition = 1
AND IndexStatistics IS NOT NULL ) S2
GROUP BY 1,2,3,4,5;
IF SQLSTATE = '00000' THEN
INSERT INTO DW.GT_INDEX_STATS
VALUES ( :dbname, :tblname, :indxnum, 'CS', UPPER(:statmnt), :collectdt, :collecttm, :numvalues);
END IF;

  1. Generate Collect Statistics Statements from DBC.ColumnStats view:
  • This view contains single column and single column index statistics.
  • FieldStatistics is not null when the statistics on the column or index is collected.
  • Please notice the code is extracting collect date, collect time and unique values from the FieldStatistics field.

(32-bit version)

LOCK TABLE DBC.ColumnStats FOR ACCESS
INSERT INTO DW.GT_INDEX_STATS
SELECT TRIM(DATABASENAME),TRIM(TABLENAME),0,'CS'
,'COLLECT STATISTICS ON '||UPPER(TRIM(DATABASENAME))||'.'||UPPER(TRIM(TABLENAME))||' COLUMN '||UPPER(TRIM(COLUMNNAME))||';'
, (HASHBUCKET(SUBSTR(STATS, 2, 1) || SUBSTR(STATS, 1, 1) (BYTE(4)) ) / TD12 - 1900 ) * 10000
+ (HASHBUCKET('00'XB || SUBSTR(STATS, 3, 1) (BYTE(4)) ) ) / TD12 * 100
+ (HASHBUCKET('00'XB || SUBSTR(STATS, 4, 1) (BYTE(4)) ) ) / TD12 (DATE, FORMAT 'YYYY-MM-DD') AS CollectDate
,(HASHBUCKET(CAST('00'xb || SUBSTR(STATS, 5, 1) AS BYTE(4))) / TD12 (FORMAT '99:') ) ||
(HASHBUCKET(CAST('00'xb || SUBSTR(STATS, 6, 1) AS BYTE(4))) / TD12 (FORMAT '99:') ) ||
(HASHBUCKET(CAST('00'xb || SUBSTR(STATS, 7, 1) AS BYTE(4))) / TD12 (FORMAT '99.') ) ||
(HASHBUCKET(CAST('00'xb || SUBSTR(STATS, 8, 1) AS BYTE(4))) / TD12 (FORMAT '99')) (TIME(2), FORMAT 'hh:mi:ss.s(2)') AS CollectTime
,CASE WHEN SUBSTR(STATS, 49+Offset+0, 8) = '00'xb THEN 0
ELSE (-1**(NumValuesw1 / 32768))
* (2**((NumValuesw1/16 MOD 2048) - 1023))
* (1 + ((NumValuesw1 MOD 16) * 2**-4)
+ (NumValuesw2 * 2**-20) + (NumValuesw3 * 2**-36) + (NumValuesw4 * 2**-52))
END (DECIMAL(18,0)) AS NumValues
FROM
(SELECT DATABASENAME,TABLENAME,COLUMNNAME
,CASE WHEN SUBSTR(STATS, 23, 1) = '00'XB THEN 16 ELSE 0 END + V3 AS Offset
,CASE WHEN StatsVersion = 3 THEN 34 ELSE 0 END AS V3
,HASHBUCKET (SUBSTR(STATS, 9+1, 1) || SUBSTR(STATS, 9+0, 1) (BYTE(4))) / TD12 AS StatsVersion
,((HASHBUCKET()+1)/65536) AS TD12
,HASHBUCKET(SUBSTR(STATS, 49+Offset+7, 1) || SUBSTR(STATS, 49+Offset+6, 1) (BYTE(4))) / TD12 AS NumValuesw1
,HASHBUCKET(SUBSTR(STATS, 49+Offset+5, 1) || SUBSTR(STATS, 49+Offset+4, 1) (BYTE(4))) / TD12 AS NumValuesw2
,HASHBUCKET(SUBSTR(STATS, 49+Offset+3, 1) || SUBSTR(STATS, 49+Offset+2, 1) (BYTE(4))) / TD12 AS NumValuesw3
,HASHBUCKET(SUBSTR(STATS, 49+Offset+1, 1) || SUBSTR(STATS, 49+Offset+0, 1) (BYTE(4))) / TD12 AS NumValuesw4
,SUBSTR(FieldStatistics, 1, 120) AS Stats
FROM DBC.ColumnStats
WHERE DatabaseName = :dbname
AND TableName = :tblname
AND FieldStatistics IS NOT NULL ) S2
GROUP BY 1,2,3,4,5,6,7,8 ;

(64-bit version)

LOCK TABLE DBC.ColumnStats FOR ACCESS
INSERT INTO DW.GT_INDEX_STATS
SELECT TRIM(DATABASENAME),TRIM(TABLENAME),0,'CS'
,'COLLECT STATISTICS ON '||UPPER(TRIM(DATABASENAME))||'.'||UPPER(TRIM(TABLENAME))||' COLUMN '||UPPER(TRIM(COLUMNNAME))||';'
, (HASHBUCKET(SUBSTR(STATS, 2, 1) || SUBSTR(STATS, 1, 1) (BYTE(4)) ) / TD12 - 1900 ) * 10000
+ (HASHBUCKET('00'XB || SUBSTR(STATS, 3, 1) (BYTE(4)) ) ) / TD12 * 100
+ (HASHBUCKET('00'XB || SUBSTR(STATS, 4, 1) (BYTE(4)) ) ) / TD12 (DATE, FORMAT 'YYYY-MM-DD') AS CollectDate
,(HASHBUCKET(CAST('00'xb || SUBSTR(STATS, 5, 1) AS BYTE(4))) / TD12 (FORMAT '99:') ) ||
(HASHBUCKET(CAST('00'xb || SUBSTR(STATS, 6, 1) AS BYTE(4))) / TD12 (FORMAT '99:') ) ||
(HASHBUCKET(CAST('00'xb || SUBSTR(STATS, 7, 1) AS BYTE(4))) / TD12 (FORMAT '99.') ) ||
(HASHBUCKET(CAST('00'xb || SUBSTR(STATS, 8, 1) AS BYTE(4))) / TD12 (FORMAT '99')) (TIME(2), FORMAT 'hh:mi:ss.s(2)') AS CollectTime
,CASE WHEN SUBSTR(STATS, 49+Offset+0, 8) = '00'xb THEN 0
ELSE (-1**(NumValuesw1 / 32768))
* (2**((NumValuesw1/16 MOD 2048) - 1023))
* (1 + ((NumValuesw1 MOD 16) * 2**-4)
+ (NumValuesw2 * 2**-20) + (NumValuesw3 * 2**-36) + (NumValuesw4 * 2**-52))
END (DECIMAL(18,0)) AS NumValues
FROM
(SELECT DATABASENAME,TABLENAME,COLUMNNAME
,CASE WHEN SUBSTR(STATS, 27, 1) = '00'XB THEN 16 ELSE 0 END + V3 AS Offset
,CASE WHEN StatsVersion = 3 THEN 34+14 ELSE 0 END AS V3
,HASHBUCKET (SUBSTR(STATS, 9+1, 1) || SUBSTR(STATS, 9+0, 1) (BYTE(4))) / TD12 AS StatsVersion
,((HASHBUCKET()+1)/65536) AS TD12
,HASHBUCKET(SUBSTR(STATS, 49+Offset+7, 1) || SUBSTR(STATS, 49+Offset+6, 1) (BYTE(4))) / TD12 AS NumValuesw1
,HASHBUCKET(SUBSTR(STATS, 49+Offset+5, 1) || SUBSTR(STATS, 49+Offset+4, 1) (BYTE(4))) / TD12 AS NumValuesw2
,HASHBUCKET(SUBSTR(STATS, 49+Offset+3, 1) || SUBSTR(STATS, 49+Offset+2, 1) (BYTE(4))) / TD12 AS NumValuesw3
,HASHBUCKET(SUBSTR(STATS, 49+Offset+1, 1) || SUBSTR(STATS, 49+Offset+0, 1) (BYTE(4))) / TD12 AS NumValuesw4
,SUBSTR(FieldStatistics, 1, 130) AS Stats
FROM DBC.ColumnStats
WHERE DatabaseName = :dbname
AND TableName = :tblname
AND FieldStatistics IS NOT NULL ) S2
GROUP BY 1,2,3,4,5,6,7,8 ;

  1. The final step is to run the collect statistics statements for all columns, multi-columns and indexes.

Finally...

Before I finish, let’s talk about a special statistics collection process to cover DBC tables, PPI tables and and statistics associated with critical load processes and reports.

  1. Collect Statistics on DBC tables. The DBC tables do not have any statistics defined and here are some statistics collection recommendations based on best practices.

  2. COLLECT STATISTICS ON DBC.ACCESSRIGHTS COLUMN UserId;
    COLLECT STATISTICS ON DBC.ACCESSRIGHTS COLUMN DatabaseId;
    COLLECT STATISTICS ON DBC.ACCESSRIGHTS COLUMN TVMId;
    COLLECT STATISTICS ON DBC.ACCESSRIGHTS COLUMN FieldId;
    COLLECT STATISTICS ON DBC.ACCESSRIGHTS COLUMN GrantorID;
    COLLECT STATISTICS ON DBC.ACCESSRIGHTS COLUMN AccessRight;
    COLLECT STATISTICS ON DBC.ACCESSRIGHTS COLUMN WithGrant;
    COLLECT STATISTICS ON DBC.ACCESSRIGHTS INDEX (UserId,DatabaseId);

    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN UserID;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN DatabaseID;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN TVMId;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN ColumnId;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrCheckPoint;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrCreateDataBase;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrCreateMacro;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrCreateTable;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrCreateUser;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrCreateView;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrDelete;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrDropDatabase;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrDropMacro;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrDropTable;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrDropUser;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrDropView;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrDump;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrExecute;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrGrant;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrInsert;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrIndex;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrReference;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrRestore;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrSelect;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrUpdate;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrCreateTrigger;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrDropTrigger;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrCreateProcedure;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrDropProcedure;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrExecuteProcedure;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrCreateRole;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrDropRole;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrCreateProfile;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrDropProfile;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrCreateFunction;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrDropFunction;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrAlterFunction;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrExecuteFunction;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrAlterProcedure;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl COLUMN AcrRepControl;
    COLLECT STATISTICS ON DBC.AccLogRuleTbl INDEX (UserID,DatabaseID,TVMId,ColumnId);

    COLLECT STATISTICS ON DBC.AccLogTbl COLUMN LogonDate;
    COLLECT STATISTICS ON DBC.AccLogTbl COLUMN LogonTime;
    COLLECT STATISTICS ON DBC.AccLogTbl COLUMN SessionNo;
    COLLECT STATISTICS ON DBC.AccLogTbl COLUMN UserId;
    COLLECT STATISTICS ON DBC.AccLogTbl COLUMN UserName;
    COLLECT STATISTICS ON DBC.AccLogTbl COLUMN AccountName;
    COLLECT STATISTICS ON DBC.AccLogTbl COLUMN AccessType;
    COLLECT STATISTICS ON DBC.AccLogTbl COLUMN OwnerName;
    COLLECT STATISTICS ON DBC.AccLogTbl COLUMN DatabaseName;
    COLLECT STATISTICS ON DBC.AccLogTbl COLUMN TVMName;
    COLLECT STATISTICS ON DBC.AccLogTbl COLUMN ColumnName;
    COLLECT STATISTICS ON DBC.AccLogTbl COLUMN ObjectId;
    COLLECT STATISTICS ON DBC.AccLogTbl COLUMN ColumnId;
    COLLECT STATISTICS ON DBC.AccLogTbl INDEX (LogonDate,LogonTime);

    COLLECT STATISTICS ON DBC.Accounts COLUMN UserId;
    COLLECT STATISTICS ON DBC.Accounts COLUMN AccountName;
    COLLECT STATISTICS ON DBC.Accounts COLUMN RowType;

    COLLECT STATISTICS ON DBC.DBASE COLUMN DATABASENAMEI;
    COLLECT STATISTICS ON DBC.DBASE COLUMN DATABASEID;
    COLLECT STATISTICS ON DBC.DBASE COLUMN DATABASENAME;
    COLLECT STATISTICS ON DBC.DBASE COLUMN CommentString;
    COLLECT STATISTICS ON DBC.DBASE COLUMN AccountName;
    COLLECT STATISTICS ON DBC.DBASE COLUMN CreatorName;
    COLLECT STATISTICS ON DBC.DBASE COLUMN RoleName;
    COLLECT STATISTICS ON DBC.DBASE COLUMN ProfileName;
    COLLECT STATISTICS ON DBC.DBASE COLUMN OwnerName;
    COLLECT STATISTICS ON DBC.DBASE COLUMN RowType;

    COLLECT STATISTICS ON DBC.INDEXES COLUMN TABLEID;
    COLLECT STATISTICS ON DBC.INDEXES COLUMN IndexType;
    COLLECT STATISTICS ON DBC.INDEXES COLUMN IndexNumber;
    COLLECT STATISTICS ON DBC.INDEXES COLUMN UniqueFlag;
    COLLECT STATISTICS ON DBC.INDEXES COLUMN FIELDID;
    COLLECT STATISTICS ON DBC.INDEXES COLUMN FieldPosition;
    COLLECT STATISTICS ON DBC.INDEXES COLUMN DatabaseId;
    COLLECT STATISTICS ON DBC.INDEXES COLUMN Name;
    COLLECT STATISTICS ON DBC.INDEXES COLUMN CREATEUID;
    COLLECT STATISTICS ON DBC.INDEXES COLUMN LASTALTERUID;

    COLLECT STATISTICS ON DBC.Owners COLUMN OwnerId;
    COLLECT STATISTICS ON DBC.Owners COLUMN OwneeId;

    COLLECT STATISTICS ON DBC.Parents COLUMN DbId;
    COLLECT STATISTICS ON DBC.Parents COLUMN ParentId;

    COLLECT STATISTICS ON DBC.Profiles COLUMN ProfileNameI;
    COLLECT STATISTICS ON DBC.Profiles COLUMN ProfileId;
    COLLECT STATISTICS ON DBC.Profiles COLUMN ProfileName;
    COLLECT STATISTICS ON DBC.Profiles COLUMN DefaultAccount;

    COLLECT STATISTICS ON DBC.RoleGrants COLUMN GranteeId;
    COLLECT STATISTICS ON DBC.RoleGrants COLUMN GranteeKind;
    COLLECT STATISTICS ON DBC.RoleGrants COLUMN RoleId;
    COLLECT STATISTICS ON DBC.RoleGrants COLUMN WithAdmin;
    COLLECT STATISTICS ON DBC.RoleGrants COLUMN GrantorId;
    COLLECT STATISTICS ON DBC.RoleGrants INDEX (GranteeId,RoleId);

    COLLECT STATISTICS ON DBC.ROLES COLUMN ROLENAME;
    COLLECT STATISTICS ON DBC.ROLES COLUMN ROLENAMEI;
    COLLECT STATISTICS ON DBC.ROLES COLUMN ROLEID;

    COLLECT STATISTICS ON DBC.TriggersTbl COLUMN TableId;
    COLLECT STATISTICS ON DBC.TriggersTbl COLUMN TriggerId;
    COLLECT STATISTICS ON DBC.TriggersTbl COLUMN DatabaseID;
    COLLECT STATISTICS ON DBC.TriggersTbl COLUMN TriggerName;

    COLLECT STATISTICS ON DBC.TVFIELDS COLUMN TABLEID;
    COLLECT STATISTICS ON DBC.TVFIELDS COLUMN FIELDID;
    COLLECT STATISTICS ON DBC.TVFIELDS COLUMN FieldName;
    COLLECT STATISTICS ON DBC.TVFIELDS COLUMN FieldType;
    COLLECT STATISTICS ON DBC.TVFIELDS COLUMN CREATEUID;
    COLLECT STATISTICS ON DBC.TVFIELDS COLUMN LASTALTERUID;
    COLLECT STATISTICS ON DBC.TVFIELDS INDEX (TABLEID,FIELDID);

    COLLECT STATISTICS ON DBC.TVM COLUMN DATABASEID;
    COLLECT STATISTICS ON DBC.TVM COLUMN TVMNameI;
    COLLECT STATISTICS ON DBC.TVM COLUMN TVMId;
    COLLECT STATISTICS ON DBC.TVM COLUMN TVMName;
    COLLECT STATISTICS ON DBC.TVM COLUMN TABLEKIND;
    COLLECT STATISTICS ON DBC.TVM COLUMN PROTECTIONTYPE;
    COLLECT STATISTICS ON DBC.TVM COLUMN CREATORNAME;
    COLLECT STATISTICS ON DBC.TVM COLUMN LASTALTERUID;
    COLLECT STATISTICS ON DBC.TVM INDEX (DATABASEID,TVMNAMEI);
  3. Collect Statistics on tables that have no statistics defined on indexes. Just check if any index has the statistics field equal to null.
  4. Collect Statistics on Partition columns for very large PPI tables. But consider the new feature statistics extrapolation in TD12 and also consider adding a NUSI on the partition column to reduce the impact of the collect statistics process.
  5. Collect Statistics daily on column PARTITION for very large PPI tables. The collect statistics on column PARTITION is very efficient even in very large tables because it reads the cylinder headers to determine the partitions.
  6. Collect Statistics daily on NUSI’s associated with critical load processes and reports. The collect statistics on a secondary index is very efficient even in very large tables because it reads the index sub-table instead of the base table.
10 Comments
Teradata Employee
Hi Marcio,

Thanks for sharing all this detail, this is good stuff!

I have a question about your data dictionary statistics.

When you look at that list of statistics that you collect on the dictionary, it could be intimidating, because it's a lot of additional stats beyond the user table collections. But I'm guessing that they don't use that much CPU and I/O and that they don't take too long to collect, because dictionary tables are often small compared to user tables.

Do you have any metrics you could share with us as to how long the dictionary stats take to collect?

Hi Carrie,

it usually takes less than a minute to refresh the dictionary stats in my production environment and I am refreshing at the table level. Because of that I refresh on a daily basis. So, you are correct, the impact of refreshing the dictionary stats is very minimum.
Hi Marcio,
This article is good. On 10% change of the table size for recollecting stats........... how do you account for compression and Updates(thats chnages values but not size)?

Thanks
Bhupal
Enthusiast
Marcio

Thanks for the information - there is a recommendation to collect statistcis on PARTITION for all tables, regardless of whetehr they are PPI or not. Your Part 1 shows the syntax and the explain (which highlights DBC.Indexes is used), but there is nothing in Part 2 about how you check whether PARTITION has stats collected on it.

I have run a test and can see no difference in IndexStats before/after running a COLLECT STATS ON DBNAME.TBLNAME COLUMN PARTITION.

Is it buried in the IndexStatistics field? Or is there an easy way to extract this information and then collect the statistics on it.
It is possible to have changes on the overall size of a table due to updates because of cylinder splits and the process will recollect stats after a compression being applied and the savings is more than 10%.
The Partition Stats is stored on DBC.MultiColumnStats
The dictionary tables usually takes less than five minutes to refresh stats.
Teradata Employee
Marcio - This is very good stuff. Do you have any updates for the 64 bit version of this? I have been trying the zero stats sql out on a new system and 64 bit version is not producing
the expected results. Thanks
Junior Contributor
TD12.0.3.1/TD13.0.0.21 introduced a new version 4 for the binary statistics.

You'll find the latest version of my stats query (which those calculations are based on) at http://www.teradataforum.com/attach.htm

This should fix your problems.

Dieter
Teradata Employee
Thanks - that works!
Hi Marcio,

Thanks for the article. I was using all these procs for TD12.

How to implement same stuff for TD14 as new stats table is introduced in TD14?

Thanks
Pariksheet
Junior Supporter

Hi Marico,

We have a table with PPI on Col1 , which is a date column. We have a query running uses condition  

Col1 >= DATE '2013-07-06'

AND Col1<= DATE '2013-07-08'

There was no data for date = 2013-07-06 and 2013-07-07 (as no data comes on weekend).

On monday 2013-07-08, we have 4 million records. Data for 2013-07-05 was 3.5 million.

We have a weekly stats stragegy i.e stats was taken on this table (and we have a stats defined on Col1 and partition as well) on 2013-07-07. 

But, if you see the explain plan for the query, for the above mentioned condition, it assumes 1 row , with high confidence. Now, the query goes for a toss and doesn't complete.

When i collect stats on Col1 again today, the query runs in 7 minutes. 

If you see help stats on 7th - unique values were -3152 and today after stats it is 3153 (just 2013-07-08 is added , i suppose). I am unable to understand why is it not able to use the weekend stats as the data has not changed much. This table is 1 TB table and 4 million is not a huge addition for a day.

So, do we need to collect this column stats on a daily basis as my query runs daily (is this reconended ?).On a broader perspective, what is the recomendation for daily stats collection. We are facing problem of this table only, rest all tables are fine (with only weekly stats). Could there be any other reason ?

-Thanks !