Collecting table level stats in TD14.10 with ONE pass through the table?

Database
Enthusiast

Collecting table level stats in TD14.10 with ONE pass through the table?

It was my belief that in TD14.10, if you collect table-level stats (e.g. via the command "collect stats on TABLE_ABC;") on a table that already has stats collected on numerous columns, Teradata would recollect all of the existing stats IN A SINGLE PASS THROUGH THE TABLE (for this discussion, let's ignore TD14.10's ablility to SKIP stats collection).  However, after running an explain on a "collect stats on TABLE_ABC;" statement, I'm no longer sure of that.  The explain shows the words "we do an all-AMPs RETRIEVE step from sysdba.TABLE_ABC by way of an all-rows scan" for the first column, but for every subsequent column I see the words "we do an all-AMPs SUM step to aggregate from sysdba.TABLE_ABC by way of an all-rows scan".  Does this mean Teradata is performing a FTS for *every* column requiring stats collection?

Here's a log showing my entire test:

CREATE SET TABLE sysdba.TABLE_ABC

     (

      COL_1 SMALLINT,

      COL_2 SMALLINT,

      COL_3 SMALLINT)

UNIQUE PRIMARY INDEX ( COL_1 );

 *** Table has been created.

 *** Total elapsed time was 1 second.

+---------+---------+---------+---------+---------+---------+---------+---------+

insert into  sysdba.TABLE_ABC  values (1,2,3) ;

 *** Insert completed. One row added.

 *** Total elapsed time was 1 second.

+---------+---------+---------+---------+---------+---------+---------+---------+

insert into  sysdba.TABLE_ABC  values (4,5,6) ;

 *** Insert completed. One row added.

 *** Total elapsed time was 1 second.

+---------+---------+---------+---------+---------+---------+---------+---------+

insert into  sysdba.TABLE_ABC  values (7,8,9) ;

 *** Insert completed. One row added.

 *** Total elapsed time was 1 second.

+---------+---------+---------+---------+---------+---------+---------+---------+

collect stats column (COL_1) on sysdba.TABLE_ABC ;

 *** Update completed. 2 rows changed.

 *** Total elapsed time was 1 second.

+---------+---------+---------+---------+---------+---------+---------+---------+

collect stats column (COL_2) on sysdba.TABLE_ABC ;

 *** Update completed. 2 rows changed.

 *** Total elapsed time was 1 second.

+---------+---------+---------+---------+---------+---------+---------+---------+

collect stats column (COL_3) on sysdba.TABLE_ABC ;

 *** Update completed. 2 rows changed.

 *** Total elapsed time was 1 second.

+---------+---------+---------+---------+---------+---------+---------+---------+

help stats sysdba.TABLE_ABC ;

 *** Help information returned. 4 rows.

 *** Total elapsed time was 1 second.

Date     Time     Unique Values        Column Names

-------- -------- -------------------- -----------------------------------------

15/12/21 12:28:39                    3 *

15/12/21 12:28:39                    3 COL_1

15/12/21 12:28:39                    3 COL_2

15/12/21 12:28:39                    3 COL_3

+---------+---------+---------+---------+---------+---------+---------+---------+

delete from sysdba.TABLE_ABC ;

 *** Delete completed. 3 rows removed.

 *** Total elapsed time was 1 second.

+---------+---------+---------+---------+---------+---------+---------+---------+

insert into  sysdba.TABLE_ABC  values (1,2,3) ;

 *** Insert completed. One row added.

 *** Total elapsed time was 1 second.

+---------+---------+---------+---------+---------+---------+---------+---------+

insert into  sysdba.TABLE_ABC  values (4,5,6) ;

 *** Insert completed. One row added.

 *** Total elapsed time was 1 second.

+---------+---------+---------+---------+---------+---------+---------+---------+

insert into  sysdba.TABLE_ABC  values (7,8,9) ;

 *** Insert completed. One row added.

 *** Total elapsed time was 1 second.

+---------+---------+---------+---------+---------+---------+---------+---------+

explain

collect stats on sysdba.TABLE_ABC ;

 *** Help information returned. 47 rows.

 *** Total elapsed time was 1 second.

Explanation

--------------------------------------------------------------------------------

  1) First, we lock sysdba.TABLE_ABC for access.

  2) Next, we do an all-AMPs RETRIEVE step from sysdba.TABLE_ABC by way

     of an all-rows scan with no residual conditions into Spool 3

     (all_amps), which is built locally on the AMPs.  Then we do a SORT

     to order Spool 3 by the sort key in spool field1 (

     sysdba.TABLE_ABC.COL_1).  The size of Spool 3 is estimated with

     high confidence to be 3 rows (93 bytes).  The estimated time for

     this step is 0.02 seconds.

  3) Then we save the UPDATED STATISTICS for ('COL_1 ') from Spool 3

     (Last Use) into Spool 5, which is built locally on a single AMP

     derived from the hash of the table id.

  4) We do an all-AMPs SUM step to aggregate from sysdba.TABLE_ABC by

     way of an all-rows scan with no residual conditions

     , grouping by field1 ( sysdba.TABLE_ABC.COL_2).  Aggregate

     Intermediate Results are computed globally, then placed in Spool 8.

     The size of Spool 8 is estimated with high confidence to be 3 rows

     (75 bytes).  The estimated time for this step is 0.03 seconds.

  5) Then we save the UPDATED STATISTICS for ('COL_2 ') from Spool 8

     (Last Use) into Spool 10, which is built locally on a single AMP

     derived from the hash of the table id.

  6) We do an all-AMPs SUM step to aggregate from sysdba.TABLE_ABC by

     way of an all-rows scan with no residual conditions

     , grouping by field1 ( sysdba.TABLE_ABC.COL_3).  Aggregate

     Intermediate Results are computed globally, then placed in Spool

     13.  The size of Spool 13 is estimated with high confidence to be

     3 rows (75 bytes).  The estimated time for this step is 0.03

     seconds.

  7) Then we save the UPDATED STATISTICS for ('COL_3 ') from Spool 13

     (Last Use) into Spool 15, which is built locally on a single AMP

     derived from the hash of the table id.

  8) We compute the table-level summary statistics from spool 15 and

     save them into Spool 16, which is built locally on a single AMP

     derived from the hash of the table id.

  9) We lock DBC.StatsTbl for write on a RowHash.

 10) We do a Single AMP MERGE Update to DBC.StatsTbl from Spool 5 (Last

     Use) by way of a RowHash match scan.

 11) We do a Single AMP MERGE Update to DBC.StatsTbl from Spool 10

     (Last Use) by way of a RowHash match scan.

 12) We do a Single AMP MERGE Update to DBC.StatsTbl from Spool 15

     (Last Use) by way of a RowHash match scan.

 13) We do a Single AMP MERGE Update to DBC.StatsTbl from Spool 16

     (Last Use) by way of a RowHash match scan.

 14) We spoil the statistics cache for the table, view or query.

 15) We spoil the parser's dictionary cache for the table.

 16) Finally, we send out an END TRANSACTION step to all AMPs involved

     in processing the request.

  -> No rows are returned to the user as the result of statement 1.

+---------+---------+---------+---------+---------+---------+---------+---------+

Tags (1)
1 REPLY
Teradata Employee

Re: Collecting table level stats in TD14.10 with ONE pass through the table?

The optimizer will attempt to re-use spool files for additional statistics, and may change the order in which stats are collected to improve re-use.  But no, it does not guarantee stats can be done with a "single pass" of the table.