Collected statistics are a valuable source of information in Teradata, not only for the optimizer, but also for developers or DBAs. Spotting outdated statistics (which might lead to bad optimizer plans) and implementing/monitoring a stats recollection process are common tasks, which need the current statistics as input.
The HELP STATS function returns information about stats on a table or column level, but there's no builtin way to extract info about all stats on all tables in a single query. The DBC views ColumnsStatistics, MultiColumnStats, IndexStats expose the stats data, but those statistics are stored in a binary format.
This internal format is documented in the "SQL Request and Transaction Processing" manual, although the exact details and the differences between 32 and 64bit are somewhat vague. However utilizing that knowledge to decode the binary stats is limited by a restriction in Teradata SQL: BYTEs can't be casted to any other type. You know those 8 bytes represent a float, but you can't convert them to a readable value. Therefore the most important step to implement my stats query was discovering how to cast bytes to numeric values using the HASHBUCKET function.
I started to develop a query to decode Teradata's binary stats way back in 2003 on V2R4, over the following years I enhanced and fixed it several times (with a lot of Trial & Error) to accomplish new internal stats versions and the changes from 32 to 64bit. This old stats query is available in the attachment area of John Hall's TeradataForum (btw still the best independent source of Teradata knowledge) and is used at many customer sites.
The latest release posted here extracts even more details without much additional overhead and now includes info about an index or a column from the table's partitioning expression without collected statistics. This StatsInfo view provides following columns, an asterisk "*" indicates information similar to HELP STATS output and "+" an enhanced or new column:
DatabaseName 
"ColumnName" is a commaseparated list of the names of the columns for which statistics are reported. To remove those double quotes do a global Search & Replace within the source code replacing '"' with '', but this will also remove the double quotes from "CollectStatement" and "HelpStatement" 

TableName 

*  ColumnName 

+ 
IndexName  Name of the index, if it exists 
*  ColumnCount 
The number of columns in the stats or index definition. The "FieldCount" of HELP STATS is not exactly the same, it might be less, as this is the number of columns actually stored in the stats. By default only 16 columns are supported although Teradata allows up to 64. Hopefully there are no indexes or stats with more than 16 columns, but additional columns are already in the source code, just commented out. Search for "uncomment" when you need to modify it. 
+  MissingStats 
Indicates an index or a column from the table's partitioning expression without collected statistics: Y/N 
+  DateIncluded  "ColumnName" includes a DATE or TIMESTAMP: Y/N 
+  PartitionColumn 
Stats include a column which is used in the table's partitioning expression: Y/N 
+  PartitionLevels 
Number of levels in the table's partitioning expression, zero means not partitioned 
+  PartitionsDefined 
The number of partitions defined within that table Only calculated for single level partitioning and TD12, previous releases will always show 65535 instead of the correct number. Might be 65535 on current releases, too, due to (yet) unknown reasons sometimes the TableCheck column in dbc.IndexConstraints isn't created correctly. 
+  TableType 
TempTbl > Global Temporary Table 
+  StatsType 
Stats collected on: UPI > Unique Primary Index 
*  CollectDate 
Date and time when statistics were last collected 
*  CollectTime 

CollectTimestamp 

CollectDuration 
How long did the Collect Stats run. This is not always correct as the calculation is based on the LastAlterTimestamp, which is not only set when stats are collected, but also during a ALTER TABLE modifying the column attributes (this should be a rare case, but can't be fixed). Additionally on some systems there's a fixed deviation (due to some unknown reason, Time zone settings on TD vs. OS level?), which can easily be spotted when comparing "LastAlterTimestamp" and "CollectTimestamp". Search for "TimeAdjust" to add or subtract that fixed value. Old statistics might generate another deviation when the system time zone was changed since collecting. Recollecting those stats should fix it. 

*  NumericStats 
Indicates single column stats on a numeric or date column: Y/N 
*  SampleSize 
Sample size used for collect stats, NULL if not sampled 
*  StatsVersion 
Internal version of statistics: 1 > preV2R5 
*  NumAMPs  The number of AMPs from which statistics were collected, usually the number of AMPs in the system, 1 for an empty table 
*  NumIntervals 
The number of intervals in the frequency distribution histogram containing the column statistics. 
*  AvgAmpRPV 
Overall average of the average rows per value from each AMP, only for NUSIs, otherwise zero 
*  OneAMPSampleEst 
The estimated cardinality of the table based on a singleAMP sample 
*  AllAmpSampleEst 
The estimated cardinality of the table based on an allAMP sample 
*  NumNulls 
The number of rows containing NULLs for one or more columns in "ColumnName" 
*  NumAllNulls 
The number of rows containing nulls for all columns in "ColumnName" 
*  NumRows 
The cardinality of the table, i.e. the number of rows 
*  NumValues 
The number of unique values 
*  ModeFreq 
The frequency of the most common value 
*+  ModalValue 
The most common value, only for single column numeric or date stats 
*+  MinValue 
The minimum value, only for single column numeric or date stats 
*+  MaxValue 
The maximum value, only for single column numeric or date stats 
+  DataSize 
The size in bytes needed to store the statistics. If greater than 16 only partial data is stored. For single column stats the actual size for any numeric column is 8, as it's always stored as float. 
+  FieldType 
The datayte of a single column stats, NULL for multiple columns 
+  LastAlterTimestamp 
The last time when statistics have been collected (or this column/index was modified by an ALTER TABLE) 
+  LastAccessTimestamp 
The last time and the number of times this column/index was used in queries, the same info is found in dbc.TablesV and dbc.Indices Of course this is only available when it has been enabled via dbscontrol ObjectUseCountCollectRate 
+  AccessCount 

+  TableId 
To facilitate additional joins to other system tables 
+  IndexNumber 
From dbc.Indexes  1..128: Internal index number, >128: pseudo index = multicolumn stats 
+  FieldId 
From dbc.TVFields  NULL for multicolumn or PARTITION stats 
+  CollectStatement 
COLLECT STATS statement to (re)collect the stats I got several request for syntax changes: Finally i included all four possible variations in the source code, now it's your choice, just uncomment the preferred syntax: 
+  HelpStatement 
HELP STATS statement to get the stats details on column or index level 
TD14 stores statistics in a more readable format in a system table (dbc.StatsTbl already exists in TD13, but wasn't populated). The new dbc.StatsV returns much of the information i extracted here, but i like to get more details, so i wrote a new version for TD14
Please report any issues as this version has not been tested that much, i.e. you're my beta testers :)
I'm open for enhancement requests, too, but I don't think there's much to add to this query without joining additional tables. You can do that on your own, that's why there's the TableId/FieldId/IndexNumber column.
20111228:
Already fixed the first bug, zero PartitionLevels for PARTITION stats.
Added PartitionsDefined
20111231:
Fixed 'Invalid Date' error for empty tables
20120201:
Added IndexName
Additional syntax variations for "CollectStatement"
20121003:
Fixed wrong calculation of ColumnCount/DataSize/DateIncluded for some stats involving PARTITION
Added IndexNumber and FieldId
20121023
Fixed a wrong calculation of PartitionColumn when a double quoted ColumnName is used in the partition definition
20121124
Fixed: Wrong calculation of PartitionColumn when a ColumnName is used in different upper/lower case in the partitioning expression (dbc.TableConstrains.TableCheck is defined as CASESPECIFIC)
Attached files:
teradata_statistics.pdf  A short description of the binary format 
stats_32bit.sql  For 32bit systems: MPRAS and Teradata Express for Windows 
stats_64bit.sql  For 64bit systems: SLES, Windows Server and Teradata Express VMWare 
stats_64bit_udf.sql  For 64bit systems running TD13.10: SLES, Windows Server and Teradata Express VMWare Replaces the ugly HASHBUCKET calculations with SQL UDFs, the source code looks much cleaner now Needs a Search & Replace to change '<dbname>' with a valid database name: 
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.