Statistical information is vital for the optimizer when it builds query plans. But collecting statistics can involve time and resources. By understanding and combining several different statistics gathering techniques, users of Teradata can find the correct balance between good query plans and the time required to ensure adequate statistical information is always available.
This compilation of statistics collection recommendations are intended for sites that are on any of the Teradata 14.10, 15.0, 15.10, 16.0 and 16.10 software release levels. Some of these recommendations apply to releases earlier than Teradata 14.10 and some rely on new features available only in Teradata 14.10. Statistics collection functionality in the Teradata Database basically works the same in the 14.10 through the 16.10 releases.
For greater detail on collecting statistics for Teradata Database, see the orange book titled: Teradata Database 14.10 Statistics Enhancements by Rama Krishna Korlapati.
Contributors: Carrie Ballinger, Sung-Jin Kim, Paul Sinclair.
Collect Full Statistics
Can Rely on Dynamic AMP Sampling
Collect Multicolumn Statistics
General Suggestions for the Statistics Collection Process
New Recommendations for Teradata 14.10 to Teradata 16.10
EXPLAIN SELECT DISTINCT NUPI-column FROM table;
3. For a partitioned table, it is recommended that you always collect statistics on:
4. For a partitioned primary index table, consider collecting these statistics if the partitioning column is not part of the table’s primary index (PI):
5. Dynamic AMP sampling has the option of pulling samples from all AMPs, rather than from a single AMP (the default). Dynamic all-AMP sampling has these particular advantages:
6. In order to allow the optimizer to consider the Partial Duplication Partial Redistribution optimization, full-length statistics must have been collected on the combined columns of the join constraint of the skewed table. USING MAXVALUELENGTH to cover the entire combined length may be a benefit in such cases.
7. While you can do it, it is not ideal to run an update job against a table that you are collecting stats on, as the stats will not reflect the actual state of the table when the updates are complete. It is recommended to run stats collections after loading is complete, not during loading.
 Any column which is over 95% unique is considered as a nearly-unique column.
 Dynamic AMP sampling is sometimes referred to as random AMP sampling.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.