Why is this statistic recommended?

Database
Enthusiast

Why is this statistic recommended?

Hello fellow Teradata-ers,

Our Teradata system was recently upgraded to Teradata 14.10 and we have been working on tuning our queries to perform a little bit better.  In doing so, we are using the DIAGNOSTICS HELPSTATS ON FOR SESSION function to see what we are missing.  However, I do not understand why some of the recommendations will help the optimizer?  

Consider this example:

TableAlpha with columns A, B, Aa, Ab, Ac, and Ad, partitioned on A, UPI is (A, B, Aa), total size about 1M records.  Stats collected: COLUMN PARTITION, UPI, COLUMNS (A,B).  Skew is ~ 1%. 

TableBeta with columns A,  B, Ba, Bb, C, Bc, Bd, Be, Bf, Bg, partitioned on A, UPI is (A, B, Ba, Bb), total size about 150M records. Stats collected: COLUMN PARTITION, UPI, COLUMNS (A,B), COLUMNS (A,B,C).   Skew is ~ 1%. 

TableGamma with columns C, Ca, Cb, Cc, partitioned on C, UPI is (C, Ca), total size about 25 records.  Skew is high, due to small record count. 

The query that we run has a WHERE clause "WHERE TableAlpha.A = value" and INNER JOINs on TableAlpha.A = TableBeta.A and TableAlpha.B = TableBeta.B and TableGamma.C = TableBeta.C.  The query runs ok, but we think it can be faster.  

The HELPSTATS recommends getting additional statistics for this query on TableAlpha Column A, TableAlpha Column B, TableBeta Column A,  and TableBeta Column B (separately), among many other recommednations, most with high confidence.  If the join is on both columns together and the join is an inner join, why does the optimizer think that these will be helpful?  If it were an outer join, would having the separate stats make a difference?  

Many thanks in advance.  

- Joe

2 REPLIES
Senior Apprentice

Re: Why is this statistic recommended?

Hi Joe,

this might be due to better row estimations, is it a 1:many or many:many join.

Based on you narration you're not joining on the UPI, which is known to optimizer.

Teradata Employee

Re: Why is this statistic recommended?

If you think there can be a better plan, looking at the explain will tell you a lot more than looking at the diagnostic helpstats. Look at the explain and look for places where the estimates are far off the actulas for qualified or joined results at stages of the plan.