How to force duplication of the rows from a small table on all AMPs

Database
Ambassador

Re: How to force duplication of the rows from a small table on all AMPs

Hi Ajyush,

a NUSI on tablea.col1 will not be used for joins (unless you use a very restrictive WHERE condiition).

Did you check for missing stats (using DIAGNOSTIC HELPSTATS ON FOR SESSION)?

Redistributing might be the most reasonable way, otherwise the 15mio rows might have to be duplicated.

You could also try to rewrite the join to a correlated subquery using EXISTS, but i doubt it will change the plan.

Btw, the "col+0" trick doesn't work anymore (the optimizer is smarter since TD13), but "col*1" or "coalesce(col1,col1)" still works.

Dieter

Enthusiast

Re: How to force duplication of the rows from a small table on all AMPs

Hi Dieter,

This is my first post and I am have been following you for a while before creating an account.  Thanks for your valuable suggestions.

Request you to clarify me on the below.

1. Does the join query dependent on number of AMP's because when the same query ran on less number of AMP system, the small table has been redistributed, but when it is ran on large AMP system, the larger table has been redistributed.

If not, then what forced teradata to distribute different tables in both scenarios.

2.  As long as tableb has the UPI on the join column, it won't be redistributed!

As, Ayush mentioned that small tableb with UPI has been distributed across all AMP's eventhough it has an UPI.

How does it happen? (Both are contrasting according to me atleast!)

I hope that redistribution and duplication of small table is entirely different.

3.  You have suggested to use "col*1" or "coalesce(col1,col1)" to help optimizer to ignore the index.

Now, the 2 tables has to be distributed as it can't use the index of both the tables right? which will even degrade the performance.

Please enlighten me on the above.  Excuse me for any silly question.

Thanks,

Vikram.

Ambassador

Re: How to force duplication of the rows from a small table on all AMPs

Hi Vikram,

#1: the optimizer knows about the system (number of AMPs, CPU speed, etc.) and this is included in planning, so yes, it's dependent on the number of AMPs

#2: There are many different ways to prepare data for a join, the optimizer will calculate which is the least expensive one.

#3: Of course it might degrade performance (and it will in probably most cases), it's just to check hwo the plan will look like when the PI is changed

You'll find lots of info about planning/optimization in the "SQL Request and Transaction Processing" manual.

Btw, those questions are not silly :-)

Dieter

Not applicable

Re: How to force duplication of the rows from a small table on all AMPs

Thank you Dieter!!