Avoiding Duplication on All Amps

Database
Enthusiast

Avoiding Duplication on All Amps

Hi

I'm trying to join two tables, Left table "TableA" being (30 GB and 90 milion rows) with the another table "tableB" ( 2.5 GB and 21 million rows). 

Following is the sql

Select * from  TableA t1 left join TableB t2

on t1.ColumnX=t2.columnX

and T1.ColumnX is not null;

DDLs:

TableA

CREATE SET TABLE TableA

(

column1 Integer Not null,

column2 BigInt Not null,

ColumnX Integer compress,

column3 VArchar(100),

column4 Varchar(200),

.

.

Column120 Integer)

Unique Primary Index(Column1,Column2)

Index(Column1);

TableB:

CREATE SET TABLE TableB

(

ColumnX Integer Not NULL,

Column1 varchar(100),

Column2 varchar(50),

.

.

Column30 VArchar(50))

Unique Primary Index (ColumnX);

Problem:

The TableB is always duplicated across ALL AMps which is causing my code to run for long time and lot of CPU seconds. IS there Any way we can avoid this Duplication.

2 REPLIES
Supporter

Re: Avoiding Duplication on All Amps

Do you have stats collected on ColumnX on both tables?

Teradata Employee

Re: Avoiding Duplication on All Amps

The alternative plan is to redistribute TableA which appears to be a fairly wide table, lots of bytes to redistribute. Are there lots of nulls in columnx in TableA? Are there lots of duplicate values in columnX in TableA? 

The rest of the plan is not provided and evidence is of provided to show that the duplication is the costly part of the plan. Since columnx is not unique in TableA, a product join needs to be done within value of columnx in tableB, this join step is probably the costly one not the redistribution.