All amp duplication issue

Database

All amp duplication issue

Hi All,

We have a big table/view with a lot of data 133184k rows and we have  a smaller calender table which has 3k data ,Now when we join them we get all amp duplication issue and get 32k amp cpu time.

I know it is very straight forward and not sure of a way to tune the query ,Please let me know if there is anything that can be done to reduce cpu cycles of the below query.

SELECT a,

c , b , cal.d ,

e (NAMED DAY_NO ) ,

CASE WHEN cal.f = 1

THEN PTP_1 WHEN cal.f= 2

THEN PTP_2 WHEN cal.f = 3

THEN PTP_3 WHEN cal.f = 4

THEN PTP_4 WHEN cal.f = 5

THEN PTP_5 WHEN cal.f = 6

THEN PTP_6 WHEN cal.f = 7

THEN PTP_7 END (NAMED perc ) , type , Calc_Week , level , UPDATE_ID , UPDATE_TIMESTAMP

FROM abc.perc_table DTP ,  --this has huge data

def.calender cal

WHERE DTP.a = CAL.a; 

4 REPLIES

Re: All amp duplication issue

@doneth please help on this.

Teradata Employee

Re: All amp duplication issue

Please provide the explain. 

Re: All amp duplication issue

Hi,

Sorry for the delayed reply below is the explain please help

1) First, we

lock DXWI_PROD_REF_REFRESH_DB01.TWI0CAL_SMALL_CALENDAR

in view

DXWI_PROD_VIEW_ACCESS.VWI0CAL_SMALL_CALENDAR for access,

and we

lock DXWI_PROD_UK_CRORDER_DB01.TWI0DTP_Daily_Trad_Perc in

view

DXWI_PROD_VIEW_ACCESS.VWI0DTP_DAILY_TRAD_PERC_NEW for access.

2) Next, we do an all-AMPs

RETRIEVE step from

DXWI_PROD_REF_REFRESH_DB01.TWI0CAL_SMALL_CALENDAR in view

DXWI_PROD_VIEW_ACCESS.VWI0CAL_SMALL_CALENDAR by way of an all-rows

scan with no residual conditions into

Spool 2 (all_amps)

(compressed columns allowed), which is

duplicated on all AMPs.

Then we do a SORT to

partition by rowkey. The size of Spool 2 is

estimated with high confidence to be 2,104,704 rows (46,303,488

bytes). The estimated time for this step is 0.11 seconds.

3) We do an all-AMPs

JOIN step from Spool 2 (Last Use) by way of an

all-rows scan, which is joined to

DXWI_PROD_UK_CRORDER_DB01.TWI0DTP_Daily_Trad_Perc in view

DXWI_PROD_VIEW_ACCESS.VWI0DTP_DAILY_TRAD_PERC_NEW with no residual

conditions.

Spool 2 and

DXWI_PROD_UK_CRORDER_DB01.TWI0DTP_Daily_Trad_Perc are joined using

a

dynamic hash join, with a join condition of (

"DXWI_PROD_UK_CRORDER_DB01.TWI0DTP_Daily_Trad_Perc.YEAR_WEEK_NUMBER

= Year_week_number"

) enhanced by dynamic partition elimination.

The input table

DXWI_PROD_UK_CRORDER_DB01.TWI0DTP_Daily_Trad_Perc

will not be cached in memory. The result goes into

Spool 1

(group_amps), which is

built locally on the AMPs. The result

spool file will not be cached in memory. The size of

Spool 1 is

estimated with low confidence to be 40,234,312,990 rows (

3,379,682,291,160 bytes). The estimated time for this step is 13

minutes and 15 seconds.

4) Finally, we send out an END TRANSACTION step to all AMPs involved

in processing the request.

-> The contents of

Spool 1 are sent back to the user as the result of

statement 1. The total estimated time is 13 minutes and 15

seconds.

Senior Apprentice

Re: All amp duplication issue

It seems like you unpivot a denormalized table with one column per weekday into seven rows, is this correct? This would result in 133,184,000 * 7 rows.

The small table is duplicated, this is expected. And the Dynamic Hash Join might be the best you can get for joining to a partitioned table.