Why does the Optimiser not run these STAT steps in parallel?

Enthusiast

I am ranking my product-level data on the basis of the higher grains of the product hierarchy (Product => BrandFamily => Category => Manufacturer). If all the Rank functions are independent, why does Teradata run these steps in sequence rather than in parallel? I have pasted the query and its Explain below. Can anyone tell me if there is a better way of running these in parallel?

I have only shown 5 of these Ranks here; I actually have 30 of these Rank attributes. We need them for analytical reasons. Any help would be really appreciated.

Thanks...Manik

SELECT
    Region_Id
    ,CATG_ID
    ,MFG_ID
    ,BRND_FMLY_ID
    ,Material_ID
    ,Prod_Id
    ,Tot_Vol
    ,Neg_Vol
    ,CASE WHEN ROW_NUMBER() OVER (PARTITION BY Region_Id, Catg_Id, Store_Id ORDER BY Store_Id) = 1
          THEN 1 ELSE 0
     END AS R_C_RC_Dense_Rnk
    ,CASE WHEN ROW_NUMBER() OVER (PARTITION BY Region_Id, Mfg_Id, Store_Id ORDER BY Store_Id) = 1
          THEN 1 ELSE 0
     END AS R_M_RC_Dense_Rnk
    ,CASE WHEN ROW_NUMBER() OVER (PARTITION BY Region_Id, BRND_FMLY_ID, Store_Id ORDER BY Store_Id) = 1
          THEN 1 ELSE 0
     END AS R_B_RC_Dense_Rnk
    ,CASE WHEN ROW_NUMBER() OVER (PARTITION BY Region_Id, Material_Id, Store_Id ORDER BY Store_Id) = 1
          THEN 1 ELSE 0
     END AS R_R_RC_Dense_Rnk
    ,CASE WHEN ROW_NUMBER() OVER (PARTITION BY Region_Id, Prod_Id, Store_Id ORDER BY Store_Id) = 1
          THEN 1 ELSE 0
     END AS R_I_RC_Dense_Rnk
FROM MIM_TMP.RETAIL_SHIPMENT_DENSE

Explain:
1) First, we lock a distinct MIM_TMP."pseudo table" for read on a
RowHash to prevent global deadlock for
MIM_TMP.RETAIL_SHIPMENT_DENSE.
2) Next, we lock MIM_TMP.RETAIL_SHIPMENT_DENSE for read.
3) We do an all-AMPs STAT FUNCTION step from
MIM_TMP.RETAIL_SHIPMENT_DENSE by way of an all-rows scan with no
residual conditions into Spool 5 (Last Use), which is
redistributed by hash code to all AMPs. The result rows are put
into Spool 3 (all_amps), which is built locally on the AMPs. The
size is estimated with low confidence to be 11,373,072 rows (
1,614,976,224 bytes).
4) We do an all-AMPs STAT FUNCTION step from Spool 3 (Last Use) by
way of an all-rows scan into Spool 8 (Last Use), which is
redistributed by hash code to all AMPs. The result rows are put
into Spool 7 (all_amps), which is built locally on the AMPs. The
size is estimated with low confidence to be 11,373,072 rows (
1,705,960,800 bytes).
5) We do an all-AMPs STAT FUNCTION step from Spool 7 (Last Use) by
way of an all-rows scan into Spool 11 (Last Use), which is
redistributed by hash code to all AMPs. The result rows are put
into Spool 10 (all_amps), which is built locally on the AMPs. The
size is estimated with low confidence to be 11,373,072 rows (
2,206,375,968 bytes).
6) We do an all-AMPs STAT FUNCTION step from Spool 10 (Last Use) by
way of an all-rows scan into Spool 14 (Last Use), which is
redistributed by hash code to all AMPs. The result rows are put
into Spool 13 (all_amps), which is built locally on the AMPs. The
size is estimated with low confidence to be 11,373,072 rows (
1,751,453,088 bytes).
7) We do an all-AMPs STAT FUNCTION step from Spool 13 (Last Use) by
way of an all-rows scan into Spool 17 (Last Use), which is
redistributed by hash code to all AMPs. The result rows are put
into Spool 1 (group_amps), which is built locally on the AMPs.
The size is estimated with low confidence to be 11,373,072 rows (
2,047,152,960 bytes).
8) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1.
2 REPLIES
Senior Apprentice

Re: Why does the Optimiser not run these STAT steps in parallel?

Only OLAP functions with the same PARTITION BY/ORDER BY can be calculated in a single step. Your ROW_NUMBERs have the same ORDER BY (which is just a dummy order, btw, since Store_Id is also in the PARTITION BY), but different PARTITION BYs, so each one needs its own redistribution and sort.

It's similar to multiple COUNT(DISTINCT)s, which are easy to write but really hard to calculate.
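A sketch of the distinction, reusing the columns from the original query (the RANK alongside the ROW_NUMBER is just for illustration):

```sql
SELECT
    Region_Id, Catg_Id, Mfg_Id, Store_Id
    -- identical PARTITION BY + ORDER BY: both can be computed in one STAT FUNCTION step
    ,ROW_NUMBER() OVER (PARTITION BY Region_Id, Catg_Id, Store_Id ORDER BY Store_Id) AS rn1
    ,RANK()       OVER (PARTITION BY Region_Id, Catg_Id, Store_Id ORDER BY Store_Id) AS rk1
    -- different PARTITION BY: forces its own redistribution + sort, i.e. an extra step
    ,ROW_NUMBER() OVER (PARTITION BY Region_Id, Mfg_Id, Store_Id ORDER BY Store_Id)  AS rn2
FROM MIM_TMP.RETAIL_SHIPMENT_DENSE;
```

So with 5 (or 30) distinct PARTITION BY lists you get 5 (or 30) sequential STAT FUNCTION steps, exactly as your Explain shows.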

Dieter

Enthusiast

Re: Why does the Optimiser not run these STAT steps in parallel?

Thanks Dieter. But could you please suggest a way to do this faster? I need to calculate these ranks to compute the store-selling metric, which is non-additive across the product hierarchy. COUNT(DISTINCT) is too expensive.

I have 6 of these RANK columns.

Is there a way I can create three tables with 2 ranks each and populate those three tables in parallel, from separate sessions? Will they complete in roughly equal time? Not sure, just asking for advice.
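One way to sketch that idea (the work-table name RANKS_1 and the assumption that Region_Id/Store_Id/Prod_Id identify a row are hypothetical; adjust the key and primary index to your actual table):

```sql
-- Session 1: first two flags; sessions 2 and 3 repeat the pattern
-- for the remaining flags into RANKS_2 and RANKS_3.
INSERT INTO MIM_TMP.RANKS_1  -- hypothetical work table, PI on the join-back key
SELECT
    Region_Id, Store_Id, Prod_Id
    ,CASE WHEN ROW_NUMBER() OVER (PARTITION BY Region_Id, Catg_Id, Store_Id
                                  ORDER BY Store_Id) = 1 THEN 1 ELSE 0 END
    ,CASE WHEN ROW_NUMBER() OVER (PARTITION BY Region_Id, Mfg_Id, Store_Id
                                  ORDER BY Store_Id) = 1 THEN 1 ELSE 0 END
FROM MIM_TMP.RETAIL_SHIPMENT_DENSE;
```

Each session still pays for its own redistributions and sorts, so total CPU and spool go up (three full scans of the source), but since each session does a similar amount of work, wall-clock time could drop if the system has spare capacity. The three work tables are then joined back on the key.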

If anybody else has tackled this problem in a better way, please share.

Thanks