Partition Elimination not happening in Fact through Time Dimension filter.

Database

Partition Elimination not happening in Fact through Time Dimension filter.

Hi Folks,

I have a case in my Reporting Layer where a huge fact is accessed be passing filters through the time dimension.The Problem I am facing is that when I give a single date filter in the dimension using an IN clause its dynamically passing the partition inside the Derived Table which has the FACT Table.

When I pass two dates then it dosen't do that and does a full table scan of the Fact. Its a query generated from BO so, I can not change it. I have tried DIAGNOSTIC HELPSTATS as well with no success.

The Query is given below. I have modified the query a lot so, that people can understand that and reduced the Projected columns as well. Please observe the 3rd and 4th Line from last.

The select with 1 date ( '2012-12-26') is actually doing a partition Elimination inside the CATG_STORE_AGG Derived table which is perfect for me. EXPLAIN for this step

We do an all-AMPs SUM step to aggregate from a single

          partition of MIM_TBL.RETAIL_SHIPMENT_T with a condition of (

          "MIM_TBL.RETAIL_SHIPMENT_T.Shipment_Dt = DATE '2012-12-26'")

          with a residual condition of ("(NOT

          (MIM_TBL.RETAIL_SHIPMENT_T.Catg_Id IS NULL )) AND

          ((MIM_TBL.RETAIL_SHIPMENT_T.Shipment_Dt = DATE '2012-12-26')

          AND (NOT (MIM_TBL.RETAIL_SHIPMENT_T.Rtl_Id IS NULL )))")

          , grouping by field1 ( MIM_TBL.RETAIL_SHIPMENT_T.Rtl_Id

          ,MIM_TBL.RETAIL_SHIPMENT_T.Catg_Id

          ,MIM_TBL.RETAIL_SHIPMENT_T.Shipment_Dt).  Aggregate

          Intermediate Results are computed locally, then placed in

          Spool 4.  The size of Spool 4 is estimated with low

          confidence to be 1,424 rows (66,941 bytes).  The estimated

          time for this step is 0.01 seconds.

The select with 2 dates  ( '2012-12-26', '2012-11-27') is not doing that inside the CATG_STORE_AGG Derived table and going for the complete table scan. EXPLAIN for this step

 We do an all-AMPs SUM step to aggregate from

          MIM_TBL.RETAIL_SHIPMENT_T by way of an all-rows scan with a

          condition of ("(NOT (MIM_TBL.RETAIL_SHIPMENT_T.Catg_Id IS

          NULL )) AND (NOT (MIM_TBL.RETAIL_SHIPMENT_T.Rtl_Id IS NULL ))")

          , grouping by field1 ( MIM_TBL.RETAIL_SHIPMENT_T.Rtl_Id

          ,MIM_TBL.RETAIL_SHIPMENT_T.Catg_Id

          ,MIM_TBL.RETAIL_SHIPMENT_T.Shipment_Dt).  Aggregate

          Intermediate Results are computed locally, then placed in

          Spool 4.  The input table will not be cached in memory, but

          it is eligible for synchronized scanning.  The size of Spool

          4 is estimated with low confidence to be 31,542,827 rows (

          1,482,512,869 bytes).  The estimated time for this step is 14

          minutes and 49 seconds.

Query

SELECT

  STARS_CALENDAR_A.Week_Begin_Dt,

  sum(case when ( STARS_CALENDAR_A.cal_dt )=( CURR_YAGO_DT.yago_dt ) then cast(RETAIL_SHIPMENT_CY.Tot_Vol as decimal(18)) end ),

  sum(case when ( STARS_CALENDAR_A.cal_dt )=( CURR_YAGO_DT.yago_dt ) then cast(RETAIL_SHIPMENT_CY.Tot_Qtd_Vol as decimal(18)) end )

FROM

MIM.RETAIL_SHIPMENT  RETAIL_SHIPMENT_CY

INNER JOIN (

  select a.cal_dt, a.cal_dt as yago_dt

from Mim.stars_calendar a

where a.cal_dt>='2009-12-26'

                and a.cal_dt=a.week_end_dt

union

select b.cal_dt,b.yago_week_end_dt as yago_dt

from Mim.stars_calendar b

where b.cal_dt>='2009-12-26'

and b.cal_dt=b.week_end_dt

  )  CURR_YAGO_DT

  ON (CURR_YAGO_DT.yago_dt=RETAIL_SHIPMENT_CY.Shipment_Dt)

   INNER JOIN MIM.STARS_CALENDAR  STARS_CALENDAR_A

   ON (STARS_CALENDAR_A.Cal_Dt=STARS_CALENDAR_A.Week_End_Dt 

   AND  STARS_CALENDAR_A.Cal_Dt=CURR_YAGO_DT.Cal_Dt)

   INNER JOIN (

  select RTL_ID RetailID,

       CATG_ID CATGID,

       Shipment_Dt ShipmentDate,

       Sum(Tot_Vol) as Weekly_Volume,

       Sum(Tot_Ytd_Vol)as Ytd_Volume,

       Sum(Tot_Qtd_Vol)as Qtd_Volume,

       Sum(Tot_4wk_Vol)as "4wk_Volume",

       Sum(Tot_13wk_Vol)as "13wk_Volume",

       Sum(Tot_52wk_Vol) as "52wk_Volume"

       from Mim.retail_shipment

       GROUP BY 1,2,3

  )  CATG_STORE_AGG

  ON (RETAIL_SHIPMENT_CY.Shipment_Dt=CATG_STORE_AGG.ShipmentDate

  AND RETAIL_SHIPMENT_CY.CATG_ID=CATG_STORE_AGG.CATGID

  And RETAIL_SHIPMENT_CY.RTL_ID=CATG_STORE_AGG.RetailID)

WHERE

MIM.PRODUCT.Catg_Id  in (1,2,3,4,5)

AND   STARS_CALENDAR_A.cal_dt  IN  ( {d '2012-07-07'}  )

AND   MIM.RETAIL.Territory_Id  IN  ( '330804'  )

AND CATG_STORE_AGG.ShipmentDate  IN

(

--Select cal_dt FROM MIM.STARS_CALENDAR WHERE CAL_DT IN ( '2012-12-26', '2012-11-27')

Select cal_dt FROM MIM.STARS_CALENDAR WHERE CAL_DT IN ( '2012-12-26')

)

GROUP BY   1

Would appreciate any help if someone has faced a similar issue.

Thanks

Manik

1 REPLY

Re: Partition Elimination not happening in Fact through Time Dimension filter.

Hi Manik

I faced similar issue in ETL, but solved using stored procedures.

In my case Partition elimination->join and full table scan>join had time difference of huge 30 Min. 

Filter,

PARTND_RANGE_COLMN BETWEEN '2012-01-01' AND '2012-01-31' 

this goes for partition elimination > join(19 to 30 Sec)

Filter,

PARTND_RANGE_COLMN BETWEEN TABLE_ONE.START_DATE AND TABLE_ONE.END_DATE

this goes for full table scan > join(25 to 30 Min)

TABLE_ONE has only one row with same start and end date.(all stats collected)

Pls share if you find any solution..