Need help in tuning a Query

Database
Enthusiast

Need help in tuning a Query

Hi All,

The below Query is facing Spool space issue while executing ...

SEL I.DLVR_POINT_NO, SUM(I.SALES) AS InvoiceAmtLast12, CAST(NULL AS DECIMAL(18,2)) AS InvoiceAmtLast24, 

CAST(NULL AS INTEGER) AS CalcGrowth FROM

VOLUME_TABLE  I

WHERE I.SETL_DT <= (add_months((CURRENT_DATE - extract(day from CURRENT_DATE)+1),0)-1)

AND I.SETL_DT >= add_months(CURRENT_DATE - extract(day from CURRENT_DATE)+1,-12)

GROUP BY 1

UNION

SEL I.DLVR_POINT_NO,  NULL   AS InvoiceAmtLast12, SUM(I.SALES) AS InvoiceAmtLast24, NULL AS CalcGrowth FROM

VOLUME_TABLE I

WHERE I.SETL_DT <= add_months((CURRENT_DATE - extract(day from CURRENT_DATE)+1),-12)-1

AND I.SETL_DT >= add_months(CURRENT_DATE - extract(day from CURRENT_DATE)+1,-24) 

GROUP BY 1

UNION

SEL DISTINCT I.DLVR_POINT_NO, NULL AS InvoiceAmtLast12, NULL AS InvoiceAmtLast24, 1 AS CalcGrowth FROM

VOLUME_TABLE I

WHERE I.SETL_DT < add_months(CURRENT_DATE - extract(day from CURRENT_DATE)+1,-24) 

I tried creating an volatile table to load all the SELECT part of data first but that is also taking longer time and ending up with spool issue.Can anyone please suggest other alternate approach or help in tuning the Query..

P.S : The Volume table contains around 5billion rows..

Thanks..

4 REPLIES
Junior Contributor

Re: Need help in tuning a Query

A quick solution is to change the UNION to UNION all, this avoids the DISTINCT processing.

In best case the table is partitioned by  I.SETL_DT.

You might also try to combine all three selects into one using CASEs:

SEL I.DLVR_POINT_NO
,SUM(CASE WHEN I.SETL_DT <= (ADD_MONTHS((CURRENT_DATE - EXTRACT(DAY FROM CURRENT_DATE)+1),0)-1)
AND I.SETL_DT >= ADD_MONTHS(CURRENT_DATE - EXTRACT(DAY FROM CURRENT_DATE)+1,-12)
THEN I.SALES end) AS InvoiceAmtLast12
,SUM(CASE WHEN I.SETL_DT <= ADD_MONTHS((CURRENT_DATE - EXTRACT(DAY FROM CURRENT_DATE)+1),-12)-1
AND I.SETL_DT >= ADD_MONTHS(CURRENT_DATE - EXTRACT(DAY FROM CURRENT_DATE)+1,-24)
THEN I.SALES end) AS InvoiceAmtLast24
,MIN(CASE WHEN I.SETL_DT < ADD_MONTHS(CURRENT_DATE - EXTRACT(DAY FROM CURRENT_DATE)+1,-24)
THEN 1 end) AS CalcGrowth
FROM VOLUME_TABLE I
WHERE I.SETL_DT <= (ADD_MONTHS((CURRENT_DATE - EXTRACT(DAY FROM CURRENT_DATE)+1),0)-1)
GROUP BY 1
,CASE WHEN I.SETL_DT <= (ADD_MONTHS((CURRENT_DATE - EXTRACT(DAY FROM CURRENT_DATE)+1),0)-1)
AND I.SETL_DT >= ADD_MONTHS(CURRENT_DATE - EXTRACT(DAY FROM CURRENT_DATE)+1,-12)
THEN 1
WHEN I.SETL_DT <= ADD_MONTHS((CURRENT_DATE - EXTRACT(DAY FROM CURRENT_DATE)+1),-12)-1
AND I.SETL_DT >= ADD_MONTHS(CURRENT_DATE - EXTRACT(DAY FROM CURRENT_DATE)+1,-24)
THEN 2
ELSE 3
END

You should runt some tests which one is more efficient on this big table, it will depend on the actual data...

Enthusiast

Re: Need help in tuning a Query

Hi Dieter,

Thank you for your suggestion..UNION ALL is working better than UNION,reducing the run time from 69minutes to 58mins but consuming more spool space of about 1100GB which will not be provided to an USER ID always..

I also tried with sum(CASE WHEN ) condition but the estimated run time is around 90minutes..

Is there any other alternate way to reduce the spool limit ?

Once again thanks for your help

-Lalitha Chintha

Junior Contributor

Re: Need help in tuning a Query

Hi Lalitha,

if you sum up 5 billion rows you will always need a lot spool space :-)

Estimated times are just estimates, actual run time might be totall different.

Wall clock times can't be compared, too, they will vary greatly based on the system load. You need to check the QueryLog for the actual resource usage.

Can you show the indexes/partitioning of that table and the statistics of I.DLVR_POINT_NO? 

Enthusiast

Re: Need help in tuning a Query

you might need to break this down from a single statement into multiple statements.

for the first sql in the union, you might try breaking it down and aggregating by quarters (3 months of data) then bring that into a volatile table and then aggregate the 4 quarters into a year.  This might get you around the spool limits since based on your comments you can't get a higher spool.  Do the same thing for the other sql's in the union and by doing this you can potentially reduce the spool required at the final aggregation step to get the final result.  By making the steps smaller it might end up being faster than running the single step (not as elegant but sometimes function/performance beats looks).

Another issue could be the I.DLVR_POINT_NO isn't very unqiue causing most of the aggregation to be done on a few amps which could be causing spool problems as well by most of the data being on just a handful of amps, so you might look and see if your having skew issues as well with this aggregation.