Here is the EXPLAIN, which is straightforward, but sometimes the same query runs in 10-15 minutes and sometimes it takes longer than 4-5 hours for the same data volume.
EXPLAIN
SELECT COOKIE_ID,
       CAST(MAX(VISIT_START_DTTM) AS DATE) AS LATEST_VISIT_DATE,
       MAX(CASE WHEN CAST(VISIT_START_DTTM AS DATE) BETWEEN '2009-05-01' AND '2010-04-30'
                THEN 'Y' ELSE 'N' END) AS ACTIVITY_2yearsbefore,
       MAX(CASE WHEN CAST(VISIT_START_DTTM AS DATE) BETWEEN '2010-05-01' AND '2011-04-30'
                THEN 'Y' ELSE 'N' END) AS ACTIVITY_prevyear,
       MAX(CASE WHEN CAST(VISIT_START_DTTM AS DATE) BETWEEN '2011-05-01' AND '2012-04-30'
                THEN 'Y' ELSE 'N' END) AS ACTIVITY_currentyear
FROM CDW.VISIT A
WHERE A.LOCATION_ID = 2
  AND CAST(VISIT_START_DTTM AS DATE) <= '2012-04-30'
GROUP BY COOKIE_ID
1) First, we lock CDW_Tables.VISIT in view CDW.VISIT for access.
2) Next, we do an all-AMPs SUM step to aggregate from
CDW_Tables.VISIT in view CDW.VISIT by way of an all-rows scan with
a condition of ("(CDW_Tables.VISIT in view CDW.VISIT.Location_Id =
2) AND ((CAST((CDW_Tables.VISIT in view CDW.VISIT.Visit_Start_Dttm)
AS DATE))<= DATE '2012-04-30')") , grouping by field1 (
CDW_Tables.VISIT.Cookie_Id). Aggregate Intermediate Results are
computed globally, then placed in Spool 4. The input table will
not be cached in memory, but it is eligible for synchronized
scanning. The aggregate spool file will not be cached in memory.
The size of Spool 4 is estimated with no confidence to be
141,283,585 rows (7,911,880,760 bytes). The estimated time for
this step is 2 minutes and 23 seconds.
3) We do an all-AMPs RETRIEVE step from Spool 4 (Last Use) by way of
an all-rows scan into Spool 2 (group_amps), which is built locally
on the AMPs. The result spool file will not be cached in memory.
The size of Spool 2 is estimated with no confidence to be
141,283,585 rows (6,075,194,155 bytes). The estimated time for
this step is 6.21 seconds.
4) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 2 are sent back to the user as the result of
statement 1. The total estimated time is 2 minutes and 29 seconds.
The UPI on the VISIT table is on VISIT_ID. I'm not sure why the query would perform so differently under different CPU loads.
Any ways to tune this query? The table does not have any PPIs, and a PPI would make no difference to this query anyway, since CAST(VISIT_START_DTTM AS DATE) <= '2012-04-30' actually pulls all the data: the MIN(VISIT_START_DTTM) in the table itself is 2009-05-01.
Are these queries logged in DBQL?
If yes, check whether the resource consumption is really different between runs. You might be facing system workload issues.
Yes, they are in DBQL. What exactly do I need to look at to understand resource consumption? Do I need to look at HotAmpCPU, CPU Skew, I/O Skew, etc.? Appreciate your response.
No upgrade as far as I know. I looked at the DBQLogTbl and the AMP CPU Time was 18,367. Is this too high?
AMP CPU Time | MaxAmpCPUTime | ParserCPUTime | TotalIOCount | MaxAmpIO | TDWMEstTotalTime
Let me know your thoughts on this!!
Looks like it could be a statistics problem to me ("no confidence" everywhere in the plan). Collect the necessary stats, rerun the query, and then see if your plan is much better.
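For the columns referenced in this query, the stats collection might look something like this (a sketch only; adjust the database/table names to your DDL, and note the base table is CDW_Tables.VISIT per the plan):

```sql
-- Sketch: collect stats on the columns the optimizer uses here.
-- LOCATION_ID feeds the WHERE filter, COOKIE_ID feeds the GROUP BY,
-- and VISIT_START_DTTM feeds the date comparisons.
COLLECT STATISTICS ON CDW_Tables.VISIT COLUMN (LOCATION_ID);
COLLECT STATISTICS ON CDW_Tables.VISIT COLUMN (COOKIE_ID);
COLLECT STATISTICS ON CDW_Tables.VISIT COLUMN (VISIT_START_DTTM);
```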
You posted only one query log entry.
My understanding is that the different runs differ in execution/elapsed time. Do they also differ in AMPCPUTime, TotalIOCount, etc.?
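To compare the runs side by side, a query against DBQL along these lines might help (a sketch, assuming default query logging into DBC.DBQLogTbl; HASHAMP()+1 gives the number of AMPs, and a CPU skew factor well above 1 means one AMP did most of the work):

```sql
-- Sketch: resource usage per run of the problem query.
SELECT StartTime,
       AMPCPUTime,
       MaxAMPCPUTime,
       TotalIOCount,
       MaxAmpIO,
       (MaxAMPCPUTime * (HASHAMP() + 1)) / NULLIFZERO(AMPCPUTime) AS CPUSkewFactor
FROM DBC.DBQLogTbl
WHERE QueryText LIKE '%CDW.VISIT%'
ORDER BY StartTime;
```

If AMPCPUTime and TotalIOCount are roughly the same across runs but elapsed time is not, the query is likely waiting on concurrent workload rather than doing more work.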
How could this plan be improved? It's doing a plain aggregation, no joins, etc. Stats might give better estimates but will not change the plan.
I would say it is generally a good idea to collect them anyway. Whether it improves the query... who knows... pretty low-hanging fruit, though.
It is evident from the plan that the query is unable to use the PI of the table and is pulling the entire table into a global spool, meaning the data is first pulled into spool and then processed. I would try the following options:
(a) See if COOKIE_ID can be made the PI of the table without skewing it too much. This makes all the aggregation happen locally on each AMP and run very fast even at high data volumes, owing to MPP. This may be the only real option worth trying; the rest are guesses with a very slim chance of working.
(b) If the PI cannot be changed, a PPI with date ranges may work here if it is aligned with your CASE-statement filters, although I have a strong feeling it won't.
(c) Lastly, try converting your CASE statements into a small control table that stores the boundary values for each case and join it to your master table. With correct stats, the optimizer should replicate this table across all AMPs and partially restrict the data that goes into spool. Again, the catch is that if the filter conditions pull out the entire table anyway, this makes no difference.
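For reference, options (a) and (b) as DDL sketches. These are illustrations only: the target table names are hypothetical, and you would test on a copy before touching the real table.

```sql
-- (a) Sketch: copy with COOKIE_ID as the (non-unique) PI, so
-- GROUP BY COOKIE_ID aggregates locally on each AMP.
CREATE TABLE CDW_Tables.VISIT_NUPI AS CDW_Tables.VISIT WITH DATA
PRIMARY INDEX (COOKIE_ID);

-- (b) Sketch: same copy with a yearly PPI on the visit date,
-- aligned to the May-to-April boundaries of the CASE filters.
CREATE TABLE CDW_Tables.VISIT_PPI AS CDW_Tables.VISIT WITH DATA
PRIMARY INDEX (COOKIE_ID)
PARTITION BY RANGE_N(CAST(VISIT_START_DTTM AS DATE)
    BETWEEN DATE '2009-05-01' AND DATE '2013-04-30'
    EACH INTERVAL '1' YEAR,
    NO RANGE, UNKNOWN);
```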
I'm curious to hear more tuning options from others based on only the explain plan.