Rewrite a query

Database

Hi,

Can you please help me rewrite the query below? It is highly skewed and has a very high impact CPU.

DELETE FROM xyz.#VAU_DETAIL a
WHERE EXISTS
  (SELECT 1
   FROM xyz.VAU_DETAIL b
   WHERE b.O_TRAN_ID = a.O_TRAN_ID
     AND b.ER_ID NOT= a.ER_ID
     AND a.AC_DT >= b.AC_DT);
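
A dry-run version of the same predicate (a sketch only, untested; it reuses the tables and columns from the query above) can confirm how many rows the DELETE would touch before any tuning:

```sql
-- Dry run (sketch): count the rows the DELETE above would remove.
-- Same correlation logic as the DELETE, expressed as a SELECT.
SELECT COUNT(*)
FROM xyz.#VAU_DETAIL a
WHERE EXISTS
  (SELECT 1
   FROM xyz.VAU_DETAIL b
   WHERE b.O_TRAN_ID = a.O_TRAN_ID
     AND b.ER_ID <> a.ER_ID
     AND a.AC_DT >= b.AC_DT);
```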

3 REPLIES
Re: Rewrite a query

Can you post the EXPLAIN output, please?

Could you also give the top result rows of:

SELECT O_TRAN_ID,COUNT(*) FROM xyz.#VAU_DETAIL GROUP BY 1 ORDER BY 2 DESC ?

Thanks

Roland

Roland Wenzlofsky

Re: Rewrite a query

Explain:

1) First, we lock a distinct xyz."pseudo table" for write on
     a RowHash to prevent global deadlock for
     xyz.#VAU_DETAIL.
  2) Next, we lock xyz.#VAU_DETAIL in view
     #V_AOC_USAGE_DETAIL for write.
  3) We execute the following steps in parallel.
       1) We do an all-AMPs RETRIEVE step from
          xyz.#VAU_DETAIL in view #V_AU_DETAIL
          by way of an all-rows scan with no residual conditions into
          Spool 2 (all_amps), which is redistributed by the hash code
          of (xyz.#VAU_DETAIL.O_TRAN_ID) to
          all AMPs.  The size of Spool 2 is estimated with high
          confidence to be 1,922,203 rows (84,576,932 bytes).  The
          estimated time for this step is 0.96 seconds.
       2) We do an all-AMPs RETRIEVE step from
          xyz.#VAU_DETAIL by way of an all-rows scan
          with no residual conditions locking for access into Spool 4
          (all_amps), which is redistributed by the hash code of (
          xyz.#VAU_DETAIL.O_TRAN_ID) to all
          AMPs.  Then we do a SORT to order Spool 4 by the sort key in
          spool field1.  The size of Spool 4 is estimated with high
          confidence to be 1,881,422 rows (152,395,182 bytes).  The
          estimated time for this step is 0.23 seconds.
  4) We do an all-AMPs RETRIEVE step from Spool 4 (Last Use) by way of
     an all-rows scan into Spool 3 (all_amps), which is built locally
     on the AMPs.  The size of Spool 3 is estimated with high
     confidence to be 1,881,422 rows (152,395,182 bytes).
  5) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of an
     all-rows scan, which is joined to Spool 3 (Last Use) by way of an
     all-rows scan.  Spool 2 and Spool 3 are joined using a single
     partition inclusion hash join, with a join condition of (
     "(AC_DT >= AC_DT) AND ((ER_ID <> ER_ID) AND
     ((O_TRAN_ID = O_TRAN_ID) AND (NOT
     (O_TRAN_ID IS NULL ))))").  The result goes into Spool 1
     (all_amps), which is redistributed by the hash code of (
     xyz.#VAU_DETAIL.ROWID) to all AMPs.  Then we do a
     SORT to order Spool 1 by row hash and the sort key in spool field1
     eliminating duplicate rows.  The size of Spool 1 is estimated with
     index join confidence to be 1,922,203 rows (34,599,654 bytes).
     The estimated time for this step is 27.94 seconds.
  6) We do an all-AMPs MERGE DELETE to xyz.#VAU_DETAIL
     in view #V_AU_DETAIL from Spool 1 (Last Use) via the row id.
     The size is estimated with index join confidence to be 1,922,203
     rows.  The estimated time for this step is 3 minutes and 20
     seconds.
  7) We spoil the parser's dictionary cache for the table.
  -> No rows are returned to the user as the result of statement 1.

2: The O_TRAN_ID column has 1,851,017 unique values, and the table has a near-matching row count of 1,890,444.

3: O_TRAN_ID VARCHAR(20)

4: ER_ID UPI

5: Stats are up to date on all the columns in the WHERE clause.

Re: Rewrite a query

"and b.ER_ID not = a.ER_ID" is what's killing your performance. Rewrite it as a left outer join on a.O_TRAN_ID = b.O_TRAN_ID where a.AC_DT > b.AC_DT and b.ER_ID is null (assuming "a" is the left table).
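
One way to read that suggestion (a sketch only, untested; the non-key predicates are placed in the ON clause, as an outer join needs them there for the b.ER_ID IS NULL anti-join check to work, and it is written as a SELECT of the surviving rows rather than as the DELETE itself):

```sql
-- Sketch of the suggested left-outer-join rewrite.
-- Rows of "a" with no qualifying match in "b" survive the
-- anti-join, which is what b.ER_ID IS NULL detects.
SELECT a.*
FROM xyz.#VAU_DETAIL a
LEFT OUTER JOIN xyz.VAU_DETAIL b
  ON  a.O_TRAN_ID = b.O_TRAN_ID
  AND a.AC_DT > b.AC_DT
  AND a.ER_ID <> b.ER_ID
WHERE b.ER_ID IS NULL;
```

Note this lists the rows the original DELETE would keep; to delete the others, you would still need to invert the result set or keep the EXISTS form for the DELETE itself.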