Tune query

3 REPLIES
Enthusiast

Re: Tune query

Explain INSERT INTO DB1.Table1
SELECT S.DT
     , 'STG'
     , 'NF'
     , S.SID
     , S.OID
     , S.FID
     , S.B_NUM
     , NULL
     , NULL
     , NULL
     , NULL
     , S.LOB
     , CURRENT_DATE
     , 999
     , CURRENT_TIMESTAMP
FROM DB1.STAGING1 S
LEFT OUTER JOIN DB1.STAGING_PRIOR P
  ON  S.DT  = P.DT
  AND S.SID = P.SID
  AND S.OID = P.OID
  AND S.FID = P.FID
  AND COALESCE(S.B_NUM, 'Z') = COALESCE(P.B_NUM, 'Z')
WHERE P.DT IS NULL;
  1) First, we lock a distinct DB1."pseudo table" for read on a
     RowHash to prevent global deadlock for DB1.P.
  2) Next, we lock a distinct DB1."pseudo table" for write on a
     RowHash to prevent global deadlock for
     DB1.Table1.
  3) We lock a distinct DB1."pseudo table" for read on a
     RowHash to prevent global deadlock for DB1.S.
  4) We lock DB1.P for read, we lock
     DB1.Table1 for write, and we lock
     DB1.S for read.
  5) We do an all-AMPs JOIN step from DB1.S by way of a RowHash
     match scan with no residual conditions, which is joined to
     DB1.P by way of a RowHash match scan with a condition of (
     "(NOT (DB1.P.OID IS NULL )) AND ((NOT
     (DB1.P.SID IS NULL )) AND (NOT (DB1.P.DT
     IS NULL )))").  DB1.S and DB1.P are left outer
     joined using a merge join, with condition(s) used for non-matching
     on left table ("(NOT (DB1.S.FID IS NULL )) AND
     ((NOT (DB1.S.OID IS NULL )) AND ((NOT
     (DB1.S.SID IS NULL )) AND (NOT (DB1.S.DT
     IS NULL ))))"), with a join condition of ("((( CASE WHEN (NOT
     (DB1.S.B_NUM IS NULL )) THEN (DB1.S.B_NUM)
     ELSE ('Z') END ))= (( CASE WHEN (NOT (DB1.P.B_NUM IS
     NULL )) THEN (DB1.P.B_NUM) ELSE ('Z') END ))) AND
     ((DB1.S.FID = DB1.P.FID) AND
     ((DB1.S.OID = DB1.P.OID) AND
     ((DB1.S.SID = DB1.P.SID) AND
     (DB1.S.DT = DB1.P.DT ))))").  The
     result goes into Spool 2 (all_amps), which is built locally on the
     AMPs.  The size of Spool 2 is estimated with low confidence to be
     887,490 rows (62,124,300 bytes).  The estimated time for this step
     is 0.27 seconds.
  6) We do an all-AMPs RETRIEVE step from Spool 2 (Last Use) by way of
     an all-rows scan with a condition of ("DT IS NULL") into
     Spool 1 (all_amps), which is redistributed by the hash code of (
     DB1.S.DT, 'STG', 'NEW
     FACILITY', DB1.S.SID (VARCHAR(10), CHARACTER SET LATIN,
     NOT CASESPECIFIC, NAMED SID, FORMAT 'X(10)', NULL),
     DB1.S.OID, DB1.S.FID,
     DB1.S.B_NUM, NULL (VARCHAR(30), CHARACTER SET LATIN,
     NOT CASESPECIFIC), CAST((CURRENT_TIMESTAMP(6)) AS TIMESTAMP(6)))
     to all AMPs.  Then we do a SORT to order Spool 1 by row hash.  The
     size of Spool 1 is estimated with low confidence to be 887,490
     rows (717,979,410 bytes).  The estimated time for this step is
     0.25 seconds.
  7) We do an all-AMPs MERGE into DB1.Table1 from
     Spool 1 (Last Use).  The size is estimated with low confidence to
     be 887,490 rows.  The estimated time for this step is 3 minutes
     and 33 seconds.
  8) We spoil the parser's dictionary cache for the table.
  9) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
  -> No rows are returned to the user as the result of statement 1.

I need to tune this query so that the estimated time comes below 3 minutes. The PI of the two joining tables is the first 3 columns being joined. The last join column, B_NUM, however, has some skew for B_NUM = '01' and for NULLs. I tried converting the COALESCE into a derived table, so that we do not have to use the COALESCE function in the join. I also tried a UNION ALL, with one query handling the NULLs and the other not handling them. The table being inserted into is MULTISET with a UPI.
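For reference, the UNION ALL rewrite mentioned above can be sketched as follows. This is only a sketch of that idea, reusing the column names from the query above: one branch joins the non-NULL B_NUM values with a plain equality, the other matches the NULL-to-NULL case, so neither branch needs COALESCE in the join condition.

```sql
-- Sketch of the UNION ALL approach: split the anti-join into one
-- branch for non-NULL B_NUM (plain equality, no COALESCE) and one
-- branch where B_NUM is NULL on both sides.
INSERT INTO DB1.Table1
SELECT S.DT, 'STG', 'NF', S.SID, S.OID, S.FID, S.B_NUM,
       NULL, NULL, NULL, NULL, S.LOB, CURRENT_DATE, 999, CURRENT_TIMESTAMP
FROM DB1.STAGING1 S
LEFT OUTER JOIN DB1.STAGING_PRIOR P
  ON  S.DT  = P.DT
  AND S.SID = P.SID
  AND S.OID = P.OID
  AND S.FID = P.FID
  AND S.B_NUM = P.B_NUM          -- non-NULL branch: direct equality
WHERE S.B_NUM IS NOT NULL
  AND P.DT IS NULL
UNION ALL
SELECT S.DT, 'STG', 'NF', S.SID, S.OID, S.FID, S.B_NUM,
       NULL, NULL, NULL, NULL, S.LOB, CURRENT_DATE, 999, CURRENT_TIMESTAMP
FROM DB1.STAGING1 S
LEFT OUTER JOIN DB1.STAGING_PRIOR P
  ON  S.DT  = P.DT
  AND S.SID = P.SID
  AND S.OID = P.OID
  AND S.FID = P.FID
  AND P.B_NUM IS NULL            -- NULL branch: match only NULL priors
WHERE S.B_NUM IS NULL
  AND P.DT IS NULL;
```

Whether this helps depends on how the optimizer plans each branch; it avoids the CASE-wrapped join term in the Explain, but both branches still scan the staging tables.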

Any suggestions are appreciated.

Enthusiast

Re: Tune query

Experts, any suggestions?

Senior Apprentice

Re: Tune query

Why do you care about the estimated time? 

Did you check the actual numbers in the QueryLog?
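For example, the actual elapsed time and resource usage can be pulled from DBQL. This is a sketch only, assuming query logging is enabled and you have SELECT access to the DBC.QryLogV view; the LIKE filter and time window are illustrative, adjust them to locate your request:

```sql
-- Sketch: compare actual runtime and CPU/IO against the Explain estimate.
-- Assumes DBQL is enabled; narrow the filter to find your INSERT.
SELECT QueryID,
       StartTime,
       FirstRespTime,
       AMPCPUTime,
       TotalIOCount,
       QueryText
FROM   DBC.QryLogV
WHERE  QueryText LIKE '%DB1.Table1%'
AND    StartTime >= CURRENT_TIMESTAMP - INTERVAL '1' DAY
ORDER  BY StartTime DESC;
```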

Based on the Explain, everything is fine.

The join is on the PI columns, followed by a redistribution, both estimated to run in half a second.

And the final MERGE depends on your target table: unless you change its PI to match the source PI, there's no way to speed this up.