INSERT INTO Db.TbInserted -- SET table with UPI
, 'CG FLTY'
FROM Db.StgTbl1 S -- SET table with NUPI
INNER JOIN Db.StgTbl2 P -- MULTISET NOPI
ON S.PD_DT= P.PD_DT
AND S.SS_ID = P.SS_ID
AND S.OR_ID = P.OR_ID
AND S.FY_ID = P.FY_ID
AND COALESCE(S.BK_NO, 'Z') = COALESCE(P.BK_NO, 'Z')
WHERE COALESCE(S.OR_TY, 'Z') <> COALESCE(P.OR_TY, 'Z');
1) First, we lock a distinct Db."pseudo table" for
read on a RowHash to prevent global deadlock for Db.S.
2) Next, we lock a distinct Db."pseudo table" for
write on a RowHash to prevent global deadlock for
Db.TbInserted.
3) We lock a distinct Db."pseudo table" for read on
a RowHash to prevent global deadlock for Db.P.
4) We lock Db.S for read, we lock
Db.TbInserted for write, and we lock
Db.P for read.
5) We do an all-AMPs RETRIEVE step from Db.P by way
of an all-rows scan with a condition of ("(NOT
(Db.P.FY_ID IS NULL )) AND ((NOT
(Db.P.OR_ID IS NULL )) AND ((NOT
(Db.P.SS_ID IS NULL )) AND (NOT
(Db.P.PD_DT IS NULL ))))") into Spool 2
(all_amps), which is redistributed by the hash code of (
Db.P.FY_ID) to all AMPs. Then we do a SORT
to order Spool 2 by row hash. The size of Spool 2 is estimated
with high confidence to be 887,493 rows (43,487,157 bytes). The
estimated time for this step is 0.38 seconds.
6) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of a
RowHash match scan, which is joined to Db.S by
way of a RowHash match scan with a condition of ("(NOT
(Db.S.OR_ID IS NULL )) AND (NOT
(Db.S.SS_ID IS NULL ))"). Spool 2 and
Db.S are joined using a merge join, with a join
condition of ("(Db.S.PD_DT = PD_DT) AND
((Db.S.SS_ID = SS_ID) AND
((Db.S.OR_ID = OR_ID) AND
((Db.S.FY_ID = FY_ID) AND (((( CASE
WHEN (NOT (Db.S.OR_TY IS NULL )) THEN
(Db.S.OR_TY) ELSE ('Z') END ))<> (( CASE
WHEN (NOT (OR_TY IS NULL )) THEN (OR_TY) ELSE ('Z') END
))) AND ((( CASE WHEN (NOT (Db.S.BK_NO IS NULL
)) THEN (Db.S.BK_NO) ELSE ('Z') END ))= ((
CASE WHEN (NOT (BK_NO IS NULL )) THEN (BK_NO) ELSE ('Z') END
)))))))"). The result goes into Spool 1 (all_amps), which is
redistributed by the hash code of (
'CG FLTY', Db.S.SS_ID (VARCHAR(10),
CHARACTER SET LATIN, NOT CASESPECIFIC, NAMED SS_ID, FORMAT
'X(10)', NULL), Db.S.OR_ID,
'OR_TY', CAST((CURRENT_TIMESTAMP(6)) AS TIMESTAMP(6))) to all
AMPs. Then we do a SORT to order Spool 1 by row hash. The size
of Spool 1 is estimated with index join confidence to be 887,493
rows (717,981,837 bytes). The estimated time for this step is
7) We do an all-AMPs MERGE into Db.TbInserted
from Spool 1 (Last Use). The size is estimated with index join
confidence to be 887,493 rows. The estimated time for this step
is 3 minutes and 33 seconds.
8) We spoil the parser's dictionary cache for the table.
9) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> No rows are returned to the user as the result of statement 1.
I need help tuning this query; I need to reduce the estimated time further. I tried the following, but it did not help much:
1. Changing the COALESCE to a CASE expression in a subquery.
2. Converting TbInserted to MULTISET, which reduces the estimated time, but the gain is negligible.
Any suggestions on how to improve it further?
- COALESCE is just shorthand for a CASE expression; see the EXPLAIN.
- For a table with a UPI there's no real difference between SET and MULTISET.
- Estimated time doesn't matter; what's the actual runtime?
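The first point is visible in the EXPLAIN above: the parser expands each COALESCE into the equivalent CASE expression before optimization, so rewriting it by hand cannot change the plan. Side by side, using the predicate from the query:

```sql
-- The original predicate:
WHERE COALESCE(S.OR_TY, 'Z') <> COALESCE(P.OR_TY, 'Z')

-- What the parser turns it into (compare step 6 of the EXPLAIN):
WHERE (CASE WHEN S.OR_TY IS NOT NULL THEN S.OR_TY ELSE 'Z' END)
   <> (CASE WHEN P.OR_TY IS NOT NULL THEN P.OR_TY ELSE 'Z' END)
```

Both forms produce the same join condition in step 6, which is why attempt 1 made no measurable difference.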
There's no way to tune this INSERT/SELECT itself; only DDL changes might help: partitioning the target table and/or adding a PI to the staging table.
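A sketch of those DDL changes. Step 5 of the EXPLAIN redistributes Db.StgTbl2 (NOPI) by the hash of FY_ID to line it up with Db.StgTbl1, so giving the staging table a matching NUPI would let the merge join run without that redistribution; partitioning the target by the join date could cheapen the final MERGE step. Column types and the exact PI/partition columns below are placeholders, since the original DDL isn't shown:

```sql
-- Hypothetical: match the staging table's PI to the column the optimizer
-- already redistributes on (FY_ID in step 5 of the EXPLAIN), so the spool
-- and sort before the merge join can be skipped.
CREATE MULTISET TABLE Db.StgTbl2_new AS Db.StgTbl2 WITH NO DATA
PRIMARY INDEX ( FY_ID );

-- Hypothetical: partition the target on the date column used in the join,
-- so each insert batch touches fewer data blocks in the MERGE step.
-- Real column list, types, PI, and date range must come from the actual DDL.
CREATE MULTISET TABLE Db.TbInserted_new
     ( PD_DT DATE
     , SS_ID VARCHAR(10) CHARACTER SET LATIN
     , OR_ID INTEGER   -- placeholder type
     , FY_ID INTEGER   -- placeholder type
     , BK_NO VARCHAR(10)  -- placeholder type
     )
PRIMARY INDEX ( SS_ID, OR_ID )  -- placeholder PI columns
PARTITION BY RANGE_N(PD_DT BETWEEN DATE '2010-01-01'
                               AND DATE '2030-12-31'
                               EACH INTERVAL '1' DAY);
```

Whether either change pays off depends on the actual data volumes and how the target is queried afterwards; verify with DBQL runtimes, not the estimate.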