Data skew /qry running long

Database
Enthusiast

Data skew /qry running long


Hi Dieter,
Given is the qry and explain plan and the query runs long. One of the tables is huge here but I still suspect that the view is a problem especially because it reditributes by hash code in step 2 which takes 22 mins. Any idea why this step takes so long? And what could be done to fix.
Another qsn was when we join multiple columns, should we check for data skew for each join column separately or all join columns at once as shown :
SELECT HASHAMP(HASHBUCKET(HASHROW(col1,col2,col3))) ,COUNT(*)

FROM table WHERE x is not null
GROUP BY 1
ORDER BY 2 DESC;

7 REPLIES
Enthusiast

Re: Data skew /qry running long

Explain SELECT CIS_ACCT_ID,
              a.CO_NR ,
               a.CIS_ACCT_PD_CD,
              T.tran_id
          
FROM   DB1.TEMP  a    
 
 JOIN VIEWDB2..XREF  T    
 ON a.CO_NR  = T.CO_NR
 AND a.PD_CD = T.PD_CD
 AND a.ACCT = T.ACCT_ID
 WHERE T.TRN_DT BETWEEN DATE'2014-07-01' AND DATE'2014-10-30'
 AND T.PST_TRN_CD IS NOT NULL

  1) First, we lock DB1.a for access, we lock
     TBLDB.DLY_ACTin view
     VIEWDB2..XREF  for access, we lock
     TBLDB.DLY_XREF in view
     VIEWDB2..XREF  for access, and we lock
     TBLDB.DLY_ITM in view
     VIEWDB2..XREF  for access.
  2) Next, we execute the following steps in parallel.
       1) We do an all-AMPs RETRIEVE step from 92 partitions of
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF  with a condition of (
          "(NOT (TBLDB.DLY_ACTin view
          VIEWDB2..XREF .PST_TRN_CD IS NULL
          )) AND ((NOT (TBLDB.DLY_ACTin view
          VIEWDB2..XREF .PRDT_CD IS NULL ))
          AND ((TBLDB.DLY_ACTin view
          VIEWDB2..XREF .TRN_DT >= DATE
          '2014-07-01') AND (TBLDB.DLY_ACTin view
          VIEWDB2..XREF .TRN_DT <= DATE
          '2014-10-30')))") into Spool 1 (all_amps), which is
          redistributed by the hash code of (TBLDB.DLY_ACTin
          view
          VIEWDB2..XREF .SIT_NM_TNT,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .SIT_NM,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .FINT__NM_FORTENT,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .FINT__NM,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .BENE_CURR_TYP_CD,
          TBLDB.DLY_TRAN_ACCT.BENE_CURR_AM, TBLDB.DLY_TRAN_ACCT
          in view
          VIEWDB2..XREF .BENE_PTY_ADDR_3,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .BENE_PTY_ADDR_2,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .BENE_PTY_ADDR_1,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .RCVNG_CNTRY_CD,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .RCVNG__NM_FORTENT,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .RCVNG__NM,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .RCVNG_PTY_ACCT_NR,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .RCVNG_ACCT_NR,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .RCVNG_ACCT_NM,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .ORGNT_CURR_TYP_CD,
          TBLDB.DLY_TRAN_ACCT.ORGNT_CURR_AM,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .ORGNT_CNTRY_CD,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .ORGNT_PTY_ADDR_3,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .ORGNT_PTY_ADDR_2,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .ORGNT_PTY_ADDR_1,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .ORGNT__NM_FORTENT,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .ORGNT__NM,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .ORGNT_PTY_ACCT_NR,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .ORGNT_ACCT_NR,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .ORGNT_ACCT_NM,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .SRC_SYSTEM_APPL_ID,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .RVRSL_IN,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .ACH_SEC,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .MISC_SRC_CD,
          TBLDB.DLY_TRAN_ACCT.LOAD_TS,
          TBLDB.DLY_TRAN_ACCT.SRC_SYSTEM_NR,
          TBLDB.DLY_TRAN_ACCT.EXTND_SRC_SYSTEM_NR,
          TBLDB.DLY_TRAN_ACCT.UNIT_NR,
          TBLDB.DLY_TRAN_ACCT.UNIT_ID, TBLDB.DLY_ACTin
          view VIEWDB2..XREF .TRAN_TYP_NM,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .TRAN_NARRATION_TX,
          TBLDB.DLY_TRAN_ACCT.TRAN_DT, (CASE WHEN
          ((TBLDB.DLY_ACTin view
          VIEWDB2..XREF .UNIT_NR IS NULL) OR
          (TBLDB.DLY_ACTin view
          VIEWDB2..XREF .UNIT_NR = 0 )) THEN
          (NULL) ELSE ('0'||TBLDB.DLY_ACTin view
          VIEWDB2..XREF .BRANCH_NR) END),
          TBLDB.DLY_TRAN_ACCT.TRAN_AM, TBLDB.DLY_ACTin
          view VIEWDB2..XREF .SPLIT_IN,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .SERIAL_NR,
          TBLDB.DLY_TRAN_ACCT.SEQ_NR,
          TBLDB.DLY_TRAN_ACCT.RULE_NR, TBLDB.DLY_ACTin
          view VIEWDB2..XREF .PST_TRAN_CD,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .ORGNT_SOR_CD,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .NET_IMPCT_IN,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .LOW_ACCT_TYP_CD,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .HIGH_RISK_CD,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .HIGH_ACCT_TYP_CD,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .DR_CR_CD,
          TBLDB.DLY_TRAN_ACCT.CST_NR,
          TBLDB.DLY_TRAN_ACCT.CST_CO_NR, NULL (CHAR(3),
          CHARACTER SET LATIN, NOT CASESPECIFIC), NULL (CHAR(23),
          CHARACTER SET LATIN, NOT CASESPECIFIC), NULL (SMALLINT),
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .BRANCH_NR,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .BASE_IN,
          SUBSTR(TBLDB.DLY_ACTin view
          VIEWDB2..XREF .AUX_SRC_TRAN_ID ,9 ,12),
          SUBSTR(TBLDB.DLY_ACTin view
          VIEWDB2..XREF .AUX_SRC_TRAN_ID ,1 ,8),
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .AUX_SRC_TRAN_ID,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .AML_SRC_CD,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .ABA_NR,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .ACT_NR,
          TBLDB.DLY_ACTin view
          VIEWDB2..XREF .PRODUCT_CD,
          TBLDB.DLY_TRAN_ACCT.COMPANY_ID,
          TBLDB.DLY_TRAN_ACCT.PROCESS_DT,
          TBLDB.DLY_TRAN_ACCT.TRAN_ID) to all AMPs.  The input
          table will not be cached in memory, but it is eligible for
          synchronized scanning.  The result spool file will not be
          cached in memory.  The size of Spool 1 is estimated with no
          confidence to be 4,364,185,801 rows (8,610,538,585,373 bytes).
          The estimated time for this step is 22 minutes and 6 seconds.
       2) We do an all-AMPs RETRIEVE step from 92 partitions of
          TBLDB.DLY_XREF in view
          VIEWDB2..XREF  with a condition of (
          "(NOT (TBLDB.DLY_XREF in view
          VIEWDB2..XREF .PRDT_CD IS NULL ))
          AND ((TBLDB.DLY_XREF in view
          VIEWDB2..XREF .TRN_DT >= DATE
          '2014-07-01') AND (TBLDB.DLY_XREF in view
          VIEWDB2..XREF .TRN_DT <= DATE
          '2014-10-30'))") into Spool 2 (all_amps) (compressed columns
          allowed), which is built locally on the AMPs.  Then we do a
          SORT to partition Spool 2 by rowkey.  The size of Spool 2 is
          estimated with low confidence to be 518,951,521 rows (
          50,338,297,537 bytes).  The estimated time for this step is
          31.26 seconds.
  3) We execute the following steps in parallel.
       1) We do an all-AMPs JOIN step from 92 partitions of
          TBLDB.DLY_ITM in view
          VIEWDB2..XREF  by way of a RowHash
          match scan with a condition of ("(NOT
          (TBLDB.DLY_ITM in view
          VIEWDB2..XREF .PST_TRN_CD IS NULL
          )) AND ((TBLDB.DLY_ITM in view
          VIEWDB2..XREF .TRN_DT >= DATE
          '2014-07-01') AND (TBLDB.DLY_ITM in view
          VIEWDB2..XREF .TRN_DT <= DATE
          '2014-10-30'))"), which is joined to Spool 2 (Last Use) by
          way of a RowHash match scan.  TBLDB.DLY_ITM and
          Spool 2 are joined using a rowkey-based merge join, with a
          join condition of ("(TRN_DT =
          TBLDB.DLY_ITM.TRAN_DT) AND (ITEM_TRAN_ID =
          TBLDB.DLY_ITM.TRAN_ID)").  The result goes into
          Spool 1 (all_amps), which is redistributed by the hash code
          of (NULL (VARCHAR(35), CHARACTER SET LATIN, NOT CASESPECIFIC),
          NULL (VARCHAR(35), CHARACTER SET LATIN, NOT CASESPECIFIC),
          NULL (VARCHAR(35), CHARACTER SET LATIN, NOT CASESPECIFIC),
          NULL (VARCHAR(35), CHARACTER SET LATIN, NOT CASESPECIFIC),
          NULL (CHAR(3), CHARACTER SET LATIN, NOT CASESPECIFIC), NULL
          (DECIMAL(18,2)), NULL (VARCHAR(40), CHARACTER SET LATIN, NOT
          CASESPECIFIC), NULL (VARCHAR(40), CHARACTER SET LATIN, NOT
          CASESPECIFIC), NULL (VARCHAR(40), CHARACTER SET LATIN, NOT
          CASESPECIFIC), NULL (CHAR(2), CHARACTER SET LATIN, NOT
          CASESPECIFIC), NULL (VARCHAR(35), CHARACTER SET LATIN, NOT
          CASESPECIFIC), NULL (VARCHAR(35), CHARACTER SET LATIN, NOT
          CASESPECIFIC), NULL (VARCHAR(35), CHARACTER SET LATIN, NOT
          CASESPECIFIC), NULL (VARCHAR(29), CHARACTER SET LATIN, NOT
          CASESPECIFIC), NULL (VARCHAR(50), CHARACTER SET LATIN, NOT
          CASESPECIFIC), NULL (CHAR(3), CHARACTER SET LATIN, NOT
          CASESPECIFIC), NULL (DECIMAL(18,2)), NULL (CHAR(2), CHARACTER
          SET LATIN, NOT CASESPECIFIC), NULL (VARCHAR(40), CHARACTER
          SET LATIN, NOT CASESPECIFIC), NULL (VARCHAR(40), CHARACTER
          SET LATIN, NOT CASESPECIFIC), NULL (VARCHAR(40), CHARACTER
          SET LATIN, NOT CASESPECIFIC), NULL (VARCHAR(35), CHARACTER
          SET LATIN, NOT CASESPECIFIC), NULL (VARCHAR(35), CHARACTER
          SET LATIN, NOT CASESPECIFIC), NULL (VARCHAR(35), CHARACTER
          SET LATIN, NOT CASESPECIFIC), NULL (VARCHAR(29), CHARACTER
          SET LATIN, NOT CASESPECIFIC), NULL (VARCHAR(50), CHARACTER
          SET LATIN, NOT CASESPECIFIC),
          TBLDB.DLY_ITM.SRC_SYSTEM_APPL_ID,
          TBLDB.DLY_ITM.RVRSL_IN,
          TBLDB.DLY_ITM.ACH_SEC,
          TBLDB.DLY_ITM.MISC_SRC_CD,
          TBLDB.DLY_TRAN_XREF.LOAD_TS,
          TBLDB.DLY_ITM.SRC_SYSTEM_NR,
          TBLDB.DLY_ITM.EXTND_SRC_SYSTEM_NR,
          TBLDB.DLY_ITM.UNIT_NR,
          TBLDB.DLY_ITM.UNIT_ID,
          TBLDB.DLY_ITM.TRAN_TYP_NM,
          TBLDB.DLY_ITM.TRAN_NARRATION_TX,
          TBLDB.DLY_TRAN_XREF.TRAN_DT,
          TBLDB.DLY_ITM.TRAN_BRANCH_AU_NR,
          TBLDB.DLY_ITM.TRAN_AM,
          TBLDB.DLY_ITM.SPLIT_IN,
          TBLDB.DLY_ITM.SERIAL_NR,
          TBLDB.DLY_ITM.SEQ_NR,
          TBLDB.DLY_ITM.RULE_NR,
          TBLDB.DLY_ITM.PST_TRAN_CD,
          TBLDB.DLY_ITM.ORGNT_SOR_CD,
          TBLDB.DLY_ITM.NET_IMPCT_IN,
          TBLDB.DLY_TRAN_XREF.LOW_ACCT_TYP_CD,
          TBLDB.DLY_ITM.HIGH_RISK_CD,
          TBLDB.DLY_TRAN_XREF.HIGH_ACCT_TYP_CD,
          TBLDB.DLY_ITM.DR_CR_CD,
          TBLDB.DLY_TRAN_XREF.CST_NR,
          TBLDB.DLY_TRAN_XREF.CST_CO_NR,
          TBLDB.DLY_ITM.CPRTY_ACCT_PD_CD,
          TBLDB.DLY_ITM.CPRTY_ACCT_NR,
          TBLDB.DLY_ITM.CPRTY_ACCT_CO_NR,
          TBLDB.DLY_ITM.BRANCH_NR,
          TBLDB.DLY_ITM.BASE_IN,
          SUBSTR(TBLDB.DLY_ITM.AUX_SRC_TRAN_ID ,9 ,12),
          SUBSTR(TBLDB.DLY_ITM.AUX_SRC_TRAN_ID ,1 ,8),
          TBLDB.DLY_ITM.AUX_SRC_TRAN_ID,
          TBLDB.DLY_ITM.AML_SRC_CD,
          TBLDB.DLY_ITM.ABA_NR,
          TBLDB.DLY_TRAN_XREF.ACT_NR,
          TBLDB.DLY_TRAN_XREF.PRODUCT_CD,
          TBLDB.DLY_TRAN_XREF.COMPANY_ID,
          TBLDB.DLY_TRAN_XREF.PROCESS_DT,
          TBLDB.DLY_TRAN_XREF.ITEM_TRAN_ID) to all AMPs.  Then we
          do a SORT to order Spool 1 by the sort key in spool field1
          eliminating duplicate rows.  The size of Spool 1 is estimated
          with no confidence to be 4,418,883,799 rows (
          8,718,457,735,427 bytes).  The estimated time for this step
          is 26 minutes and 59 seconds.
       2) We do an all-AMPs RETRIEVE step from DB1.a by way
          of an all-rows scan with a condition of ("(NOT
          (DB1.a.ACCT IS NULL )) AND (NOT
          (DB1.a.PD_CD IS NULL ))") into Spool 4
          (all_amps) (compressed columns allowed), which is duplicated
          on all AMPs.  The size of Spool 4 is estimated with high
          confidence to be 2,740,800 rows (87,705,600 bytes).  The
          estimated time for this step is 0.16 seconds.
  4) We do an all-AMPs RETRIEVE step from Spool 1 (Last Use) by way of
     an all-rows scan with a condition of ("(NOT (T.PST_TRN_CD IS
     NULL )) AND ((NOT (T.ACCT IS NULL )) AND ((NOT (T.PD_CD
     IS NULL )) AND ((T.TRN_DT <= DATE '2014-10-30') AND (T.TRN_DT >=
     DATE '2014-07-01'))))") into Spool 5 (all_amps) (compressed
     columns allowed), which is built locally on the AMPs.  The size of
     Spool 5 is estimated with no confidence to be 4,418,883,799 rows (
     220,944,189,950 bytes).  The estimated time for this step is 56.29
     seconds.
  5) We do an all-AMPs JOIN step from Spool 4 (Last Use) by way of an
     all-rows scan, which is joined to Spool 5 (Last Use) by way of an
     all-rows scan.  Spool 4 and Spool 5 are joined using a single
     partition hash join, with a join condition of ("(PD_CD =
     PD_CD ) AND ((ACCT = ACCT_ID) AND ((CO_NR
     (FLOAT, FORMAT '-9.99999999999999E-999'))= (CO_NR  )))").
     The result goes into Spool 3 (group_amps), which is built locally
     on the AMPs.  The size of Spool 3 is estimated with no confidence
     to be 66,552 rows (3,127,944 bytes).  The estimated time for this
     step is 18.14 seconds.
  6) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
  -> The contents of Spool 3 are sent back to the user as the result of
     statement 1.  The total estimated time is 50 minutes and 20
     seconds.
Enthusiast

Re: Data skew /qry running long

Just my opinion:

I am not Dieter. The underlying view may have transformation, formats...wher it may need to be fine-tuned

What happens if we run DIAGNOSTIC HELPSTATS ON FOR SESSION;

for the explain and get infos on what stats can be collected.

What happens if you query qrylog as FIRSTSTEPTIME -STARTTIME ? It may take time there too. 

Enthusiast

Re: Data skew /qry running long

Stats are uptodate. I have tried running the query at different times, so the concurrency on prod system is not an issue.

Senior Apprentice

Re: Data skew /qry running long

You need to show the DDL of the view (and the base tables PPI). 

There's a UNION in it which causes that huge redistribution, hopefully you can replace it with a UNION ALL.

Enthusiast

Re: Data skew /qry running long

SELECT HASHAMP(HASHBUCKET(HASHROW(col1,col2,col3))) ,COUNT(*) 

FROM table WHERE x is not null
 GROUP BY 1
 ORDER BY 2 DESC;

Another question was when we join multiple columns, should we check for data skew separately on each join column or all on the columns as show below:

Enthusiast

Re: Data skew /qry running long

Dieter, can you please advise.

Senior Apprentice

Re: Data skew /qry running long

If you think you need to check for skew in a join you should do it on the column combination.