Performance of TRML SQL query

Database

Performance of TRML SQL query

Hi all,

The query below is causing performance issues, average run time is 47 mins, for joining a 63m row table to a 61m row table .

I have tried partitioning on the domain_id and a join index with different index , but to no avail.

The source keys are unique on both tables, but changing to UPI doesnt make any difference either.

Also, can someone tell me if the length UDF is the same as CHAR_LENGTH or OCTET_LENGTH.

Our machine is 1 node, 23 AMP , 670c with 14.10.

Any ideas would be grateful.

CREATE SET TABLE sys_reporting.bk03_ev ,NO FALLBACK ,

     NO BEFORE JOURNAL,

     NO AFTER JOURNAL,

     CHECKSUM = DEFAULT,

     DEFAULT MERGEBLOCKRATIO

     (

      SOURCE_KEY VARCHAR(64) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,

      DOMAIN_ID BYTEINT NOT NULL,

      KEY_SET_ID SMALLINT NOT NULL,

      EDW_KEY BIGINT NOT NULL,

      START_DATE DATE FORMAT 'YYYY-MM-DD' NOT NULL,

      END_DATE DATE FORMAT 'YYYY-MM-DD' NOT NULL DEFAULT DATE '9998-12-31' COMPRESS (DATE '9998-12-31'),

      RECORD_DELETED_FLAG BYTEINT NOT NULL DEFAULT 0  COMPRESS (0 ,1 ),

      CTL_ID SMALLINT NOT NULL,

      PROCESS_NAME VARCHAR(30) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,

      PROCESS_ID INTEGER NOT NULL,

      UPDATE_PROCESS_NAME VARCHAR(30) CHARACTER SET LATIN NOT CASESPECIFIC COMPRESS ,

      UPDATE_PROCESS_ID INTEGER COMPRESS )

 PRIMARY INDEX (SOURCE_KEY);

 CREATE MULTISET TABLE sys_reporting.st_TRML_DAILY_PRINCIPAL ,NO FALLBACK ,

     NO BEFORE JOURNAL,

     NO AFTER JOURNAL,

     CHECKSUM = DEFAULT,

     DEFAULT MERGEBLOCKRATIO

     (

      SOURCE_KEY VARCHAR(64) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,

      BALANCE VARCHAR(64) CHARACTER SET LATIN NOT CASESPECIFIC,

      AMOUNT VARCHAR(64) CHARACTER SET LATIN NOT CASESPECIFIC,

      POST_DATE VARCHAR(64) CHARACTER SET LATIN NOT CASESPECIFIC,

      VALUE_DATE VARCHAR(64) CHARACTER SET LATIN NOT CASESPECIFIC,

      TRANSACTION_TYPE VARCHAR(64) CHARACTER SET UNICODE CASESPECIFIC,

      SOURCE_CODE VARCHAR(64) CHARACTER SET LATIN NOT CASESPECIFIC,

      REAL_OR_NOTIONAL_FLAG VARCHAR(64) CHARACTER SET UNICODE CASESPECIFIC,

      TCT_ID VARCHAR(64) CHARACTER SET LATIN NOT CASESPECIFIC,

      CCY_ID VARCHAR(64) CHARACTER SET UNICODE CASESPECIFIC,

      NOMINAL_INCREASE VARCHAR(64) CHARACTER SET LATIN NOT CASESPECIFIC,

      NOMINAL_DECREASE VARCHAR(64) CHARACTER SET LATIN NOT CASESPECIFIC,

      AMORT_AMOUNT VARCHAR(64) CHARACTER SET LATIN NOT CASESPECIFIC,

      RPL VARCHAR(64) CHARACTER SET LATIN NOT CASESPECIFIC,

      SETTLEMENT_DATE VARCHAR(64) CHARACTER SET UNICODE CASESPECIFIC,

      EXCHANGE_RATE VARCHAR(64) CHARACTER SET LATIN NOT CASESPECIFIC,

      POSTING_COUNT VARCHAR(64) CHARACTER SET UNICODE CASESPECIFIC,

      START_DATE DATE FORMAT 'YYYY-MM-DD' NOT NULL,

      END_DATE DATE FORMAT 'YYYY-MM-DD' COMPRESS ,

      RECORD_DELETED_FLAG BYTEINT COMPRESS ,

      CTL_ID SMALLINT NOT NULL COMPRESS 12 ,

      FILE_ID SMALLINT NOT NULL COMPRESS 5 ,

      PROCESS_NAME CHAR(30) CHARACTER SET LATIN CASESPECIFIC NOT NULL COMPRESS 'LD120005_01_ST_TRML_DAILY_PR  ',

      PROCESS_ID INTEGER NOT NULL,

      UPDATE_PROCESS_NAME CHAR(30) CHARACTER SET LATIN CASESPECIFIC COMPRESS ,

      UPDATE_PROCESS_ID INTEGER COMPRESS ,

      START_TS TIMESTAMP(6) COMPRESS ,

      END_TS TIMESTAMP(6) COMPRESS )   

PRIMARY INDEX ( SOURCE_KEY );

explain SELECT  

         K289502.EDW_KEY as EV_ID

        ,CASE WHEN SYSLIB.GCFRCheckDecimal16(BALANCE,20,2) = 1 THEN CAST(BALANCE as number(20,2)) END as BALANCE

        ,CASE WHEN SYSLIB.GCFRCheckDecimal16(AMOUNT,20,2) = 1 THEN CAST(AMOUNT as number(20,2)) END as AMOUNT

        ,SYSLIB.DATE_OR_NULL(SUBSTRING(POST_DATE FROM 1 FOR 10),'yyyy-mm-dd') as POST_DATE

        ,SYSLIB.DATE_OR_NULL(SUBSTRING(VALUE_DATE FROM 1 FOR 10),'yyyy-mm-dd') as VALUE_DATE

        ,TRANSACTION_TYPE

        ,SOURCE_CODE

        ,REAL_OR_NOTIONAL_FLAG

        ,TCT_ID

        ,CCY_ID

        ,CASE WHEN SYSLIB.GCFRCheckDecimal16(NOMINAL_INCREASE,20,2) = 1 THEN CAST(NOMINAL_INCREASE as number(20,2)) END as NOMINAL_INCREASE

        ,CASE WHEN SYSLIB.GCFRCheckDecimal16(NOMINAL_DECREASE,20,2) = 1 THEN CAST(NOMINAL_DECREASE as number(20,2))  END as NOMINAL_DECREASE

        ,CASE WHEN SYSLIB.GCFRCheckDecimal16(AMORT_AMOUNT,20,2) = 1 THEN CAST(AMORT_AMOUNT as number(20,2)) END as AMORT_AMOUNT

        ,CASE WHEN SYSLIB.GCFRCheckDecimal8(RPL,18,2) = 1 THEN CAST(RPL as number(18,2)) END as RPL

        ,SYSLIB.DATE_OR_NULL(SUBSTRING(SETTLEMENT_DATE FROM 1 FOR 10),'yyyy-mm-dd') as SETTLEMENT_DATE

        ,CASE WHEN SYSLIB.GCFRCheckDecimal8(EXCHANGE_RATE,18,8) = 1 THEN CAST(EXCHANGE_RATE as number(18,8)) END as EXCHANGE_RATE

        ,POSTING_COUNT

FROM sys_reporting.ST_TRML_DAILY_PRINCIPAL a   JOIN

     sys_reporting.BK03_EV K289502

ON a.source_key = K289502.SOURCE_KEY

and K289502.DOMAIN_ID =12;

  1) First, we lock a distinct sys_reporting."pseudo table" for read on

     a RowHash to prevent global deadlock for sys_reporting.a.

  2) Next, we lock a distinct sys_reporting."pseudo table" for read on

     a RowHash to prevent global deadlock for sys_reporting.K289502.

  3) We lock sys_reporting.a for read, and we lock

     sys_reporting.K289502 for read.

  4) We do an all-AMPs JOIN step from sys_reporting.a by way of a

     RowHash match scan with no residual conditions, which is joined to

     sys_reporting.K289502 by way of a RowHash match scan with a

     condition of ("sys_reporting.K289502.DOMAIN_ID = 12").

     sys_reporting.a and sys_reporting.K289502 are joined using a merge

     join, with a join condition of ("sys_reporting.a.SOURCE_KEY =

     sys_reporting.K289502.SOURCE_KEY").  The input table

     sys_reporting.a will not be cached in memory, but it is eligible

     for synchronized scanning.  The result goes into Spool 1

     (group_amps), which is built locally on the AMPs.  The result

     spool file will not be cached in memory.  The size of Spool 1 is

     estimated with low confidence to be 60,909,944 rows (

     20,526,651,128 bytes).  The estimated time for this step is 1

     minute and 35 seconds.

  5) Finally, we send out an END TRANSACTION step to all AMPs involved

     in processing the request.

  -> The contents of Spool 1 are sent back to the user as the result of

     statement 1.  The total estimated time is 1 minute and 35 seconds.

Stats for Sys_reporting.bk03_ev 

15/06/16 09:44:18           61,284,375 * * "*" ?

15/06/16 09:43:14           42,781,469 SOURCE_KEY,DOMAIN_ID SOURCE_KEY,DOMAIN_ID SOURCE_KEY,DOMAIN_ID ?

15/06/16 09:43:44           61,284,375 EDW_KEY EDW_KEY EDW_KEY ?

15/06/16 09:44:16           61,284,375 SOURCE_KEY SOURCE_KEY SOURCE_KEY ?

15/06/16 09:44:18                    6 DOMAIN_ID DOMAIN_ID DOMAIN_ID ?

Stats for Sys_Reporting.ST_TRML_DAILY_PRINCIPAL

15/06/16 09:46:26           63,173,410 * * "*" ?

15/06/16 09:46:26           63,173,410 SOURCE_KEY SOURCE_KEY SOURCE_KEY ?

example of source_key :- 10004767^2011-03-09 00:00:00.000000^14222954

create join index sys_reporting.ST_TRML_DAILY_PRINCIPAL_12_JI

as select    a.edw_key,

                     b.source_key,

                     b.rowid,

                     a.domain_id

       from sys_reporting.bk03_ev a join   

                sys_reporting.st_TRML_DAILY_PRINCIPAL b

       on a.source_key = b.source_key

       and domain_id = 12

       primary index(edw_key);