query tuning


Dear all, below is my query and its EXPLAIN output.

Any suggestions on how to tune this query? Thanks for your time.

-----------------------------------------------------------------------------------------

EXPLAIN
SELECT
     t1.Shipment_No
    ,t1.Account_Dt
    ,COALESCE(t2.region_name,'')       region_name
    ,t1.Owner_Org
    ,t1.End_City
    ,COALESCE(t4.Account_Org_Cd,'')    Account_Org_Cd
    ,COALESCE(t2.distribute_center,'') distribute_center
    ,t1.Setl_Weight
    ,t1.Shipping_Method
    ,t3.req_tm
FROM pmart_st.shipment_setl2 t1
LEFT JOIN
(
    SELECT
         t1.shipment_no
        ,t3.org_cd AS distribute_center
        ,t6.region_name
    FROM pmart_st.shipment_setl2 t1
    LEFT JOIN
    (
        SELECT
             account_dt
            ,shipment_no
            ,param_value
        FROM pmart_st.calc_param
        WHERE data_source_cd=03010411
        AND shipment_no='1200529372042'
        AND param_value<>0
    ) t2
    ON  t1.shipment_no=t2.shipment_no
    AND t1.account_dt=t2.account_dt
    LEFT JOIN pdata.pty_org t3
    ON  t2.param_value=t3.Org_Cd
    LEFT JOIN pdata.adr_county_new t4
    ON  t3.belong_region_id=t4.county_id
    LEFT JOIN pdata.adr_city_new t5
    ON  t4.city_id=t5.city_id
    LEFT JOIN pdata.adr_prov_new t6
    ON  t5.prov_id=t6.prov_id
    LEFT JOIN pdata.cde_org_type t7
    ON  t3.org_type=t7.org_type_id
    AND t3.start_dt<=t1.account_dt-10
    AND t3.end_dt>t1.account_dt-10
    WHERE t1.shipment_no='1200529372042'
    AND t3.org_type=3
) t2
ON t1.shipment_no=t2.shipment_no
LEFT JOIN
(
    SELECT
         shipment_no
        ,MAX(delv_tm)-MAX(recv_tm) HOUR(4) AS req_tm
    FROM
    (
        SELECT
             shipment_no
            ,MIN(scan_tm) recv_tm
            ,CAST('1900-01-01 01:01:01' AS TIMESTAMP(0)) delv_tm
        FROM pdata.pst_scan
        WHERE shipment_no='1200529372042'
        AND scan_type=14
        GROUP BY 1
        UNION
        SELECT
             shipment_no
            ,CAST('1900-01-01 01:01:01' AS TIMESTAMP(0))
            ,MAX(scan_tm)
        FROM pdata.pst_scan
        WHERE shipment_no='1200529372042'
        AND scan_type=10
        GROUP BY 1
    ) a
    GROUP BY 1
) t3
ON t1.shipment_no=t3.shipment_no
LEFT JOIN
(
    SELECT
         Shipment_No
        ,Account_Org_Cd
    FROM PMART_ST.FEE_DETAIL_ORG
    WHERE Fee_Cd=010202
    AND shipment_no='1200529372042'
) t4
ON t1.shipment_no=t4.shipment_no
WHERE t1.shipment_no='1200529372042'
;

  1) First, we lock a distinct pdata."pseudo table" for read on a

     RowHash to prevent global deadlock for pdata.t7.

  2) Next, we lock a distinct pdata."pseudo table" for read on a

     RowHash to prevent global deadlock for pdata.t4.

  3) We lock a distinct pdata."pseudo table" for read on a RowHash to

     prevent global deadlock for pdata.t6.

  4) We lock a distinct pdata."pseudo table" for read on a RowHash to

     prevent global deadlock for pdata.t5.

  5) We lock a distinct pdata."pseudo table" for read on a RowHash to

     prevent global deadlock for pdata.t3.

  6) We lock pdata.t7 for read, we lock pdata.t4 for read, we lock

     pdata.t6 for read, we lock pdata.t5 for read, and we lock pdata.t3

     for read.

  7) We do a single-AMP SUM step to aggregate from all partitions of

     pdata.pst_scan by way of the primary index

     "pdata.pst_scan.Shipment_No = '1200529372042'" with a residual

     condition of ("(pdata.pst_scan.Shipment_No = '1200529372042') AND

     ((pdata.pst_scan.Scan_Type (FLOAT, FORMAT

     '-9.99999999999999E-999'))= 1.40000000000000E 001)")

     , grouping by field1 ( pdata.pst_scan.Shipment_No).  Aggregate

     Intermediate Results are computed locally, then placed in Spool 4.

     The size of Spool 4 is estimated with high confidence to be 1 row

     (39 bytes).  The estimated time for this step is 2.44 seconds.

  8) We execute the following steps in parallel.

       1) We do a single-AMP RETRIEVE step from Spool 4 (Last Use) by

          way of the hash value of "pdata.pst_scan.Shipment_No =

          '1200529372042'" into Spool 1 (all_amps), which is

          redistributed by the hash code of (TIMESTAMP '1900-01-01

          01:01:01', pdata.pst_scan.Scan_Tm, pdata.pst_scan.Shipment_No)

          to all AMPs.  The size of Spool 1 is estimated with high

          confidence to be 1 row (87 bytes).  The estimated time for

          this step is 0.02 seconds.

       2) We do a single-AMP SUM step to aggregate from all partitions

          of pdata.pst_scan by way of the primary index

          "pdata.pst_scan.Shipment_No = '1200529372042'" with a

          residual condition of ("(pdata.pst_scan.Shipment_No =

          '1200529372042') AND ((pdata.pst_scan.Scan_Type (FLOAT,

          FORMAT '-9.99999999999999E-999'))= 1.00000000000000E 001)")

          , grouping by field1 ( pdata.pst_scan.Shipment_No).

          Aggregate Intermediate Results are computed locally, then

          placed in Spool 7.  The size of Spool 7 is estimated with

          high confidence to be 1 row (39 bytes).  The estimated time

          for this step is 2.44 seconds.

  9) We do a single-AMP RETRIEVE step from Spool 7 (Last Use) by way of

     the hash value of "pdata.pst_scan.Shipment_No = '1200529372042'"

     into Spool 1 (all_amps), which is redistributed by the hash code

     of (pdata.pst_scan.Scan_Tm, TIMESTAMP '1900-01-01 01:01:01',

     pdata.pst_scan.Shipment_No) to all AMPs.  Then we do a SORT to

     order Spool 1 by the sort key in spool field1 eliminating

     duplicate rows.  The size of Spool 1 is estimated with low

     confidence to be 1 row (87 bytes).  The estimated time for this

     step is 0.02 seconds.

 10) We do an all-AMPs SUM step to aggregate from Spool 1 (Last Use) by

     way of an all-rows scan with a condition of ("a.SHIPMENT_NO =

     '1200529372042'") , grouping by field1 ( SHIPMENT_NO).  Aggregate

     Intermediate Results are computed globally, then placed in Spool

     10.  The size of Spool 10 is estimated with low confidence to be 1

     row (49 bytes).  The estimated time for this step is 0.04 seconds.

 11) We execute the following steps in parallel.

      1) We do an all-AMPs RETRIEVE step from Spool 10 (Last Use) by

         way of an all-rows scan into Spool 2 (used to materialize

         view, derived table or table function t3) (all_amps), which is

         built locally on the AMPs.  The size of Spool 2 is estimated

         with low confidence to be 1 row (31 bytes).  The estimated

         time for this step is 0.02 seconds.

      2) We do an all-AMPs RETRIEVE step from pdata.t3 by way of an

         all-rows scan with a condition of ("pdata.t3.Org_Type = 3")

         into Spool 13 (all_amps), which is redistributed by the hash

         code of (pdata.t3.Belong_Region_Id) to all AMPs.  Then we do a

         SORT to order Spool 13 by row hash.  The size of Spool 13 is

         estimated with no confidence to be 2,211 rows (86,229 bytes).

         The estimated time for this step is 0.02 seconds.

      3) We do an all-AMPs RETRIEVE step from pdata.t5 by way of an

         all-rows scan with a condition of ("NOT (pdata.t5.City_Id IS

         NULL)") into Spool 14 (all_amps), which is redistributed by

         the hash code of (pdata.t5.Prov_Id) to all AMPs.  Then we do a

         SORT to order Spool 14 by row hash.  The size of Spool 14 is

         estimated with low confidence to be 144 rows (3,600 bytes).

         The estimated time for this step is 0.01 seconds.

      4) We do a single-AMP RETRIEVE step from all partitions of

         pmart_st.t1 by way of the primary index

         "pmart_st.t1.Shipment_No = '1200529372042'" with a residual

         condition of ("(pmart_st.t1.Shipment_No = '1200529372042') AND

         ((NOT (pmart_st.t1.Account_Dt IS NULL )) AND (NOT

         (pmart_st.t1.Shipment_No IS NULL )))") into Spool 15 (one-amp),

         which is built locally on that AMP.  Then we do a SORT to

         partition Spool 15 by rowkey.  The size of Spool 15 is

         estimated with low confidence to be 2 rows (50 bytes).  The

         estimated time for this step is 0.15 seconds.

 12) We execute the following steps in parallel.

      1) We do an all-AMPs JOIN step from Spool 13 (Last Use) by way of

         a RowHash match scan, which is joined to pdata.t4 by way of a

         RowHash match scan.  Spool 13 and pdata.t4 are left outer

         joined using a merge join, with condition(s) used for

         non-matching on left table ("NOT (Belong_Region_Id IS NULL)"),

         with a join condition of ("Belong_Region_Id =

         pdata.t4.County_ID").  The result goes into Spool 16

         (all_amps), which is built locally on the AMPs.  Then we do a

         SORT to order Spool 16 by the hash code of (pdata.t3.Org_Cd

         (FLOAT)).  The size of Spool 16 is estimated with no

         confidence to be 2,948 rows (100,232 bytes).  The estimated

         time for this step is 0.02 seconds.

      2) We do a single-AMP JOIN step from all partitions of

         pmart_st.calc_param by way of the primary index

         "pmart_st.calc_param.Shipment_No = '1200529372042'" with a

         residual condition of ("(pmart_st.calc_param.Data_Source_Cd =

         3010411) AND ((NOT (pmart_st.calc_param.Shipment_No IS NULL ))

         AND ((pmart_st.calc_param.Shipment_No = '1200529372042') AND

         ((pmart_st.calc_param.Param_Value (FLOAT, FORMAT

         '-9.99999999999999E-999'))<> 0.00000000000000E 000 )))"),

         which is joined to Spool 15 (Last Use) by way of the hash

         value of "pmart_st.t1.Shipment_No = '1200529372042'".

         pmart_st.calc_param and Spool 15 are joined using a

         rowkey-based merge join, with a join condition of (

         "(Shipment_No = pmart_st.calc_param.Shipment_No) AND

         (Account_Dt = pmart_st.calc_param.Account_Dt)").  The input

         table pmart_st.calc_param will not be cached in memory.  The

         result goes into Spool 17 (all_amps), which is duplicated on

         all AMPs.  Then we do a SORT to order Spool 17 by the hash

         code of (pmart_st.calc_param.Param_Value (FLOAT, FORMAT

         '-9.99999999999999E-999')(FLOAT)).  The size of Spool 17 is

         estimated with low confidence to be 288 rows (13,248 bytes).

         The estimated time for this step is 0.00 seconds.

 13) We execute the following steps in parallel.

      1) We do an all-AMPs JOIN step from Spool 16 (Last Use) by way of

         a RowHash match scan, which is joined to Spool 17 (Last Use)

         by way of a RowHash match scan.  Spool 16 and Spool 17 are

         joined using a merge join, with a join condition of (

         "(Param_Value (FLOAT, FORMAT '-9.99999999999999E-999'))=

         (Org_Cd)").  The result goes into Spool 18 (all_amps), which

         is redistributed by the hash code of (pdata.t4.City_ID) to all

         AMPs.  Then we do a SORT to order Spool 18 by row hash.  The

         size of Spool 18 is estimated with no confidence to be 218

         rows (11,554 bytes).  The estimated time for this step is 0.03

         seconds.

      2) We do an all-AMPs JOIN step from Spool 14 (Last Use) by way of

         a RowHash match scan, which is joined to pdata.t6 by way of a

         RowHash match scan.  Spool 14 and pdata.t6 are left outer

         joined using a merge join, with condition(s) used for

         non-matching on left table ("NOT (Prov_Id IS NULL)"), with a

         join condition of ("Prov_Id = pdata.t6.Prov_ID").  The result

         goes into Spool 19 (all_amps), which is redistributed by the

         hash code of (pdata.t5.City_Id) to all AMPs.  Then we do a

         SORT to order Spool 19 by row hash.  The size of Spool 19 is

         estimated with index join confidence to be 264 rows (5,808

         bytes).  The estimated time for this step is 0.02 seconds.

      3) We do an all-AMPs RETRIEVE step from Spool 2 (Last Use) by way

         of an all-rows scan with a condition of ("t3.SHIPMENT_NO =

         '1200529372042'") into Spool 20 (all_amps), which is

         redistributed by the hash code of (SHIPMENT_NO) to all AMPs.

         The size of Spool 20 is estimated with low confidence to be 1

         row (23 bytes).  The estimated time for this step is 0.01

         seconds.

      4) We do an all-AMPs RETRIEVE step from pdata.t7 by way of an

         all-rows scan with no residual conditions into Spool 21

         (all_amps), which is duplicated on all AMPs.  Then we do a

         SORT to order Spool 21 by the hash code of (

         pdata.t7.Org_Type_Id (FLOAT, FORMAT

         '-9.99999999999999E-999')(FLOAT)).  The size of Spool 21 is

         estimated with low confidence to be 5,184 rows (82,944 bytes).

         The estimated time for this step is 0.01 seconds.

 14) We execute the following steps in parallel.

      1) We do an all-AMPs JOIN step from Spool 18 (Last Use) by way of

         a RowHash match scan, which is joined to Spool 19 (Last Use)

         by way of a RowHash match scan.  Spool 18 and Spool 19 are

         left outer joined using a merge join, with condition(s) used

         for non-matching on left table ("NOT (City_ID IS NULL)"), with

         a join condition of ("City_ID = City_Id").  The result goes

         into Spool 22 (all_amps), which is built locally on the AMPs.

         Then we do a SORT to order Spool 22 by the hash code of (

         pdata.t3.Org_Type (FLOAT)).  The size of Spool 22 is estimated

         with no confidence to be 533 rows (27,183 bytes).  The

         estimated time for this step is 0.02 seconds.

      2) We do an all-AMPs JOIN step from all partitions of pmart_st.t1

         by way of the primary index "pmart_st.t1.Shipment_No =

         '1200529372042'" with a residual condition of (

         "pmart_st.t1.Shipment_No = '1200529372042'"), which is joined

         to Spool 20 (Last Use) by way of an all-rows scan.

         pmart_st.t1 and Spool 20 are left outer joined using a product

         join, with condition(s) used for non-matching on left table (

         "pmart_st.t1.Shipment_No = '1200529372042'"), with a join

         condition of ("pmart_st.t1.Shipment_No = SHIPMENT_NO").  The

         result goes into Spool 23 (one-amp), which is built locally on

         the AMPs.  Then we do a SORT to order Spool 23 by the hash

         code of (pmart_st.t1.Shipment_No).  The size of Spool 23 is

         estimated with low confidence to be 2 rows (414 bytes).  The

         estimated time for this step is 0.15 seconds.

      3) We do a single-AMP RETRIEVE step from all partitions of

         PMART_ST.FEE_DETAIL_ORG by way of the primary index

         "PMART_ST.FEE_DETAIL_ORG.Shipment_No = '1200529372042'" with a

         residual condition of ("(PMART_ST.FEE_DETAIL_ORG.Fee_Cd =

         10202) AND (PMART_ST.FEE_DETAIL_ORG.Shipment_No =

         '1200529372042')") into Spool 24 (one-amp), which is built

         locally on that AMP.  Then we do a SORT to order Spool 24 by

         the hash code of (PMART_ST.FEE_DETAIL_ORG.Shipment_No).  The

         size of Spool 24 is estimated with low confidence to be 2 rows

         (66 bytes).  The estimated time for this step is 0.19 seconds.

 15) We execute the following steps in parallel.

      1) We do a single-AMP JOIN step from Spool 23 (Last Use) by way

         of the hash value of "pmart_st.t1.Shipment_No =

         '1200529372042'", which is joined to Spool 24 (Last Use) by

         way of the hash value of "PMART_ST.FEE_DETAIL_ORG.Shipment_No

         = '1200529372042'".  Spool 23 and Spool 24 are left outer

         joined using a merge join, with condition(s) used for

         non-matching on left table ("Shipment_No = '1200529372042'"),

         with a join condition of ("Shipment_No = Shipment_No").  The

         result goes into Spool 25 (one-amp), which is built locally on

         that AMP.  The size of Spool 25 is estimated with low

         confidence to be 4 rows (876 bytes).  The estimated time for

         this step is 0.02 seconds.

      2) We do an all-AMPs JOIN step from Spool 21 (Last Use) by way of

         a RowHash match scan, which is joined to Spool 22 (Last Use)

         by way of a RowHash match scan.  Spool 21 and Spool 22 are

         right outer joined using a merge join, with condition(s) used

         for non-matching on right table ("(End_Dt > ((Account_Dt )- 10

         )) AND (Start_Dt <= ((Account_Dt )- 10 ))"), with a join

         condition of ("(Org_Type )= (Org_Type_Id (FLOAT, FORMAT

         '-9.99999999999999E-999'))").  The result goes into Spool 28

         (all_amps), which is duplicated on all AMPs.  The size of

         Spool 28 is estimated with no confidence to be 51,192 rows (

         1,842,912 bytes).  The estimated time for this step is 0.04

         seconds.

 16) We do an all-AMPs JOIN step from Spool 25 (Last Use) by way of the

     hash value of "pmart_st.t1.Shipment_No = '1200529372042'", which

     is joined to Spool 28 (Last Use) by way of an all-rows scan.

     Spool 25 and Spool 28 are left outer joined using a product join,

     with condition(s) used for non-matching on left table (

     "Shipment_No = '1200529372042'"), with a join condition of (

     "(Shipment_No = Shipment_No) AND ((Shipment_No = Shipment_No) AND

     (Shipment_No = Shipment_No ))").  The result goes into Spool 12

     (all_amps), which is built locally on the AMPs.  The size of Spool

     12 is estimated with no confidence to be 2,844 rows (676,872

     bytes).  The estimated time for this step is 0.04 seconds.

 17) Finally, we send out an END TRANSACTION step to all AMPs involved

     in processing the request.

  -> The contents of Spool 12 are sent back to the user as the result

     of statement 1.  The total estimated time is 5.40 seconds.

3 REPLIES
Enthusiast

Re: query tuning

Hi,

How much time is it taking?
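If you are not sure and DBQL query logging is enabled on your system, you can look the actual runtime up there. This is only a sketch; the DBC view and column names (QryLogV, FirstRespTime, AMPCPUTime, TotalIOCount) may vary by release:

-- Sketch: check the real elapsed time and resource usage via DBQL (if logging is on)
SELECT StartTime
      ,FirstRespTime
      ,AMPCPUTime
      ,TotalIOCount
      ,QueryText
FROM DBC.QryLogV
WHERE UserName = USER
AND QueryText LIKE '%1200529372042%'
ORDER BY StartTime DESC;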

Also, please check the data types of the columns being joined; it looks like Teradata has to convert the values to FLOAT before joining. Data type conversion also takes time when there is a huge amount of data (although in this case you have only one shipment_no). For example, the EXPLAIN shows casts like these (a sketch of a possible fix follows the excerpts):

 (

         pdata.t7.Org_Type_Id (FLOAT, FORMAT

         '-9.99999999999999E-999')(FLOAT)). 

  ((pmart_st.calc_param.Param_Value (FLOAT, FORMAT

         '-9.99999999999999E-999'))<> 0.00000000000000E 000 )))"),

 Spool 16 and Spool 17 are

         joined using a merge join, with a join condition of (

         "(Param_Value (FLOAT, FORMAT '-9.99999999999999E-999'))=

         (Org_Cd)"). 

 ((pdata.pst_scan.Scan_Type (FLOAT,

          FORMAT '-9.99999999999999E-999'))= 1.00000000000000E 001)")
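If those columns really turn out to be character types (check the table DDL first), one possible fix is to write the literals as strings, so the comparison stays in the column's declared type and no FLOAT cast is needed. A sketch only, with the column types assumed:

-- assumes param_value and scan_type are CHAR/VARCHAR columns
SELECT account_dt, shipment_no, param_value
FROM pmart_st.calc_param
WHERE data_source_cd=03010411
AND shipment_no='1200529372042'
AND param_value<>'0'        -- was: param_value<>0
;

SELECT shipment_no, MIN(scan_tm) recv_tm
FROM pdata.pst_scan
WHERE shipment_no='1200529372042'
AND scan_type='14'          -- was: scan_type=14
GROUP BY 1;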

Enthusiast

Re: query tuning

I can see a lot of 'low confidence' and 'no confidence' estimates in the EXPLAIN plan... Have STATS been collected on the joining columns and the index columns?
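If not, collecting them on the join and filter columns should give the optimizer better estimates. A sketch only; adjust the column list to your actual indexes and your release's syntax:

COLLECT STATISTICS ON pmart_st.calc_param COLUMN (shipment_no);
COLLECT STATISTICS ON pmart_st.calc_param COLUMN (data_source_cd);
COLLECT STATISTICS ON pdata.pst_scan COLUMN (scan_type);
COLLECT STATISTICS ON pdata.pty_org COLUMN (org_cd);
COLLECT STATISTICS ON pdata.pty_org COLUMN (org_type);
COLLECT STATISTICS ON pdata.pty_org COLUMN (belong_region_id);

You can also run DIAGNOSTIC HELPSTATS ON FOR SESSION; and re-run the EXPLAIN, which appends the optimizer's own list of recommended statistics to the plan.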

Senior Apprentice

Re: query tuning

As already mentioned, Scan_Type/Org_Cd/Param_Value/Org_Type_Id seem to be CHARs, but you compare them to numeric values, resulting in unnecessary type casts. Those casts might be another reason why the confidence levels are low: they prevent the optimizer from using existing statistics.
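You can confirm the declared types from the DDL before changing any literals, e.g.:

SHOW TABLE pdata.pst_scan;
SHOW TABLE pdata.pty_org;
SHOW TABLE pdata.cde_org_type;
SHOW TABLE pmart_st.calc_param;
SHOW TABLE pmart_st.fee_detail_org;

If they are CHARs, compare them to quoted literals instead (scan_type='14', t3.org_type='3', Fee_Cd='010202'); note that in the explain Fee_Cd=010202 becomes Fee_Cd=10202, i.e. the leading zero is gone.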

According to the EXPLAIN, the steps consuming most of the estimated time are 7 and 8.2. Both do a single-AMP PI access, but across "all partitions", which can be quite slow if the number of partitions is large.
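If pdata.pst_scan is row-partitioned on a date column, adding a predicate on that column lets the optimizer eliminate partitions instead of probing every one. Purely a sketch, assuming a partitioning column named scan_dt and a plausible date range; check the actual DDL first:

SELECT shipment_no, MIN(scan_tm) recv_tm
FROM pdata.pst_scan
WHERE shipment_no='1200529372042'
AND scan_type=14
AND scan_dt BETWEEN DATE '2012-05-01' AND DATE '2012-06-30'   -- enables partition elimination
GROUP BY 1;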

And the t3 Derived Table can probably be simplified (untested):

SELECT
     shipment_no
    ,(COALESCE(MAX(CASE WHEN scan_type=10 THEN scan_tm END), TIMESTAMP '1900-01-01 01:01:01')
    - COALESCE(MIN(CASE WHEN scan_type=14 THEN scan_tm END), TIMESTAMP '1900-01-01 01:01:01')) HOUR(4) AS req_tm
FROM pdata.pst_scan
WHERE shipment_no='1200529372042'
GROUP BY 1

Btw, this calculation will fail if only one of the timestamps defaults to '1900-01-01 01:01:01', because HOUR(4) will overflow.
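One way around that (also untested) is to drop the 1900 defaults altogether and let the result be NULL when either scan is missing:

SELECT
     shipment_no
    ,(MAX(CASE WHEN scan_type=10 THEN scan_tm END)
    - MIN(CASE WHEN scan_type=14 THEN scan_tm END)) HOUR(4) AS req_tm
FROM pdata.pst_scan
WHERE shipment_no='1200529372042'
AND scan_type IN (10,14)
GROUP BY 1;

If either aggregate is NULL, the subtraction is NULL, so there is no overflow.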

To help further, you need to post more info: the PI/partitioning of the tables and the existing statistics.

Dieter