Running slloww

Database Enthusiast

Hi!

This query is taking too much CPU. Especially this DOUBLE PRECISION is what I am concerned about. Is that really needed?
Also, I can see that TABLEA is used again in a subquery, but the subquery alone runs in 2 minutes. What can be done to improve the performance other than collecting stats?
Thanks much!

SELECT DISTINCT D1.c2 AS c1, D1.c3 AS c2, D1.c4 AS c3, D1.c5 AS c4, D1.c1 AS c5
FROM (
    SELECT SUM(B.CRNT_AMT) AS c1, A.PIN_NO AS c2, A.FL_NM AS c3, A.CNTRT_NBR AS c4, A.ST_CD AS c5
    FROM TABLEA A,
         TABLED D,
         TABLEB B,
         TABLEC C
    WHERE B.TABLEA_ID = A.TABLEA_ID
      AND B.TABLEC_ID = C.TABLEC_ID
      AND D.DT = CAST( DATE '2011-06-30' AS TIMESTAMP )
      AND CAST( B.MRKT_DT AS DATE ) = CAST( D.DT AS DATE )
      AND ( C.SRC_CD = 'A0' OR C.SRC_CD = 'A1' )
      AND A.PIN_NO IN (
          SELECT DISTINCT A.PIN_NO AS c1
          FROM TABLEA A,
               TRNS_REA E
          WHERE A.END_DT = DATE '2100-12-31'
            AND A.PRD_ID = E.ARNGM_NO
            AND ( CAST( ( ( CAST( E.TRNS_DT AS TIMESTAMP ) - CAST( CAST( DATE '2011-06-30' AS DATE ) AS TIMESTAMP ) ) MONTH( 4 ) ) AS INT ) ) = 0
            AND ( E.TRNS_CD = 'AAB' OR E.TRNS_CD = 'AAC' OR E.TRNS_CD = 'AAD'
                  OR E.TRNS_CD = 'AAE' OR E.TRNS_CD = 'TAAF' )
      )
    GROUP BY A.FL_NM, A.ST_CD, A.CNTRT_NBR, A.PIN_NO
) D1
WHERE ( ( 150000.0 ( DOUBLE PRECISION ) ) - ( 50000.0 ( DOUBLE PRECISION ) ) < D1.c1 )
ORDER BY 1, 2, 3, 4;

1) First, we lock .TABLE for access, we lock
.TRNS for access, we lock .TABLEA for access,
we lock .TABLEG for access, we lock
.TABLEC for access, we lock .TABLEB for
access, and we lock TP001.TABLED for access.
2) Next, we execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from .TABLE
by way of an all-rows scan with a condition of (
"((.TABLE.TRNS_CD = 'AAB') OR
((.TABLE.TRNS_CD = 'AAC') OR
((.TABLE.TRNS_CD = 'AAD') OR
((.TABLE.TRNS_CD = 'AAE') OR
(.TABLE.TRNS_CD = 'TAAF'))))) AND
((.TABLE.END_DT = DATE '2100-12-31') AND
((.TABLE.APP_CD = 'REA') AND ((NOT
(.TABLE.STD_5_CD IS NULL )) AND ((NOT
(.TABLE.STD_4_CD IS NULL )) AND ((NOT
(.TABLE.STD_3_CD IS NULL )) AND ((NOT
(.TABLE.STD_2_CD IS NULL )) AND (NOT
(.TABLE.STD_1_CD IS NULL ))))))))")
into Spool 5 (all_amps), which is duplicated on all AMPs.
The size of Spool 5 is estimated with low confidence to be
9,300 rows. The estimated time for this step is 0.02 seconds.
2) We do an all-AMPs RETRIEVE step from .TRNS by way of an
all-rows scan with a condition of ("(NOT
(.TRNS.ARNGM_NO IS NULL )) AND (((EXTRACT(MONTH
FROM ((((CAST((TRNS_DT) AS TIMESTAMP(6))) - (CAST((DATE
'2011-06-30') AS TIMESTAMP(6)))) MONTH(4))))(INTEGER))= 0)
AND ((.TRNS.TRNS_DT >= DATE '2010-09-01') AND ((NOT
(.TRNS.STD_5_CD IS NULL )) AND ((NOT
(.TRNS.STD_4_CD IS NULL )) AND ((NOT
(.TRNS.STD_3_CD IS NULL )) AND ((NOT
(.TRNS.STD_2_CD IS NULL )) AND (NOT
(.TRNS.STD_1_CD IS NULL ))))))))") into Spool 6
(all_amps), which is built locally on the AMPs. The input
table will not be cached in memory, but it is eligible for
synchronized scanning. The size of Spool 6 is estimated with
no confidence to be 81,828,093 rows. The estimated time for
this step is 2 minutes and 22 seconds.
3) We do an all-AMPs JOIN step from Spool 5 (Last Use) by way of an
all-rows scan, which is joined to Spool 6 (Last Use) by way of an
all-rows scan. Spool 5 and Spool 6 are joined using a single
partition hash join, with a join condition of ("(STD_5_CD =
STD_5_CD) AND ((STD_4_CD = STD_4_CD) AND
((STD_3_CD = STD_3_CD) AND ((STD_2_CD = STD_2_CD)
AND (((SUBSTR(TRIM(BOTH FROM {RightTable}.TRNS_ACTVTY_CD ),1 ,3
))= TRNS_ACTVTY_CD) AND (STD_1_CD= STD_1_CD )))))"). The
result goes into Spool 7 (all_amps), which is redistributed by
hash code to all AMPs. Then we do a SORT to order Spool 7 by row
hash. The size of Spool 7 is estimated with no confidence to be
19 rows. The estimated time for this step is 0.17 seconds.
4) We do an all-AMPs JOIN step from Spool 7 (Last Use) by way of a
RowHash match scan, which is joined to .TABLEG by way
of a RowHash match scan with a condition of (
"((.TABLEG.SRC_CD = 'A0') OR
(.TABLEG.SRC_CD = 'A1')) AND
(.TABLEG.END_DT = DATE '2100-12-31')"). Spool 7 and
.TABLEG are joined using a merge join, with a join
condition of ("(END_DT = .TABLEG.END_DT) AND
(TABLEG_ANCH_ID = .TABLEG.TABLEG_ANCH_ID)").
The result goes into Spool 8 (all_amps), which is duplicated on
all AMPs. The size of Spool 8 is estimated with no confidence to
be 2,850 rows. The estimated time for this step is 0.02 seconds.
5) We do an all-AMPs JOIN step from Spool 8 (Last Use) by way of an
all-rows scan, which is joined to .TABLEA by way
of an all-rows scan with a condition of (
".TABLEA.END_DT = DATE '2100-12-31'"). Spool 8
and .TABLEA are joined using a product join, with
a join condition of ("(.TABLEA.END_DT = END_DT)
AND ((.TABLEA.PRD_ID = ARNGM_NO)
AND (.TABLEA.END_DT = END_DT ))"). The input
table .TABLEA will not be cached in memory, but
it is eligible for synchronized scanning. The result goes into
Spool 4 (all_amps), which is redistributed by hash code to all
AMPs. Then we do a SORT to order Spool 4 by row hash and the sort
key in spool field1 eliminating duplicate rows. The size of Spool
4 is estimated with no confidence to be 19 rows. The estimated
time for this step is 53.76 seconds.
6) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from Spool 4 (Last Use) by
way of an all-rows scan into Spool 9 (all_amps), which is
duplicated on all AMPs. The size of Spool 9 is estimated
with no confidence to be 2,850 rows. The estimated time for
this step is 0.01 seconds.
2) We do an all-AMPs RETRIEVE step from TP001.TABLED by way of
an all-rows scan with a condition of ("TP001.TABLED.DT =
TIMESTAMP '2011-06-30 00:00:00.000000'") into Spool 10
(all_amps), which is duplicated on all AMPs. The size of
Spool 10 is estimated with high confidence to be 150 rows.
The estimated time for this step is 0.01 seconds.
7) We do an all-AMPs JOIN step from Spool 9 (Last Use) by way of an
all-rows scan, which is joined to .TABLEA by way
of an all-rows scan with no residual conditions locking
.TABLEA for access. Spool 9 and
.TABLEA are joined using a product join, with a
join condition of (".TABLEA.PIN_NO = PIN_NO").
The input table .TABLEA will not be cached in
memory, but it is eligible for synchronized scanning. The result
goes into Spool 11 (all_amps), which is built locally on the AMPs.
The size of Spool 11 is estimated with no confidence to be 200,965
rows. The estimated time for this step is 1 minute and 5 seconds.
8) We do an all-AMPs JOIN step from Spool 10 (Last Use) by way of an
all-rows scan, which is joined to Spool 11 (Last Use) by way of an
all-rows scan. Spool 10 and Spool 11 are joined using a product
join, with a join condition of ("(1=1)"). The result goes into
Spool 12 (all_amps), which is built locally on the AMPs. Then we
do a SORT to partition Spool 12 by rowkey. The size of Spool 12
is estimated with no confidence to be 200,965 rows. The estimated
time for this step is 0.03 seconds.
9) We do an all-AMPs JOIN step from Spool 12 (Last Use) by way of a
RowHash match scan, which is joined to .TABLEB by way
of a RowHash match scan with no residual conditions. Spool 12 and
.TABLEB are joined using a rowkey-based sliding-window
merge join (contexts = 15, 1), with a join condition of (
"(.TABLEB.TABLEA_ID = TABLEA_ID)
AND (.TABLEB.MRKT_DT = (CAST(({LeftTable}.DT)
AS DATE)))"). The input table .TABLEB will not be
cached in memory. The result goes into Spool 13 (all_amps), which
is redistributed by hash code to all AMPs. Then we do a SORT to
order Spool 13 by row hash. The size of Spool 13 is estimated
with no confidence to be 1,081,308 rows. The estimated time for
this step is 4.05 seconds.
10) We do an all-AMPs JOIN step from Spool 13 (Last Use) by way of a
RowHash match scan, which is joined to .TABLEC by
way of a RowHash match scan with a condition of (
"(.TABLEC.SRC_CD = 'A0') OR
(.TABLEC.SRC_CD = 'A1')"). Spool 13 and
.TABLEC are joined using a merge join, with a join
condition of ("TABLEC_ID =
.TABLEC.TABLEC_ID"). The result goes
into Spool 3 (all_amps), which is built locally on the AMPs. The
size of Spool 3 is estimated with no confidence to be 333,511 rows.
The estimated time for this step is 1.27 seconds.
11) We do an all-AMPs SUM step to aggregate from Spool 3 (Last Use) by
way of an all-rows scan, and the grouping identifier in field 1.
Aggregate Intermediate Results are computed globally, then placed
in Spool 14. The size of Spool 14 is estimated with no confidence
to be 250,134 rows. The estimated time for this step is 0.22
seconds.
12) We do an all-AMPs RETRIEVE step from Spool 14 (Last Use) by way of
an all-rows scan with a condition of ("(Field_6 )>
1.00000000000000E 005") into Spool 1 (group_amps), which is
redistributed by hash code to all AMPs. Then we do a SORT to
order Spool 1 by the sort key in spool field1 eliminating
duplicate rows. The size of Spool 1 is estimated with no
confidence to be 250,134 rows. The estimated time for this step
is 0.16 seconds.
13) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 4 minutes and 26 seconds.
BEGIN RECOMMENDED STATS ->
14) "COLLECT STATISTICS .TABLE COLUMN TRNS_CD". (HighConf)
15) "COLLECT STATISTICS .TABLE COLUMN (STD_4_CD, STD_2_CD, STD_1_CD, STD_3_CD, STD_5_CD)". (HighConf)
16) "COLLECT STATISTICS .TABLE COLUMN (APP_CD, END_DT, TRNS_CD, STD_4_CD, STD_2_CD, TRNS_ACTVTY_CD, STD_1_CD, STD_3_CD, STD_5_CD)". (HighConf)
17) "COLLECT STATISTICS .TRNS COLUMN (STD_4_CD, STD_2_CD, STD_1_CD, STD_3_CD, STD_5_CD)". (HighConf)
18) "COLLECT STATISTICS .TRNS COLUMN (ARNGM_NO, TABLEG_ANCH_ID, STD_4_CD, STD_2_CD, STD_1_CD, STD_3_CD, STD_5_CD)". (HighConf)
19) "COLLECT STATISTICS .TRNS COLUMN (ARNGM_NO, TABLEG_ANCH_ID)". (HighConf)
20) "COLLECT STATISTICS .TRNS COLUMN (ARNGM_NO, STD_5_CD, STD_3_CD, STD_1_CD, STD_2_CD, STD_4_CD)". (HighConf)
21) "COLLECT STATISTICS .TRNS COLUMN (TABLEG_ANCH_ID, STD_5_CD, STD_3_CD, STD_1_CD, STD_2_CD, STD_4_CD)". (HighConf)
22) "COLLECT STATISTICS .TABLEG COLUMN (SRC_CD, END_DT, TABLEG_ANCH_ID)". (HighConf)
23) "COLLECT STATISTICS .TABLEA COLUMN (PRD_ID, END_DT)". (HighConf)
24) "COLLECT STATISTICS .TABLEA COLUMN (PIN_NO)". (HighConf)
25) "COLLECT STATISTICS .TABLEB COLUMN (MRKT_DT, TABLEA_ID)". (HighConf)
26) "COLLECT STATISTICS .TABLEB COLUMN (TABLEC_ID, TABLEA_ID)". (HighConf)
<- END RECOMMENDED STATS

1 REPLY
Senior Apprentice

Re: Running slloww

Is there a competition at your site?
"Write the most obfuscated query"

SELECT DISTINCT D1.c2 AS c1 , D1.c3 AS c2 , D1.c4 AS c3 , D1.c5 AS c4 , D1.c1 AS c5
The DISTINCT is not needed, as the Derived Table is already unique due to the GROUP BY.

This outermost SELECT is not needed at all:
WHERE ( ( 150000.0 ( DOUBLE PRECISION ) ) - ( 50000.0 ( DOUBLE PRECISION ) ) < D1.c1 )
This should be rewritten as a HAVING clause on the inner query.
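
For example (just a sketch, keeping the column names and aliases from the posted query; the FROM/WHERE part stays as it was):

SELECT A.PIN_NO, A.FL_NM, A.CNTRT_NBR, A.ST_CD, SUM(B.CRNT_AMT)
FROM ...
GROUP BY A.FL_NM, A.ST_CD, A.CNTRT_NBR, A.PIN_NO
HAVING SUM(B.CRNT_AMT) > 100000
ORDER BY 1, 2, 3, 4;

Then the outer SELECT, the DISTINCT and the derived table D1 all disappear.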

DOUBLE PRECISION is the same as FLOAT, so don't be concerned about that.
But why not use "100000 (FLOAT) < D1.c1" instead of that calculation?
Or just a simple "D1.c1 > 100000"?

AND ( CAST( ( ( CAST( E.TRNS_DT AS TIMESTAMP ) - CAST( CAST( DATE '2011-06-30' AS DATE ) AS TIMESTAMP ) ) MONTH( 4 ) ) AS INT ) ) = 0
This is simply looking for rows from June 2011:
E.TRNS_DT BETWEEN DATE '2011-06-01' AND DATE '2011-06-30'
or
E.TRNS_DT BETWEEN DATE '2011-06-01' AND ADD_MONTHS(DATE '2011-06-01', 1) - 1

Now it's a SARG (Searchable ARGument) and the optimizer can use stats and indexes.

I think most of your performance problems are not only due to bad SQL, but also due to a bad data model/physical implementation, e.g.:

AND D.DT = CAST( DATE '2011-06-30' AS TIMESTAMP )
AND CAST( B.MRKT_DT AS DATE ) = CAST( D.DT AS DATE )
Why do you mix data types (DATE and TIMESTAMP)? If you don't care about the time part, why don't you use DATE instead? This is not Oracle, where a DATE always carries a time part.

You don't have to join to TABLED at all, as no other column of D is used, and the above condition is equal to:
AND CAST( B.MRKT_DT AS DATE ) = DATE '2011-06-30'
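
Putting all of those points together, the whole statement could be reduced to something like this (an untested sketch, using only the tables and columns from your original query):

SELECT A.PIN_NO, A.FL_NM, A.CNTRT_NBR, A.ST_CD, SUM(B.CRNT_AMT)
FROM TABLEA A, TABLEB B, TABLEC C
WHERE B.TABLEA_ID = A.TABLEA_ID
  AND B.TABLEC_ID = C.TABLEC_ID
  AND CAST(B.MRKT_DT AS DATE) = DATE '2011-06-30'
  AND C.SRC_CD IN ('A0', 'A1')
  AND A.PIN_NO IN (
      SELECT A2.PIN_NO
      FROM TABLEA A2, TRNS_REA E
      WHERE A2.END_DT = DATE '2100-12-31'
        AND A2.PRD_ID = E.ARNGM_NO
        AND E.TRNS_DT BETWEEN DATE '2011-06-01' AND DATE '2011-06-30'
        AND E.TRNS_CD IN ('AAB', 'AAC', 'AAD', 'AAE', 'TAAF')
  )
GROUP BY A.FL_NM, A.ST_CD, A.CNTRT_NBR, A.PIN_NO
HAVING SUM(B.CRNT_AMT) > 100000
ORDER BY 1, 2, 3, 4;

No join to TABLED, no DISTINCT, no derived table, and all date conditions are plain SARGs.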

Dieter