code check

l_k
Enthusiast

Hi All,

This query is taking more than 3 minutes to run. The query is given below:

SELECT RANK(dt1.Source_Value) + dt2.Max_Key (INTEGER) AS Ranked_Id,
       '-' || CAST(Ranked_Id AS CHAR(10)) (CHAR(10)) AS EDW_Value,
       0450 (INTEGER) AS Source_Number,
       dt1.Source_Value (VARCHAR(50)) AS Source_Value
FROM
    (SELECT DISTINCT TRIM(BOTH FROM XVB0513.Source_Value) AS Source_Value
     FROM DP_WEDW.XV_B0513_S0450_01_D_01 XVB0513
     WHERE TRIM(Source_Value) NOT IN
         (SELECT TRIM(Source_Value)
          FROM DP_TEDW.B0513_MAP
          WHERE Source_Value = TRIM(XVB0513.Source_Value)
          AND Source_Number = 0450)
     AND TRIM(XVB0513.Source_Value) <> ''
    ) dt1
CROSS JOIN
    (SELECT COALESCE(MAX(Ranked_Id), 1) AS Max_Key
     FROM DP_TEDW.B0513_MAP) dt2;

I am looking for suggestions on rewriting the above query.
6 REPLIES
Enthusiast

Re: code check

I'll start with the basic questions first.

1. Have you evaluated the explain plan for anything suspicious like a product join? I'd expect to see at least one given your cross join condition.
2. Are your statistics up to date for all join/filter columns?
3. Have you looked at DBQL to determine exactly which steps are taking the longest?

Outside of that, you could look at the DISTINCT operator. Depending on your version of Teradata, I believe there is a performance difference between DISTINCT and GROUP BY that can be significant. If the table you're looking at is large, I'd look at using a GROUP BY.

Another thing to look at may be your NOT IN. I've had good luck in the past using NOT EXISTS logic instead. Based on what you're doing, you should be able to get that to work.

Also, extensive use of character manipulation functions (TRIM, concatenation, etc.) can degrade performance as statistics are no longer applicable when you're changing the values.
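To make that concrete, a rough rewrite of your dt1 derived table along those lines might look like this (only a sketch against your object names, untested; NOT IN and NOT EXISTS treat NULLs differently, so verify the row counts match before swapping it in):

SELECT TRIM(BOTH FROM XVB0513.Source_Value) AS Source_Value
FROM   DP_WEDW.XV_B0513_S0450_01_D_01 XVB0513
WHERE  XVB0513.Source_Value <> ''                 -- filter on the bare column so stats stay usable
AND    NOT EXISTS
       ( SELECT 1
         FROM   DP_TEDW.B0513_MAP M
         WHERE  M.Source_Value  = TRIM(XVB0513.Source_Value)
         AND    M.Source_Number = 0450 )
GROUP  BY TRIM(BOTH FROM XVB0513.Source_Value)    -- GROUP BY in place of DISTINCT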

Without seeing the explain, there's not much we can do to help you.
l_k
Enthusiast

Re: code check

Hi,
We are using the TD12 version.
2. Collected statistics are up to date.
3. Can you please help me with how to see the DBQL steps to determine the longest one? (In DBQLStepTbl I can see the steps, but not in text form like the explain steps. Please provide the query for this.)

I have given the explain plan below:

1) First, we lock a distinct DP_WEDW."pseudo table" for read on a
RowHash to prevent global deadlock for DP_WEDW.XP_CalendarDate.
2) Next, we lock DP_SEDW.S045003_EDW_MAP_REP for access, we lock
DP_WEDW.XP_CalendarDate for read, and we lock DP_TEDW.B0513_MAP
for access.
3) We do an all-AMPs RETRIEVE step from DP_WEDW.XP_CalendarDate by
way of an all-rows scan with a condition of (
"(DP_WEDW.XP_CalendarDate.CalendarDate <= DATE '2011-02-11') AND
(DP_WEDW.XP_CalendarDate.CalendarDate >= DATE '2011-02-03')") into
Spool 5 (all_amps), which is built locally on the AMPs. The size
of Spool 5 is estimated with high confidence to be 9 rows (171
bytes). The estimated time for this step is 0.01 seconds.
4) We do an all-AMPs STAT FUNCTION step from Spool 5 (Last Use) by
way of an all-rows scan into Spool 8 (Last Use), which is
redistributed by hash code to all AMPs. The result rows are put
into Spool 6 (all_amps), which is built locally on the AMPs. The
size is estimated with high confidence to be 9 rows (225 bytes).
5) We do an all-AMPs RETRIEVE step from Spool 6 (Last Use) by way of
an all-rows scan into Spool 1 (used to materialize view, derived
table or table function REP) (all_amps), which is built locally on
the AMPs. The size of Spool 1 is estimated with high confidence
to be 9 rows (225 bytes). The estimated time for this step is
0.01 seconds.
6) We do an all-AMPs RETRIEVE step from Spool 1 (Last Use) by way of
an all-rows scan into Spool 11 (all_amps), which is duplicated on
all AMPs. The size of Spool 11 is estimated with high confidence
to be 1,575 rows (26,775 bytes). The estimated time for this step
is 0.01 seconds.
7) We execute the following steps in parallel.
1) We do an all-AMPs JOIN step from Spool 11 (Last Use) by way
of an all-rows scan, which is joined to DP_SEDW.S045003 by
way of an all-rows scan with no residual conditions. Spool
11 and DP_SEDW.S045003 are joined using a product join, with
a join condition of ("(TRIM(BOTH FROM (CASE WHEN
({LeftTable}.NBR = 1) THEN ({RightTable}.M_C_CUR_PL) WHEN
({LeftTable}.NBR = 2) THEN ({RightTable}.M_PLIRDCUR1) WHEN
({LeftTable}.NBR = 3) THEN ({RightTable}.M_PLIRDCUR2) WHEN
({LeftTable}.NBR = 4) THEN ({RightTable}.M_TP_FXBASE) WHEN
({LeftTable}.NBR = 5) THEN ({RightTable}.M_TP_IPAYCUR) WHEN
({LeftTable}.NBR = 6) THEN ({RightTable}.M_TP_NOMCUR) WHEN
({LeftTable}.NBR = 7) THEN ({RightTable}.M_TP_RTCUR0) WHEN
({LeftTable}.NBR = 8) THEN ({RightTable}.M_TP_RTCUR1) ELSE
({RightTable}.M_TP_SECCUR) END )))<> ''"). The result goes
into Spool 2 (used to materialize view, derived table or
table function XVB0513) (all_amps), which is redistributed by
the hash code of ((CASE WHEN ({LeftTable}.NBR = 1) THEN
({RightTable}.M_C_CUR_PL) WHEN ({LeftTable}.NBR = 2) THEN
({RightTable}.M_PLIRDCUR1) WHEN ({LeftTable}.NBR = 3) THEN
({RightTable}.M_PLIRDCUR2) WHEN ({LeftTable}.NBR = 4) THEN
({RightTable}.M_TP_FXBASE) WHEN ({LeftTable}.NBR = 5) THEN
({RightTable}.M_TP_IPAYCUR) WHEN ({LeftTable}.NBR = 6) THEN
({RightTable}.M_TP_NOMCUR) WHEN ({LeftTable}.NBR = 7) THEN
(DP_SEDW.S045003.M_TP_RTCUR0) WHEN (Rank()= 8) THEN
(DP_SEDW.S045003.M_TP_RTCUR1) ELSE
(DP_SEDW.S045003.M_TP_SECCUR) END)) to all AMPs. Then we do
a SORT to order Spool 2 by the sort key in spool field1
eliminating duplicate rows. The size of Spool 2 is estimated
with no confidence to be 5,353,146 rows (444,311,118 bytes).
The estimated time for this step is 3.54 seconds.
2) We do an all-AMPs RETRIEVE step from DP_TEDW.B0513_MAP by way
of an all-rows scan with a condition of (
"DP_TEDW.B0513_MAP.Source_Number = 450") into Spool 18
(all_amps), which is built locally on the AMPs. Then we do a
SORT to order Spool 18 by the sort key in spool field1
eliminating duplicate rows. The size of Spool 18 is
estimated with high confidence to be 54 rows (11,070 bytes).
The estimated time for this step is 0.01 seconds.
8) We do an all-AMPs SUM step to aggregate from Spool 2 by way of an
all-rows scan with a condition of ("(TRIM(BOTH FROM
XVB0513.SOURCE_VALUE ))<> ''"). Aggregate Intermediate Results
are computed globally, then placed in Spool 14.
9) We do an all-AMPs SUM step to aggregate from Spool 18 by way of an
all-rows scan. Aggregate Intermediate Results are computed
globally, then placed in Spool 19.
10) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from Spool 19 (Last Use) by
way of an all-rows scan into Spool 12 (all_amps), which is
duplicated on all AMPs.
2) We do an all-AMPs RETRIEVE step from Spool 14 (Last Use) by
way of an all-rows scan into Spool 13 (all_amps), which is
duplicated on all AMPs.
11) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from Spool 2 by way of an
all-rows scan with a condition of ("(TRIM(BOTH FROM
XVB0513.SOURCE_VALUE ))<> ''") into Spool 16 (all_amps), which
is built locally on the AMPs. Then we do a SORT to order
Spool 16 by row hash, and null value information in Spool 13
and Spool 12. Skip this retrieve step if null exists. The
size of Spool 16 is estimated with no confidence to be
5,353,146 rows (171,300,672 bytes). The estimated time for
this step is 0.22 seconds.
2) We do an all-AMPs RETRIEVE step from Spool 18 by way of an
all-rows scan into Spool 17 (all_amps), which is duplicated on
all AMPs. Then we do a SORT to order Spool 17 by row hash,
and null value information in Spool 13 and Spool 12. Skip
this retrieve step if null exists. The size of Spool 17 is
estimated with high confidence to be 9,450 rows (1,946,700
bytes).
12) We execute the following steps in parallel.
1) We do an all-AMPs JOIN step from Spool 16 (Last Use) by way of
an all-rows scan, which is joined to Spool 17 (Last Use) by
way of an all-rows scan. Spool 16 and Spool 17 are joined
using an exclusion merge join, with a join condition of ("NOT
(TRIM(BOTH FROM {LeftTable}.SOURCE_VALUE )IS NULL)"), and an
additional join condition of ("((TRIM(BOTH FROM
{LeftTable}.SOURCE_VALUE ))= Source_Value) AND ((TRIM(BOTH
FROM {LeftTable}.SOURCE_VALUE ))= Field_2)"), and null value
information in Spool 13 and Spool 12. Skip this join step if
null exists. The result goes into Spool 4 (used to
materialize view, derived table or table function dt1)
(all_amps), which is redistributed by hash code to all AMPs.
Then we do a SORT to order Spool 4 by the sort key in spool
field1 eliminating duplicate rows. The size of Spool 4 is
estimated with no confidence to be 5,353,146 rows (
444,311,118 bytes). The estimated time for this step is 0.78
seconds.
2) We do an all-AMPs RETRIEVE step from Spool 2 (Last Use) by way
of an all-rows scan with a condition of ("(TRIM(BOTH FROM
XVB0513.SOURCE_VALUE ))<> ''") into Spool 21 (all_amps), which
is redistributed by hash code to all AMPs. Then we do a SORT
to order Spool 21 by row hash, and null value information in
Spool 13 and Spool 12. Skip this retrieve step if there is no
null. The size of Spool 21 is estimated with no confidence to
be 5,353,146 rows (171,300,672 bytes). The estimated time for
this step is 0.54 seconds.
3) We do an all-AMPs RETRIEVE step from Spool 18 (Last Use) by
way of an all-rows scan into Spool 22 (all_amps), which is
built locally on the AMPs. Then we do a SORT to order Spool
22 by row hash, and null value information in Spool 13 and
Spool 12. Skip this retrieve step if there is no null. The
size of Spool 22 is estimated with high confidence to be 54
rows (11,070 bytes).
13) We execute the following steps in parallel.
1) We do an all-AMPs JOIN step from Spool 21 (Last Use) by way of
an all-rows scan, which is joined to Spool 22 (Last Use) by
way of an all-rows scan. Spool 21 and Spool 22 are joined
using an exclusion merge join, with a join condition of ("NOT
(TRIM(BOTH FROM {LeftTable}.SOURCE_VALUE )IS NULL)"), and an
additional join condition of ("((TRIM(BOTH FROM
{LeftTable}.SOURCE_VALUE ))= Source_Value) AND ((TRIM(BOTH
FROM {LeftTable}.SOURCE_VALUE ))= Field_2)"), and null value
information in Spool 13 (Last Use) and Spool 12 (Last Use).
Skip this join step if there is no null. The result goes into
Spool 4 (used to materialize view, derived table or table
function dt1) (all_amps), which is redistributed by hash code
to all AMPs. Then we do a SORT to order Spool 4 by the sort
key in spool field1 eliminating duplicate rows. The size of
Spool 4 is estimated with no confidence to be 54 rows (4,482
bytes). The estimated time for this step is 1.04 seconds.
2) We do an all-AMPs SUM step to aggregate from DP_TEDW.B0513_MAP
by way of an all-rows scan with no residual conditions.
Aggregate Intermediate Results are computed globally, then
placed in Spool 24. The size of Spool 24 is estimated with
high confidence to be 1 row (19 bytes). The estimated time
for this step is 0.01 seconds.
14) We do an all-AMPs RETRIEVE step from Spool 24 (Last Use) by way of
an all-rows scan into Spool 3 (used to materialize view, derived
table or table function dt2) (all_amps), which is built locally on
the AMPs. The size of Spool 3 is estimated with high confidence
to be 1 row (25 bytes). The estimated time for this step is 0.00
seconds.
15) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of
an all-rows scan into Spool 28 (all_amps), which is duplicated on
all AMPs. The size of Spool 28 is estimated with high confidence
to be 175 rows (2,975 bytes). The estimated time for this step is
0.01 seconds.
16) We do an all-AMPs JOIN step from Spool 28 (Last Use) by way of an
all-rows scan, which is joined to Spool 4 (Last Use) by way of an
all-rows scan. Spool 28 and Spool 4 are joined using a product
join, with a join condition of ("(1=1)"). The result goes into
Spool 27 (all_amps), which is built locally on the AMPs. The size
of Spool 27 is estimated with no confidence to be 54 rows (1,998
bytes). The estimated time for this step is 0.01 seconds.
17) We do an all-AMPs STAT FUNCTION step from Spool 27 (Last Use) by
way of an all-rows scan into Spool 31 (Last Use), which is assumed
to be redistributed by value to all AMPs. The result rows are put
into Spool 26 (group_amps), which is built locally on the AMPs.
The size is estimated with no confidence to be 54 rows (5,454
bytes).
18) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 26 are sent back to the user as the result
of statement 1.

Thanks.
Enthusiast

Re: code check

For step-level information, the step number corresponds to the step number in the explain plan. Take a look at DBQLExplainTbl to see the explain plan as it was generated at execution time and compare the steps there to what's in DBQLStepTbl. You can look at start/end time, CPU, I/O, etc.
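Here is a rough sketch of the kind of queries to use (this assumes step and explain logging are enabled in DBQL for your user; the ProcID/QueryID literals are placeholders you would take from the first query, and a few column names can differ slightly between releases, so check DBC.Columns if one of them errors out):

/* 1. Find the request and note its ProcID and QueryID */
SELECT ProcID, QueryID, StartTime, FirstRespTime,
       AMPCPUTime, TotalIOCount, QueryText
FROM   DBC.DBQLogTbl
WHERE  UserName = 'YOUR_USER'                  -- placeholder
AND    StartTime > CURRENT_TIMESTAMP - INTERVAL '1' DAY
ORDER  BY StartTime DESC;

/* 2. Step-level timings; StepLev1Num/StepLev2Num line up with the
      numbered steps in the explain */
SELECT StepLev1Num, StepLev2Num, StepName,
       StepStartTime, StepStopTime, CPUTime, IOCount, RowCount
FROM   DBC.DBQLStepTbl
WHERE  ProcID = 100 AND QueryID = 123456       -- placeholders
ORDER  BY StepLev1Num, StepLev2Num;

/* 3. The explain text as captured at run time, to read alongside the steps */
SELECT ExpRowNo, ExplainText
FROM   DBC.DBQLExplainTbl
WHERE  ProcID = 100 AND QueryID = 123456       -- placeholders
ORDER  BY ExpRowNo;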
l_k
Enthusiast

Re: code check

Hi,

The step for the CROSS JOIN side of the query is taking 47 seconds:

SELECT COALESCE(MAX(Ranked_Id), 1) AS Max_Key
FROM DP_TEDW.B0513_MAP;

Explain plan:
Explanation
1) First, we lock DP_TEDW.B0500_KEY_EVENT in view
DP_VEDW.V0500_KEY_EVENT for access.
2) Next, we do an all-AMPs SUM step to aggregate from
DP_TEDW.B0500_KEY_EVENT in view DP_VEDW.V0500_KEY_EVENT by way of
an all-rows scan with no residual conditions. Aggregate
Intermediate Results are computed globally, then placed in Spool 3.
The input table will not be cached in memory, but it is eligible
for synchronized scanning. The size of Spool 3 is estimated with
high confidence to be 1 row (23 bytes). The estimated time for
this step is 47.27 seconds.
3) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of
an all-rows scan into Spool 1 (group_amps), which is built locally
on the AMPs. The size of Spool 1 is estimated with high
confidence to be 1 row (29 bytes). The estimated time for this
step is 0.00 seconds.
4) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 47.28 seconds.

The table DP_TEDW.B0513_MAP holds about 200 GB of data. Is there any option to find MAX(Ranked_Id) in a faster way? (Ranked_Id is not an index column and is also not compressed.)

Suggestions would be appreciated.
Enthusiast

Re: code check

Well, I am not sure there is a faster way to find the max other than using the MAX function, but as suggested above, try replacing DISTINCT with GROUP BY and removing the TRIM calls from the WHERE clauses, since I suppose they prevent the existing statistics from being used. See if that reduces the estimated time in the explain plan.
Enthusiast

Re: code check

Adding compression to your table would certainly reduce the time for a full table scan. Alternatively, you could look at placing a secondary index on Ranked_Id, but you'll need to verify that it helps by looking for it in the explain plan; ideally, you'd get the optimizer to scan the index rather than the table. Depending on how ETL interacts with this table, a single-table join index with that column only could also reduce the scan time.
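For example (sketches only; the join index name here is made up, and whether either option pays off depends on how your loads maintain this table):

-- A non-unique secondary index on the ranking column
CREATE INDEX (Ranked_Id) ON DP_TEDW.B0513_MAP;

-- Or a narrow single-table join index the optimizer can scan
-- instead of the 200 GB base table
CREATE JOIN INDEX DP_TEDW.B0513_MAP_JI AS
SELECT Ranked_Id
FROM   DP_TEDW.B0513_MAP;

-- Then confirm the plan actually picks one of them up
EXPLAIN SELECT COALESCE(MAX(Ranked_Id), 1) FROM DP_TEDW.B0513_MAP;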

Again, with all of these indexing options you'll want to look at your explain plan before and after to make sure the optimizer is using them. If not, drop them because then you're getting all of the overhead with none of the benefit.