Teradata Perfomance

Database
Enthusiast

Teradata Perfomance

Hi guys,

This looks to me like a strange one!

I have a query below:

UPDATE Table1
SET col1 =
CASE WHEN (col2 = 'Y' AND col3 IN ('DQ','FQ','NC','NB','NA'))
OR col3 IN ('RN','RQ','PR')
THEN countRows
ELSE 0
END
WHERE Table1.col4= 'H'
AND NOT
(
Table1.col2 = 'N'
AND Table1.col3 IN ('NC','NB','NA')
)
AND NOT EXISTS
(
SELECT col1
FROM tmpTable1
WHERE col1 = Table1.col5
);

The tmpTable1 never gets populated so it was decided to remove the NOT EXISTS piece of code. As a result, the time taken to run this update has increased by 20 minutes.

Can anyone help me in how this may be. I have jigged around with statistics to no avail. Here is a copy of the execution plans with the only real difference being that one uses a merge update and the other an all-amps update.

Explanation (with NOT EXISTS)

1) First, we lock a distinct "pseudo table" for read on a
RowHash to prevent global deadlock for tmpTable1.
2) Next, we lock a distinct "pseudo table" for write on a
RowHash to prevent global deadlock for Table1.
3) We lock tmpTable1 for read, and we lock Table1 for write.
4) We do an all-AMPs JOIN step from Table1
by way of an all-rows scan with a condition of (
"((Table1.col2 <> 'N') OR
((Table1.col3 <> 'NC ') AND
((Table1.col3 <> 'NB ') AND
(Table1.col3<> 'NA ')))) AND
(Table1.col4= 'H')"), which is joined to tmpTable1 by way
of an all-rows scan with no residual conditions.
Table1 and tmpTable1 are joined using an exclusion
merge join, with a join condition of (
"tmpTable1.col1 = Table1.col5") where unknown
comparison will be ignored. The result goes into Spool 1
(all_amps), which is redistributed by hash code to all AMPs. Then
we do a SORT to order Spool 1 by the sort key in spool field1.
The size of Spool 1 is estimated with no confidence to be 118,105
rows. The estimated time for this step is 0.68 seconds.
5) We do a MERGE Update to Table1 from Spool 1 (Last Use) via ROWID.
6) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> No rows are returned to the user as the result of statement 1.

Explanation(WITHOUT NOT EXISTS)
1) First, we lock a distinct "pseudo table" for write on
a RowHash to prevent global deadlock for Table1.
2) Next, we lock Table1 for write.
3) We do an all-AMPs UPDATE from Table1 by
way of an all-rows scan with a condition of (
"(Table1.col4 = 'H') AND
((Table1.col2 <> 'N') OR
((Table1.col3 <> 'NC ') AND
((Table1.col3 <> 'NB ') AND
(Table1.col3 <> 'NA '))))").
4) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> No rows are returned to the user as the result of statement 1.

Thanks for your help

Karen
5 REPLIES
Enthusiast

Re: Teradata Perfomance

Could it be just one time it happenned or you can repro this at will ?
Enthusiast

Re: Teradata Perfomance

Hi,

The code is slower everytime I run it. I have run the before and after code on different machines and data and turns out the code is consistently slower.

It doesnt make much sense seeing as the table I removed never gets populated!

Any ideas welcome

Thanks

Karen
Enthusiast

Re: Teradata Perfomance

The only thing I see is the way it is going after the WHERE clause (filter condition).

with NOT EXISTS clause

...
4) We do an all-AMPs JOIN step from Table1
by way of an all-rows scan with a condition of (
"((Table1.col2 <> 'N') OR
((Table1.col3 <> 'NC ') AND
((Table1.col3 <> 'NB ') AND
(Table1.col3<> 'NA ')))) AND
(Table1.col4= 'H')"), which is joined to tmpTable1 by way
of an all-rows scan with no residual conditions.
Table1 and tmpTable1 are joined using an exclusion
merge join, with a join condition of (
"tmpTable1.col1 = Table1.col5") where unknown
comparison will be ignored. The result goes into Spool 1
(all_amps), which is redistributed by hash code to all AMPs. Then
we do a SORT to order Spool 1 by the sort key in spool field1.
The size of Spool 1 is estimated with no confidence to be 118,105
rows. The estimated time for this step is 0.68 seconds.
...

Explanation(WITHOUT NOT EXISTS)
...
3) We do an all-AMPs UPDATE from Table1 by
way of an all-rows scan with a condition of (
"(Table1.col4 = 'H') AND
((Table1.col2 <> 'N') OR
((Table1.col3 <> 'NC ') AND
((Table1.col3 <> 'NB ') AND
(Table1.col3 <> 'NA '))))").

The plan certainly looks different and I can tell from my vast Oracle experience with SQL tuning, this could be a legit reason for query slowdown in Oracle, but with Teradata I am just an infant :-).
Enthusiast

Re: Teradata Perfomance

The order of checking the columns for literals is different between the 2 scenarios.
Enthusiast

Re: Teradata Perfomance

Hi,

Thanks for the response.

I reordered the literals but this did not make any difference to the performance of the query, even though the explain plans show the same ordering.