Code check

Database
Enthusiast

Code check

I have an existing code here and as per my understanding we need not check for C.STATUS because if there is no record for C.TID , then there will of course be no Status. Do you think we will ever need to check the status here ?
SELECT DISTINCT A.ID, A.TID
FROM table1 A
INNER JOIN table2 B
ON (A.ID = B.ID AND B.STATUS='OPEN')
LEFT JOIN table3 C
ON (A.TID = C.TID AND C.STATUS = 'enable' )
WHERE C.TID IS NULL

Thank you!
2 REPLIES
Enthusiast

Re: Code check

When you join tables, then teradata selects rows, that has not null values in join-fields.
You can see this if you look at the query plan of your query.

For example:
create table zzz(a int,b int);
insert into zzz (a,b) values (1,null)
create table zzz2(c int,b int);
insert into zzz2 (c,b) values (3,null)
select * from zzz t1 left join zzz2 t2 on t1.b=t2.b

Has plan:
Explain select * from zzz t1 left join zzz2 t2 on t1.b=t2.b

...
2) We do an all-AMPs RETRIEVE step from t2 by way of
an all-rows scan with a condition of ("NOT (t2.b
IS NULL)") into Spool 3 (all_amps), which is redistributed by
the hash code of (t2.b) to all AMPs. Then we do a
SORT to order Spool 3 by row hash. The size of Spool 3 is
estimated with no confidence to be 90 rows (1,890 bytes).
The estimated time for this step is 0.01 seconds.
5) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of a
RowHash match scan, which is joined to Spool 3 (Last Use) by way
of a RowHash match scan. Spool 2 and Spool 3 are left outer
joined using a merge join, with a join condition of ("b = b").
The result goes into Spool 1 (group_amps), which is built locally
on the AMPs. The size of Spool 1 is estimated with no confidence
to be 110 rows (4,070 bytes). The estimated time for this step is
0.02 seconds.
...

So, your table3 is useless in this query...

ps:
Your query can be rewrote as:
SELECT A.ID, A.TID
FROM table1 A INNER JOIN table2 B ON A.ID = B.ID
WHERE B.STATUS='OPEN'
GROUP BY A.ID, A.TID
Junior Contributor

Re: Code check

If you remove the second join, you'll get a totally different result set.

The first join retrieves all IDs from A where there's an OPEN status in B.
The second join + WHERE removes those IDs where there's no ENABLE status in C.

Rewritten using EXISTS (i don't now if you still need the DISTINCT)
SELECT A.ID, A.TID
FROM table1 A
WHERE EXISTS
(
SELECT * FROM table2 B
WHERE A.ID = B.ID
AND B.STATUS='OPEN'
)
AND NOT EXISTS
(
SELECT * FROM table3 C
WHERE A.TID = C.TID
AND C.STATUS = 'enable'
)

Dieter