I am trying to tune a query which has been badly affected by the use of NOT IN Keyword over a very latge table which contains billions of rows.my obervation for the query suggests that there is only way to tune the query is to replace the NOt in with some alternatives.i tried using the NOT exists keyword but no luck.then i tried left outer join alongwith filter condition column is not null(although the columns are already a not null column) in place of not in and that was awesome.but the problem here i am seeing differences in results.
is that any scenarion where there is a differnece between the functinality of not in and left outer join.as per my understanding both should behave the same way.
Can anyone elaborate something over here.
It got tuned by Multicolumn Stats.
But apart from the query,functionality wise can we use left outer join in place of not in.are there possible scenarios for that to happen in Teradata.
They should return same dataset if the queries are written correctly. Here is a very simple example:
CREATE VOLATILE TABLE TABA ( COLA INTEGER ) NO PRIMARY INDEX ON COMMIT PRESERVE ROWS; CREATE VOLATILE TABLE TABB ( COLA INTEGER ) NO PRIMARY INDEX ON COMMIT PRESERVE ROWS; INSERT INTO TABA VALUES (1); INSERT INTO TABA VALUES (2); INSERT INTO TABA VALUES (3); INSERT INTO TABA VALUES (4); INSERT INTO TABB VALUES (2); INSERT INTO TABB VALUES (3); SELECT * FROM TABA WHERE COLA NOT IN (SELECT COLA FROM TABB); SELECT TABA.* FROM TABA LEFT OUTER JOIN TABB ON TABA.COLA = TABB.COLA WHERE TABB.COLA IS NULL;
Sounds like a good example & results are same also.but will the functionality of not in and left outer join be same in all scenarios provided the queries have been written correctly.
Can you share the DDL & actual Select?
NOT IN might be really bad, but I wouldn't expect NOT EXISTS performing worse than LEFT JOIN/IS NULL.
What's your Teradata release, 15.10?
Re the question about subquery conditions versus joins:
The two are not semantically equivalent. A subquery condition will never result in more than one result row from each source row. A join however can result in an arbitrarily large number of result rows from one source row if there are duplicates of the join column values in the joined table.
If you want to tune a query, do Explains to see what the options actually do. Using the simple example above, the outer join requires far fewer steps that not-in. But MINUS uses an even simpler plan:
select * from taba
select * from tabb;