Query Performance Optimization Help

UDA

Query Performance Optimization Help

I need help in optimizing the query below:
Query is about identifying UK of deletes in a large table (70+ GB) using the pre-load-image and post-load-image. Here is the query we have written but this consume a ton of CPU.

The two tables are joined on the UK in the query below:

Need urgent help for performance improvement or any other alternatives, immediate responses would be greatly appreciated.

LOCK TABLE pre_load_image FOR access
LOCK TABLE post_load_image FOR access
SELECT 'sap_clnt_id=' || COALESCE( A.sap_clnt_id , '' ) || '~co_cd=' || COALESCE( A.co_cd , '' ) || '~fin_doc_nbr=' || COALESCE( A.fin_doc_nbr , '' ) || '~fscl_yr_nbr=' || COALESCE( A.fscl_yr_nbr , '' ) || '~fin_doc_line_nbr=' || COALESCE( A.fin_doc_line_nbr , '' )
FROM pre_load_image A
WHERE NOT EXISTS (
SELECT sap_clnt_id , co_cd , fin_doc_nbr , fscl_yr_nbr
FROM post_load_image B
WHERE A.sap_clnt_id = B.sap_clnt_id
AND A.co_cd = B.co_cd
AND a.fin_doc_nbr = b.fin_doc_nbr
AND A.fscl_yr_nbr = b.fscl_yr_nbr
AND A.fin_doc_line_nbr = b.fin_doc_line_nbr ) ;

Thanks in advance!
_Sree
3 REPLIES
Enthusiast

Re: Query Performance Optimization Help

I presume all the joining columns are part of PI in both the tables. From an excerpts of a post from Mr. Dieter ( a supreme supreme being) I remember that COALESCE on nullable PI columns results in redistibuting the whole table. --> This may be one area you may look into.

Another way you may look into it by changing the NOT Exists to an outer join and then filter for nulls. You need to compare the explain plans though. Generally NOT exists is better than outer join. But based on your table structure ( if the joining columns are PI) the join should not be a bad idea.

Give a try & good luck
Enthusiast

Re: Query Performance Optimization Help

Try this. A Minus often works faster than a "Not In/ Not Exists"

LOCK TABLE pre_load_image FOR access
LOCK TABLE post_load_image FOR access
SELECT 'sap_clnt_id=' || COALESCE( A.sap_clnt_id , '' ) ||
'~co_cd=' || COALESCE( A.co_cd , '' ) ||
'~fin_doc_nbr=' || COALESCE( A.fin_doc_nbr , '' ) ||
'~fscl_yr_nbr=' || COALESCE( A.fscl_yr_nbr , '' ) ||
'~fin_doc_line_nbr=' || COALESCE( A.fin_doc_line_nbr , '' )
FROM
( SELECT A.sap_clnt_id
, A.co_cd
, a.fin_doc_nbr
, A.fscl_yr_nbr
, A.fin_doc_line_nbr
FROM pre_load_image A
MINUS
SELECT B.sap_clnt_id
, B.co_cd
, B.fin_doc_nbr
, B.fscl_yr_nbr
, b.fin_doc_line_nbr
FROM post_load_image B )
;
Teradata Employee

Re: Query Performance Optimization Help

Let's go back to basics. Maybe the SQL is fine and the problem lies elsewhere.

What is the Primary Index of each table? Could it be extremely non-unique or highly skewed?
Do both tables have the same PI?
Are all the PI fields referenced in the join criteria?
Have you collected stats on the PI of both tables?
Have you done an EXPLAIN?
Are the fields that make up your "unique key" actually nullable, or NOT NULL (in which case you don't need COALESCE)?

And finally, is the issue that the query takes a long time to run, or that it consumes what seems like a lot of CPU?