I need help in optimizing the query below: Query is about identifying UK of deletes in a large table (70+ GB) using the pre-load-image and post-load-image. Here is the query we have written but this consume a ton of CPU.
The two tables are joined on the UK in the query below:
Need urgent help for performance improvement or any other alternatives, immediate responses would be greatly appreciated.
LOCK TABLE pre_load_image FOR access LOCK TABLE post_load_image FOR access SELECT 'sap_clnt_id=' || COALESCE( A.sap_clnt_id , '' ) || '~co_cd=' || COALESCE( A.co_cd , '' ) || '~fin_doc_nbr=' || COALESCE( A.fin_doc_nbr , '' ) || '~fscl_yr_nbr=' || COALESCE( A.fscl_yr_nbr , '' ) || '~fin_doc_line_nbr=' || COALESCE( A.fin_doc_line_nbr , '' ) FROM pre_load_image A WHERE NOT EXISTS ( SELECT sap_clnt_id , co_cd , fin_doc_nbr , fscl_yr_nbr FROM post_load_image B WHERE A.sap_clnt_id = B.sap_clnt_id AND A.co_cd = B.co_cd AND a.fin_doc_nbr = b.fin_doc_nbr AND A.fscl_yr_nbr = b.fscl_yr_nbr AND A.fin_doc_line_nbr = b.fin_doc_line_nbr ) ;
I presume all the joining columns are part of PI in both the tables. From an excerpts of a post from Mr. Dieter ( a supreme supreme being) I remember that COALESCE on nullable PI columns results in redistibuting the whole table. --> This may be one area you may look into.
Another way you may look into it by changing the NOT Exists to an outer join and then filter for nulls. You need to compare the explain plans though. Generally NOT exists is better than outer join. But based on your table structure ( if the joining columns are PI) the join should not be a bad idea.
Let's go back to basics. Maybe the SQL is fine and the problem lies elsewhere.
What is the Primary Index of each table? Could it be extremely non-unique or highly skewed? Do both tables have the same PI? Are all the PI fields referenced in the join criteria? Have you collected stats on the PI of both tables? Have you done an EXPLAIN? Are the fields that make up your "unique key" actually nullable, or NOT NULL (in which case you don't need COALESCE)?
And finally, is the issue that the query takes a long time to run, or that it consumes what seems like a lot of CPU?