Use of NOT IN for a fact join

Database

Use of NOT IN for a fact join

Hi,

here is my requirement.

I have a transactional fact table with 700 mil records for a month. I need to create a new table (say FACT_NEW) same as fact and load it for all months of the year such that -

if I am running for October 2013, I will compare all months of the year (till Jan 2013) one by one with October data and load record in FACT_NEW if record is not present for that month.

My solution for this is below statement in a SP to be executed in a loop -

Insert into  FACT_NEW

select (..) from FACT where date=October and key_val NOT IN (select .. from FACT where date = date -1);

This is taking a lot of time. Can anyone help me in optimizing this ?

Regards,

Nishchint

8 REPLIES

Re: Use of NOT IN for a fact join

Guys, any sugestions ??

Senior Supporter

Re: Use of NOT IN for a fact join

I don't get fully what you wanne achive

Can you explain with some data?

also not exists shows better performance

select *
from fact o
where date_col / 100 + 190000 = 201310
and not exists (select * from fact i where i.key_val = o.key_val and date_col / 100 + 190000 = 201309 )

Re: Use of NOT IN for a fact join

I will take a simpler example for March month -

FACT - original fact table

FACT_NEW - new fact table

If I am running for March 2013, I will compare as below -

1st iteration - compare FACT (march) with FACT_NEW (feb) data --> and load records in FACT_NEW if not present with date as Feb.

2nd iteration - compare FACT (march) with FACT_NEW (Jan) data --> and load records in FACT_NEW if not present with date as March.

I am using below statement now (this is for Oct) -

INSERT INTO FACT_NEW                          

(..)

select ..

FROM FACT A

WHERE

date = recDate

AND NOT EXISTS ( SELECT 1 FROM FACT B  WHERE Date= next_date);

--where for 1st Iteration recDate = '2013-10-31' and next_date = '2013-09-30'

Query now is running better but is still taking a lot of time since this is to be repeated for all months till Jan 2013.

Can we fine tune it further ?

Senior Supporter

Re: Use of NOT IN for a fact join

What is the PI of the fact table

Enthusiast

Re: Use of NOT IN for a fact join

I think you can convert this subquery to a set operation using EXCEPT Set operator. It will improve a lot of performance. 

Khurram
Junior Contributor

Re: Use of NOT IN for a fact join

Hi Nishchint,

did you consider using a MERGE instead?

Match on the logical PK and next_date and do an Insert only WHEN NOT MATCHED. This avoids the NOT IN and might do a direct merge without a spool.

Teradata Employee

Re: Use of NOT IN for a fact join

Am still not clear .... 

if I am running for October 2013, I will compare all months of the year (till Jan 2013) one by one with October data and load record in FACT_NEW if record is not present for that month.

If data is for October 2013, why are you comparing it with other months data? Or you just dont want to insert duplicate rows which might have same data but different dates?

Re: Use of NOT IN for a fact join

Hello all

thanks for your replies.

Ulrich - PI is cust_id, tran_no and date (PPI)

Khurram, Dieter - I tried both EXCEPT and MERGE and merge worked faster for me.

Adeel - its a reporting requirement where they need a a dummy fact with Dummy records.

I will go with merge for now. :)