Need help on deleting duplicate records of a intraday run

Database

Need help on deleting duplicate records of a intraday run

Hi All,

I have SET table, which is of only insert type (daily feed). It's a stage table and loading from a flatfile. Now, I have make it as a intraday run and  need to pick only the data loaded for a particular instance, process it and load to a final table.

For this purpose, i have added a flag column which will be updated as 'N' each time when the load runs and a 'Y' value will be inserted for the new records of that particular instance.

As I made this change, duplicate records (one with Flag 'Y' and othe with Flag 'N') got inserted for the first two instances.

For the third instance, when I am trying to update the Flag as 'N', it throwing a duplicate error-2802 (which is obvious as this is a SET table).

Can anyone suggest a solution how i can come out of this problem and can handle the intraday run?

I have a option of converting the table to MULTISET to handle this. I want to know whether this is a optimum solution with no side effects. Please suggest

Thanks, Prasanth

2 REPLIES
N/A

Re: Need help on deleting duplicate records of a intraday run

if it is a stage table, we can delete the records after they are loaded into main table right? so your table will only store records which are only need to be processed.

load new records with flag = 'Y' in intra day mode, when you want to process the records update all the records to 'N' and process all the 'N' records to main table, after sucessful process delete all the 'N' Records in stage table. now your table is ready for next instance.

Re: Need help on deleting duplicate records of a intraday run

Hi Prasanth,

You can have a load id in your stage table to differentiate history records and new load records. Kind of PROCESS_ID or REC_CRE_JOB_NUM instead of 'Y' and 'N' flags. From then if its only insert then you can extract the latest load id's records to your target table.

Example:

Load1 - 

col1  Col2  Col3     Process_Id

a       b      c         1

Load2 - 

col1  Col2  Col3     Process_Id

a       b      c         2


Load3 - 

col1  Col2  Col3     Process_Id

a       b      c         3

In this example your process_id is different for each run and it wont result in duplicate now.

Thanks!

-Sakthi