Key Level Duplicates

UDA
Enthusiast

Re: Key Level Duplicates

OK. I think I understand better what's going on now. You just want to get the first occurance (for some definition of first) for each key value pair, regardless of how many rows have that key.

You may want to break it up into a couple of steps and load the results into a temporary table along the way. The first step might be to take just the rows that do not have duplicates and put them in your "results set" table. The next step would be to identify only the rows that have duplicate keys (via a subquery similar to the one I proposed) and then perform the ROW_NUMBER filter on that. It should (hopefully) minimize your spool space issues if you use a Global Temporary Table to hold the result set.

How large is the table you're doing this on and how many duplicate values do you expect? Is there any other way to identify the first/correct row to return (effective dates, logical delete indicator, insert timestamp, etc)?
Fan

Re: Key Level Duplicates

You can do this in the below way and this is one way of doing it.Hope it will be helpful to you

ct tab_a(
col1 int
,col2 varchar(10)
,col3 char(1));

ins tab_a(1,'ashok','y');
ins tab_a(1,'ashok','n');
ins tab_a(2,'mnylin','y');

alter table tab_a add flag int;

ins tab_a(col1,col2,col3,flag)
sel col1
,col2
,col3
,2
from tab_a qualify row_number() over(partition by 1 order by 1) = 2;

delete from tab_a t1
where t1.flag is null
and exists(sel null from tab_a t2
where t1.col1 = t2.col1
and t2.flag = 2);

alter table tab_a drop flag;

sel * from tab_a

1 ashok n
2 mnylin y

Enthusiast

Re: Key Level Duplicates

you can try

1:

sel col1,col2,max(col3),max(col4),max(col5)

from tab a

group by 1,2,3

/** Yes this would not alwys give consistent records as data from two rows may be combined **/

2:

or you can use your query for only dups to avoid spool space

Sel col1..coll5

from tab a

where (col1,col2 )

not in

(

sel

col1,

col2

from tab a group by 1,2 having count(*) >1 )

union all

sel col1..col5

from

(

Sel col1..coll5

from tab a

where (col1,col2 )

 in

(

sel

col1,

col2

from tab a group by 1,2 having count(*) >1 )

) x

QUALIFY ROW_NUMBER() OVER (partition by col1,col2 order by 1) = 1;

/** Query became bit messy ,you can probably improve it with temp tables or  using 'with ' clause **/

Enthusiast

Re: Key Level Duplicates

Try executing original query again after collecting multi column stat on col1 and col2,Might work.