Remove "Duplicate" rows

Analytics
Enthusiast

Remove "Duplicate" rows

Hi,

 

I have a table with the following format 

SITECATIDVALUENAME1NAME2
06174016763038832012DVD Edition YearRelease Year
06174016763038832012Release YearDVD Edition Year
0155101401676303755Cubic ZirconiaMain StoneSecondary Stone
0155101401676303755Cubic ZirconiaSecondary StoneMain Stone
050677401676303734PewterMetalBase Metal
050677401676303734PewterBase MetalMetal

 

as you can see, every 2 rows are basiclly duplicates, just with opposite NAME1 and NAME2 values.

I want to remove these "duplicate" rows and keep only rows of one of the pairs.

I tried to use

row_number() over(partition by site, cat, id, value order by name1) as row_num

function but the issue is that i have rows that looks like this

SITECATIDVALUENAME1NAME2
046283401676303654NoNon-Domestic ProductModified Item
046283401676303654NoCustom BundleModified Item
046283401676303654NoNon-Domestic ProductCustom Bundle
046283401676303654NoModified ItemCustom Bundle
046283401676303654NoModified ItemNon-Domestic Product
046283401676303654NoCustom BundleNon-Domestic Product

 

where I have 3 different names for each ID and value (Modified Item, Custom Bundle and Non-Domestic Product).

When I'm using the partition by function i get only 1 row instead of 3 that i want to have.

 

Is there a way to do it in Teradata?

 

Thanks,

Yonatan


Accepted Solutions
Ambassador

Re: Remove "Duplicate" rows

Add both names to the partition using least/greatest:

 

qualify row_number() over(partition by site, cat, id, value, least(name1, name2), greatest(name1, name2) order by cat) = 1

 

1 ACCEPTED SOLUTION
2 REPLIES 2
Ambassador

Re: Remove "Duplicate" rows

Add both names to the partition using least/greatest:

 

qualify row_number() over(partition by site, cat, id, value, least(name1, name2), greatest(name1, name2) order by cat) = 1

 

Highlighted
Enthusiast

Re: Remove "Duplicate" rows

Thanks a lot!

 

It worked perfectly :)