Key Level Duplicates

UDA

Key Level Duplicates

Hi All,

I was trying to select all the columns which are not key level duplicates.For that I have written a query as

SELECT *
FROM Tab_a
QUALIFY ROW_NUMBER() OVER (partition by col1,col2 order by 1) = 1;

This gives the desired result.However job fails due to no more spool space error.Do you have any idea how to tune this query?

13 REPLIES

Re: Key Level Duplicates

did you try something like...
select
key_field_1
,key_field_2
,count(*)
from
table_a
having count(*)=1
group by 1,2
--for the non duplicates, if you'd like to find the duplicates change the '=' to '>' without the ticks.

Re: Key Level Duplicates

Thanks for the suggestion.

However I have got 5 columns col1,col2,col3,col4,col5.Out of which I want to select all the columns by doing the grouping at key level i.e, col1 and col2.So the above query does nt work in this scenario Where I want to select all columns and identifying the duplicates at column level.

Re: Key Level Duplicates

Hi,

if there is duplicate records available in table then only we retrieve those rows but teradata database didn't allow duplicate records then how the above query will works?

Please clarify this doubt.

Re: Key Level Duplicates

Ashok's response first: simply include all the rows if its a not a very wide table and add all rows to the group by and specify having count(*)>1, this will show you the duplicate rows and their population.

Udaya..the query will work but return no rows if you've specified "having count(*)>1", if there aren't any duplicates for the specified column set, it will return no rows.

Re: Key Level Duplicates

Hi Elvis,

Do u mean to say that I have to write a query like this?

Sel col1,col2,col3,col4,col5
from Tab1
group by 1,2,3,4,5
having count(*)>1 ?

But this will filter the whole duplicate row.I want to select all the rows by filtering the duplicate rows at only 2 columns i.e col1 and col2.I mean I want a result very similar to

SELECT *
FROM Tab_a
QUALIFY ROW_NUMBER() OVER (partition by col1,col2 order by 1) = 1;
N/A

Re: Key Level Duplicates

Would this work?

SELECT * FROM Tab_A WHERE (col1, col2) IN (SELECT Col1, Col2 FROM TAB_A GROUP BY 1,2 HAVING COUNT(*) = 1);

Re: Key Level Duplicates

This wont work because..subquery will eliminate the duplicates based on col1 and col2 though....Main query will retrieve all the rows along with duplicates because of in condition...
N/A

Re: Key Level Duplicates

In the query I posted, the sub query identifies all combinations of col1/col2 that occur once and only once. This is then fed as a filter condition to the outer query to retrieve all data rows with those values in col1/col2. Based on your original questions, this should give you what you're looking for. This will return all rows from Tab_A that are not key-level duplicates (assuming the key is col1/col2). If a particular col1/col2 combination occurs multiple times, none of that key values associated rows will be output from the query.

Maybe I'm a little confused then on what you're trying to do. Do you want to see the full rows that have duplicate values for col1 and col2? Or only the rows that do not have duplicate values
for col1 and col2?

Re: Key Level Duplicates

Hi Mnylin,

Sorry for not being clear.For eg: We have a table Tab_a

Col1 Col2 Col3

1 Ashok Y
1 Ashok N
2 Mnylin Y

The output for the below query will be:

SELECT *
FROM Tab_a
QUALIFY ROW_NUMBER() OVER (partition by col1,col2 order by 1) = 1;

Col1 Col2 Col3

1 Ashok Y
2 Mnylin Y

The output for the query which you gave will be:

SELECT * FROM Tab_A WHERE (col1, col2) IN (SELECT Col1, Col2 FROM TAB_A GROUP BY 1,2 HAVING COUNT(*) = 1);

Col1 Col2 Col3

2 Mnylin Y

Here I dont want to exclude duplicates.I want to select a row even if we have duplicates and exclude the other duplicate at key level(col1/col2).I want to fetch the result which is very similar to query1.I hope this is clear.