How to update the duplicate records in Teradata

Database

How to update the duplicate records in Teradata

I have a requirement to update the duplicate records in teradata. For E.G. if col, col, col, col have same values in the table, those records are regarded as duplicates and duplicat...

5 REPLIES
Enthusiast

Re: How to update the duplicate records in Teradata

Hi Siddesh,

Is this your requirement - if there are 3 rows in a table having the same data in all the columns then you need

to mark only 2 of them as duplicates and one remains unchanged or you want to mark all 3 rows as

duplicates ?

if possible please provide a sample example

Regards

R.Rajeev

Enthusiast

Re: How to update the duplicate records in Teradata

Hi Siddesh,

As commented by rajeev, sample would provide more clarity on your requirement.

on a brief note:

You can use a self left join to table with left joining table to be a derived table on main table with duplicated PI (if you are checking for complete duplicated row) and you can use case statements to compare the columns to mark the duplicate flag.

cheers 

Jagdish

Senior Apprentice

Re: How to update the duplicate records in Teradata

Hi Siddesh,

following sets the duplicate flag for all duplicate rows

UPDATE tab

SET dupflag = 1

WHERE (col1,col2,col3) IN

(SELECT col1,col2,col3 

 FROM tab

 GROUP BY 1,2,3

 HAVING COUNT(*) > 1

)


Of course those columns should be NOT NULL.

Depending on the percentage of duplicate rows it might be more efficient to the the calculation during an insert/select using an OLAP function:

CASE WHEN COUNT(*) OVER (PARTITION BY col1,col2,col3) > 1 THEN 1 ELSE 0 END

Dieter

Re: How to update the duplicate records in Teradata

What should be the UPDATE statement in case of below requirement.

Requirement - if there are 2 perfect duplicate rows in a table having the same data in all the columns then i need to update column1 of one row only. 

Enthusiast

Re: How to update the duplicate records in Teradata

You can use row_number to assign different numbers to each perfectly duplicate rows and then update 1 columns of 1 row by putting filter condition like row_number = 1