how to delete duplicate records

Training
Enthusiast

how to delete duplicate records

Hi,
Can anyone suggest me on how to delete the duplicate records from a teradata table.I should not create a new similar table,but delete from the existing table only.

thanks,
Nagendhran
11 REPLIES
Enthusiast

Re: how to delete duplicate records

If you're saying you cannot create a new table at all to assist you in removing dups, I don't think there is a way to do that. I would suggest 3 options:

1) Insert/select into a new SET table. Then drop the old table and rename the new one to the old name.

2) Insert/select into a new table grouping by (or using distinct) on all columns. Then drop the old table and rename the new one to the old name.

3) Insert only the dups into a 2nd SET table (by doing a group on all columns with a HAVING COUNT(*) > 1). Then do a delete/join (joining the dups table back to the original table), followed by an insert/select from the dups table into the first (in the same unit of work if you need to). This will at least keep the existing table intact, but the dups will be temporarily deleted before one copy of them is inserted back in.

Enthusiast

Re: how to delete duplicate records

Another option that I thought of after my last post is to export the dup rows (with a GROUP BY on all columns and HAVING COUNT(*) > 1) to a file. Then, using Mload or Tpump to bring the rows back in and do a delete and insert for each row.

With this method, there is no need for a 2nd table and you won't see all of the dups rows disappear from the table at the same time. The re-insert of each dup row will occur immediately after the deletion of the rows.

Enthusiast

Re: how to delete duplicate records

Export the records to a flatfile and then load using fastload. Fastload does not load the duplicate records.
Enthusiast

Re: how to delete duplicate records

Thanks barry,ramakrishna for your suggestions.It really helped me a lot.
rgds.,
Nagendhran

Re: how to delete duplicate records

Here is a good tool to find and delete duplicate files
[url] http://www.duplicateFilesDeleter.com [/url]

Re: how to delete duplicate records

I have tried www.duplicateFilesDeleter.com and it does work.
Enthusiast

Re: how to delete duplicate records

del from yourtablename
where primary index column in
(
sel primary index column from yourtablename
qualify row_number()over(partition by primary index order by some column) > 1
)

.. it will keep one record and deletes all the other. you do not need to create any seperate table for this
Junior Contributor

Re: how to delete duplicate records

Did you notice, that the two previos messages where kind of spam and the original post was four years old?

Btw, your solution will not work, but remove all duplicate rows.

Dieter
Enthusiast

Re: how to delete duplicate records

How to delete duplicate Records from TD table,

here is multistep approach(more of a workaround)

create multiset table delete_duplicates(

col1 integer,

col2 integer,

)

Now insert duplicate rows.

Alter table to add new column rank1(give any name you like);

Now run the following insert and delete statement,

insert into delete_duplicates

select col1,col2,row_number() over(partition by col1,col2 order by col1,col2 )

from delete_duplicates;

delete from delete_duplicates where rank1 is null or rank1>1;