I want a query that deletes duplicate rows (retaining any one row per duplicate group) from a Teradata table, without creating a volatile table or writing complex logic. I would appreciate your input on this. In Oracle we can do this with ROWID; is there a way in Teradata to get at something similar, such as the primary index or the row hash value?
I want a statement along these lines (pseudocode): DELETE FROM my.table WHERE primary-key-columns-or-all-column-list NOT IN (select one row per duplicate group from my.table)
Query to find Duplicates:
Select primary-key-columns-or-all-column-list, count(*) From my.table Group by primary-key-columns-or-all-column-list Having count(*) > 1;
Possible solutions in Teradata:
Do an INSERT-SELECT into another table, using GROUP BY on all columns (or EXCEPT logic) to collapse the duplicates. Alternatively, load the data into a SET table instead of a MULTISET table, so that duplicate rows are not permitted in the future.
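The INSERT-SELECT approach could be sketched roughly as follows. The names my_db.t1, my_db.t1_dedup, col_a and col_b are hypothetical placeholders, not from the original post; the real column list would be every column of the table (or the key columns plus aggregates for the rest).

```sql
-- Sketch only; my_db.t1, my_db.t1_dedup, col_a and col_b are assumed names.
-- Create an empty copy of the source table's definition.
CREATE TABLE my_db.t1_dedup AS my_db.t1 WITH NO DATA;

-- Grouping by every column collapses each set of identical rows to one row.
INSERT INTO my_db.t1_dedup (col_a, col_b)
SELECT col_a, col_b
FROM my_db.t1
GROUP BY col_a, col_b;
```

Listing the columns explicitly keeps the statement valid if the table later gains columns; the GROUP BY list then has to be extended to match.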
create set table t2 as t1 with data;
Here t1 is a MULTISET table that contains duplicate rows. Because t2 is created as a SET table, which does not store duplicate rows, this statement produces a copy of t1 with the duplicates removed.
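If the cleaned rows need to end up back in t1, one possible follow-up is sketched below. It assumes t2 has already been created as shown above, that col_a and col_b (hypothetical names) stand for all columns of the table, and that nothing else writes to t1 in the meantime. Check that the first query returns no rows before running the rest.

```sql
-- Sketch only; col_a and col_b are assumed column names.
-- 1. Confirm that t2 has no duplicate rows left (should return zero rows).
SELECT col_a, col_b, COUNT(*)
FROM t2
GROUP BY col_a, col_b
HAVING COUNT(*) > 1;

-- 2. Empty the original table and reload it from the de-duplicated copy.
DELETE FROM t1 ALL;
INSERT INTO t1 SELECT * FROM t2;

-- 3. Remove the helper table once t1 has been verified.
DROP TABLE t2;
```

Since step 2 empties t1 before reloading it, it is safer to run this inside an explicit transaction or to keep t2 until the reload has been checked.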