Remove Duplicate Based on Max Value

Database
Enthusiast

Remove Duplicate Based on Max Value

Trying to remove a row that has a duplicate primary key, but the other values in the same record value. Want to remove duplicate based on the max value of one column (in this case, want distinct primary key records, with the maximum LOAD_DATE. 

 

Original Data

 

Primary_Key,First_Name,Last_Business_Name,Year,Month,Charge,Tax,LOAD_DATE
450,Paul,Allen,2015,3,164.97,24.3,3/5/2015
451,Eddie,Vedder,2015,3,225.02,14.7,3/12/2015
452,Paul,Stanley,2015,3,110.95,10.46,3/26/2015
452,Paul,Stanley,2015,3,109.85,14.46,3/18/2015
453,Tom,Morello,2015,3,104.83,24.1,3/8/2015

 

Expected Outcome

 

Primary_Key,First_Name,Last_Business_Name,Year,Month,Charge,Tax,LOAD_DATE
450,Paul,Allen,2015,3,164.97,24.3,3/5/2015
451,Eddie,Vedder,2015,3,225.02,14.7,3/12/2015
452,Paul,Stanley,2015,3,110.95,10.46,3/26/2015
453,Tom,Morello,2015,3,104.83,24.1,3/8/2015

 

Any help is appreciated!

2 REPLIES
Teradata Employee

Re: Remove Duplicate Based on Max Value

code to delete records that do not have the last load_date for a primary_key is pretty straight forward

 

delete from table

where (primary_key, load_date) not in

(

select primary_key, max(load_date)

 from table

group by 1

);

 

Re: Remove Duplicate Based on Max Value

delete TEST_TABLE where (primary_key, LOAD_DATE) not in ( select primary_key, max(load_date) from TEST_TABLE group by 1 having count(1) > 1 );