We have a SCD 2 table PARTY_ID_HIST. It has a surrogate key named Party_sk based on the source system primary key party_id_num. Due to some code issue the table has been populated repeating values and values pre-fixed with 0 for column party_id_num. Due to this there are new surrogate key values generated for column party_sk.
I have been told to remove duplicate entries. There are two challenges:
1. I dont to delete duplicate entries while maintaining historical record of SCD 2. That means if there are new surrogate keys generated due to change in some other attributes, those records should not be removed.
2. What to do with wrong Party_sk column which have been populated in other tables?
First thing is job security. Take back up always.
You have not told about the SCD 2 , whether it is flag, version, start date end date or a combination of these. Sample data ??????
Once you have a backup, you can do whatever you want.