I have a database with many tables (around 60-70), and roughly half of them, or slightly more, contain a small number of duplicate rows (between 0% and 3% duplicate rows per table on average). I know the ideal is 0% duplicate rows, but I was wondering whether there is a rule of thumb or industry standard for an acceptable amount of duplicate rows in a table.
Would 2% duplicate rows be considered a big problem, or would this be somewhat acceptable in most cases in the industry?
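For reference, this is roughly how I measure the duplicate percentage per table. A minimal sketch, assuming a hypothetical table t with columns col1, col2, col3 (in practice you would list every column of the table):

```sql
-- Percentage of rows that are exact duplicates of an earlier row:
-- group by all columns, count the "extra" copies, divide by total rows.
SELECT
    CAST(SUM(dup_cnt - 1) AS DECIMAL(18,4)) * 100
        / SUM(dup_cnt) AS dup_row_pct
FROM (
    SELECT col1, col2, col3, COUNT(*) AS dup_cnt
    FROM t
    GROUP BY col1, col2, col3
) AS grouped;
```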
I agree with Dieter; however, in some cases (I have seen them in the telco industry) dups are functionally required. The number of acceptable dups in such a case depends on the number of hash collisions you may have on your PI: if the number of hash collisions on your PI is less than 100, then I would accept a maximum of 100 dups in the MULTISET table.
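To get a feel for the collision count, a query along these lines can be used. A sketch assuming a hypothetical MULTISET table t with PI column pi_col; HASHROW is Teradata's function returning the row hash for the given value(s). Note that a count above 1 covers both duplicate PI values and hash synonyms (distinct values hashing to the same row hash):

```sql
-- Rows per PI row hash; anything above 1 is either a duplicate
-- PI value or a hash synonym colliding on the same row hash.
SELECT HASHROW(pi_col) AS row_hash, COUNT(*) AS rows_per_hash
FROM t
GROUP BY 1
HAVING COUNT(*) > 1
ORDER BY rows_per_hash DESC;
```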