Duplicate entry tracking which FastLoad does not load

Enthusiast

Hi,

I want to know the count of each duplicate row that FastLoad omits while loading data into a Teradata table.

I am interested in the following information:
1. Which rows have duplicates?
2. How many duplicates does each of those rows have?

I am using a MULTISET table with a NUPI defined on one column.

I am not sure whether FastLoad internally uses an update/insert to eliminate duplicates. I am looking for a way to get statistics on the operations performed on a particular row index in a table (assuming information such as the number of scans is available).
I am still not sure whether this is possible at all.

Can someone ease my life by showing me the right pointer?

If the solution is already available and can be shared then it will be a bonus to me. :-)

Thanks 'n' Regards,
3 REPLIES
Enthusiast

Re: Duplicate entry tracking which FastLoad does not load

Hi,
In FastLoad, the duplicate rows will be tracked and loaded into the error table, which contains three columns: error code, error field name, and data parcel.
The count of duplicate rows is displayed in the log of the FastLoad script: how many records were inserted, how many had errors, and how many were duplicates.
If you want more details about this, please check the Teradata documentation.

Enthusiast

Re: Duplicate entry tracking which FastLoad does not load

Actually, FastLoad does not store duplicate rows in error table 2. It only puts violations of a UPI into the second error table. Duplicate rows that are encountered when you have a NUPI on the table are discarded, but a count is kept of how many duplicate rows were encountered. That count is then reported in the output, but you have no way of knowing which rows were duplicates.

As one alternative, you can put all of the columns in your table in the UPI (which may not make sense for other reasons). That will then send a duplicate row to the error table since it violates the UPI.

You can also use a two-step process: first create a MULTISET table and use MultiLoad to load it (MultiLoad does not discard duplicate rows). Then run an aggregation with a count to identify the duplicate rows, and finally do an INSERT/SELECT into a SET table to remove them.
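
For illustration, here is a minimal sketch of the aggregation and de-duplication steps. It is not Teradata code: it uses Python's built-in sqlite3 module as a stand-in so it can be run anywhere, and the table names (stage_tbl, target_tbl), column names, and sample rows are all made up. SQLite has no SET tables, so the final step uses SELECT DISTINCT to imitate what an INSERT/SELECT into a Teradata SET table would do. The GROUP BY ... HAVING COUNT(*) > 1 query itself should carry over to Teradata largely unchanged, with your own column list.

```python
import sqlite3

# Stand-in for the MULTISET staging table that MultiLoad would populate.
# Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stage_tbl (cust_id INTEGER, order_dt TEXT, amount INTEGER)"
)
rows = [
    (1, "2009-01-05", 100),
    (1, "2009-01-05", 100),
    (1, "2009-01-05", 100),
    (2, "2009-01-06", 250),
    (3, "2009-01-07", 75),
    (3, "2009-01-07", 75),
]
conn.executemany("INSERT INTO stage_tbl VALUES (?, ?, ?)", rows)

# Step 1: group on every column; any group with more than one row
# is a duplicated row, and row_cnt is how many copies were loaded.
dup_counts = conn.execute(
    """
    SELECT cust_id, order_dt, amount, COUNT(*) AS row_cnt
    FROM stage_tbl
    GROUP BY cust_id, order_dt, amount
    HAVING COUNT(*) > 1
    ORDER BY cust_id
    """
).fetchall()

# Step 2: copy one row per group into the target table.
# SELECT DISTINCT is used here only because SQLite has no SET tables.
conn.execute(
    "CREATE TABLE target_tbl (cust_id INTEGER, order_dt TEXT, amount INTEGER)"
)
conn.execute(
    "INSERT INTO target_tbl "
    "SELECT DISTINCT cust_id, order_dt, amount FROM stage_tbl"
)
target_count = conn.execute("SELECT COUNT(*) FROM target_tbl").fetchone()[0]

for row in dup_counts:
    print(row)
print("rows in target:", target_count)
```

With this sample data the first query reports two duplicated rows, with 3 and 2 copies respectively, and the target table ends up with 3 distinct rows. Subtract 1 from row_cnt if you want the number of extra copies rather than the total.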

Enthusiast

Re: Duplicate entry tracking which FastLoad does not load

Thanks for the replies. They are informative.

Cheers :-)