Tpump and Duplicates


Hi all, I just have a question about TPump and the IGNORE DUPLICATES option.

I want to insert rows into a table using TPump, and I have not set IGNORE DUPLICATES.
Now say a row already exists in the table and TPump is about to insert an exact duplicate of that row. Would it still insert the row? Or does it only skip duplicate rows that come from the source?

Thanks for any help
Teradata Employee

Re: Tpump and Duplicates

Assuming that your table is a MULTISET table . . . .

Whether a row counts as a duplicate does not depend on all of the duplicates coming from the source.

Loading duplicates means a row is sent to the table that is an exact duplicate of a row already existing in the table.

Is that what you are looking for?
-- SteveF

Re: Tpump and Duplicates

Thanks feinholz,
I appreciate the quick response. I think I understand what you mean.
The table I want to insert into is indeed a multiset table.

Are you saying that if I have specified IGNORE DUPLICATES, it will still push the duplicate row into the table, even if an exact copy of the row already exists in the table?

Re: Tpump and Duplicates

I have another question as well.

Regarding the speed of TPump: if I try to insert, say, 1-10k rows into the multiset table, it takes around 2 seconds to complete the operation.
However, if I were to insert the rows into a set table, how much would it affect the completion time?
Would it be something like 30 seconds? A minute?
I'm just trying to get my head around different options.

Teradata Employee

Re: Tpump and Duplicates

My first answer was just for the question about what constitutes a duplicate row and whether the duplicate rows would have to come from the source.

According to the TPump manual (please look up the information about IGNORE DUPLICATE):

"A row is a duplicate row if all column values in the row are the exact duplicate of another row. Duplicate row checking is bypassed if the table is a multiset table (which allows duplicate rows), or if the table has one or more unique indexes (the uniqueness test(s) make any duplicate row check unnecessary); in these cases, IGNORE DUPLICATE ROWS has no effect. Any uniqueness violations will result in the offending rows going to the error table."

Because your table is a multiset table, it looks like the duplicate row check is bypassed, so IGNORE DUPLICATE has no effect even if you specify it.
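
To make the rule in the quoted passage concrete, here is a small Python sketch of the decision logic as I read it. It is only a toy model, not TPump or any Teradata API: the names `ToyTable` and `insert_row` are made up for illustration. It also assumes that on a set table without IGNORE DUPLICATE a duplicate row is recorded in the error table, and that with it the row is silently dropped; the quoted text does not spell that part out.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Hypothetical stand-in for a target table (not a Teradata API)."""
    multiset: bool                      # True = MULTISET, False = SET
    unique_cols: tuple = ()             # column positions of a unique index, if any
    rows: list = field(default_factory=list)
    errors: list = field(default_factory=list)   # stand-in for the error table


def insert_row(table, row, ignore_duplicates=False):
    """Return 'inserted', 'ignored', or 'error_table' for one incoming row."""
    if table.unique_cols:
        # Unique index: the uniqueness test replaces the duplicate row check,
        # so ignore_duplicates has no effect. Violations go to the error table.
        key = tuple(row[i] for i in table.unique_cols)
        for existing in table.rows:
            if tuple(existing[i] for i in table.unique_cols) == key:
                table.errors.append(row)
                return "error_table"
        table.rows.append(row)
        return "inserted"

    if table.multiset:
        # Multiset table: duplicate row check is bypassed, so an exact copy of
        # an existing row is inserted whether or not ignore_duplicates is set.
        table.rows.append(row)
        return "inserted"

    # Set table without a unique index: the duplicate row check applies.
    if row in table.rows:
        if ignore_duplicates:
            return "ignored"            # assumption: dropped without an error record
        table.errors.append(row)        # assumption: recorded in the error table
        return "error_table"
    table.rows.append(row)
    return "inserted"


# Usage: the same row sent twice to a multiset table and to a set table.
multiset_table = ToyTable(multiset=True)
insert_row(multiset_table, (1, "a"), ignore_duplicates=True)
insert_row(multiset_table, (1, "a"), ignore_duplicates=True)
print(len(multiset_table.rows))         # 2: both copies are stored

set_table = ToyTable(multiset=False)
insert_row(set_table, (1, "a"), ignore_duplicates=True)
print(insert_row(set_table, (1, "a"), ignore_duplicates=True))   # ignored
```

Note that in this sketch it makes no difference whether the first copy of the row was already in the table or arrived earlier from the same source; the check only compares the incoming row with what the table holds at that moment.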

As for performance, I do not have specific performance knowledge for TPump. There are a lot of factors that affect performance.
-- SteveF

Re: Tpump and Duplicates

Thanks feinholz.
It's much appreciated