Hi all, I just have a question about TPump and the IGNORE DUPLICATE ROWS option.
I want to insert rows into a table using TPump, and I have not set IGNORE DUPLICATE ROWS. Now say a row already exists in the table and TPump is about to insert an exact duplicate of it. Would it still insert that row? Or does it only skip duplicate rows that come from the source?
Regarding the speed of TPump: if I insert, say, 1-10k rows into the multiset table, the operation takes around 2 seconds. If I were to insert the same rows into a set table instead, how much would that affect the completion time? Would it be something like 30 seconds? A minute? I'm just trying to get my head around the different options.
My first answer only addressed the question of what constitutes a duplicate row and whether the duplicate rows would have to come from the source.
According to the TPump manual (see the section on IGNORE DUPLICATE ROWS):
"A row is a duplicate row if all column values in the row are the exact duplicate of another row. Duplicate row checking is bypassed if the table is a multiset table (which allows duplicate rows), or if the table has one or more unique indexes (the uniqueness test(s) make any duplicate row check unnecessary); in these cases, IGNORE DUPLICATE ROWS has no effect. Any uniqueness violations will result in the offending rows going to the error table."
Because your table is a multiset table, it looks like the duplicate row check is bypassed, so specifying IGNORE DUPLICATE ROWS would have no effect.
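To make the quoted rule easier to reason about, here is a small Python sketch of it. This is not TPump code and not a Teradata API: `TableInfo` and `duplicate_row_check_applies` are names I made up for illustration, and the logic only encodes the sentence from the manual quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableInfo:
    """Hypothetical description of a target table (not a Teradata object)."""
    is_multiset: bool          # True for a MULTISET table, False for a SET table
    unique_index_count: int    # number of unique indexes defined on the table


def duplicate_row_check_applies(table: TableInfo) -> bool:
    """Return True when, per the quoted manual text, duplicate row checking
    is NOT bypassed, i.e. when IGNORE DUPLICATE ROWS can have an effect.

    Assumption: the check is bypassed if the table is multiset OR has one
    or more unique indexes, exactly as the quote states.
    """
    if table.is_multiset:
        return False               # multiset tables allow duplicate rows
    if table.unique_index_count >= 1:
        return False               # uniqueness tests make the check unnecessary
    return True                    # set table with no unique index


# The three cases described in the quote.
multiset_table = TableInfo(is_multiset=True, unique_index_count=0)
set_table_with_unique_index = TableInfo(is_multiset=False, unique_index_count=1)
plain_set_table = TableInfo(is_multiset=False, unique_index_count=0)

for name, table in [
    ("multiset", multiset_table),
    ("set + unique index", set_table_with_unique_index),
    ("set, no unique index", plain_set_table),
]:
    print(name, "->", duplicate_row_check_applies(table))
```

Under that reading, only the last case (a set table with no unique index) is one where IGNORE DUPLICATE ROWS changes anything.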
As for performance, I don't have specific performance figures for TPump, and a lot of factors affect it, so I can't give you a reliable estimate.