I am working on an application that takes records from Hadoop and inserts them into Teradata via Sqoop (JDBC).
I am using TERA mode for the connection, and the target table is a SET table. I am running into duplicate row issues for some datasets.
As far as I know, TERA mode is supposed to ignore duplicate records when doing inserts. Can somebody please confirm this behavior?
It has nothing to do with the transaction mode (TERA/ANSI). It's all about the table definition (SET/MULTISET).
soumyajit, you were misinformed. The Teradata Database never "ignores" duplicate records when doing inserts, no matter which transaction mode (ANSI vs TERA) you are using.
As Carlos said, the table attribute (SET vs MULTISET) governs whether duplicate rows are permitted in a table. With a MULTISET table, you are permitted to insert duplicate rows using either ANSI or TERA mode.
In contrast, with a SET table, if you try to insert duplicate rows, you will get Teradata Database error 2802 "Duplicate row error in <table>" using either ANSI or TERA mode. In other words, the transaction mode does not change this behavior.
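The rule described above can be pictured with a small toy model. This is only a sketch in plain Python, not Teradata or Sqoop code: `ToyTable`, `DuplicateRowError`, and `try_insert` are hypothetical names made up for illustration, and the model covers only row-by-row inserts of the kind discussed in this thread. The point it shows is that the duplicate check looks at the table kind and never at the transaction mode.

```python
class DuplicateRowError(Exception):
    """Stands in for Teradata error 2802 in this toy model (hypothetical class)."""
    code = 2802


class ToyTable:
    """In-memory stand-in for a table; not a real database object."""

    def __init__(self, name, kind):
        if kind not in ("SET", "MULTISET"):
            raise ValueError("kind must be 'SET' or 'MULTISET'")
        self.name = name
        self.kind = kind
        self.rows = []

    def insert(self, row, transaction_mode="TERA"):
        # transaction_mode is accepted but deliberately unused:
        # only the table kind decides whether a duplicate is rejected.
        row = tuple(row)
        if self.kind == "SET" and row in self.rows:
            raise DuplicateRowError(f"Duplicate row error in {self.name}")
        self.rows.append(row)


def try_insert(table, row, mode):
    """Return 'ok' on success, or the error code if the row is rejected."""
    try:
        table.insert(row, mode)
        return "ok"
    except DuplicateRowError as exc:
        return exc.code


# Insert the same row twice for every combination of table kind and mode.
results = {}
for kind in ("SET", "MULTISET"):
    for mode in ("TERA", "ANSI"):
        table = ToyTable(f"demo_{kind.lower()}", kind)
        first = try_insert(table, (1, "a"), mode)
        second = try_insert(table, (1, "a"), mode)
        results[(kind, mode)] = (first, second)

for key, outcome in results.items():
    print(key, outcome)
```

In this model the second insert is rejected for both SET cases and accepted for both MULTISET cases, whichever mode is passed in. In practice that means either removing duplicates on the Hadoop side before the export or loading into a MULTISET table if duplicates are acceptable.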