SET vs MULTISET Table with UPI/USI

Database
Enthusiast

SET vs MULTISET Table with UPI/USI

I understand that a SET table has an overhead of checking for duplicate rows while data is being inserted into the table. However, if the table has a UPI/USI defined on it, only the UPI/USI columns are checked for duplicates and not duplicate row checks. Is this understanding correct ?

 

Also, can anyone please explain if a SET table with a UPI/USI is functionally any different from a MULTISET table with a UPI/USI  ? Logically speaking both cases should be functionally similar since only UPI/USI checks will be performed and not duplicate row checks. In other words, will both cases exhibit the same functionality ? Will there be any difference in performance during data loading ?

 

Thanks.

1 REPLY
Enthusiast

Re: SET vs MULTISET Table with UPI/USI

If you have UPI or USI on a SET table it will still continue to do a Duplicate row Check, it will not stop with UPI or USI check alone.

Please go over the below cases which show cases subtle difference in the functionality.

CREATE SET TABLE SANDBOX.TEST_SET1
( C1 INT
, C2 INT
, C3 INT
) UNIQUE PRIMARY INDEX ( C1);

CREATE SET TABLE SANDBOX.TEST_SET2
( C1 INT
, C2 INT
, C3 INT
) PRIMARY INDEX ( C1)
UNIQUE INDEX (C2);

CREATE MULTISET TABLE SANDBOX.TEST_MSET1
( C1 INT
, C2 INT
, C3 INT
) UNIQUE PRIMARY INDEX ( C1);

CREATE MULTISET TABLE SANDBOX.TEST_MSET2
( C1 INT
, C2 INT
, C3 INT
) PRIMARY INDEX ( C1)
UNIQUE INDEX (C2);

CREATE MULTISET TABLE SANDBOX.TEST_SRC
( C1 INT
, C2 INT
, C3 INT
) PRIMARY INDEX ( C1);

INSERT INTO SANDBOX.TEST_SRC ( 1,2,3) ;
INSERT INTO SANDBOX.TEST_SRC ( 2,3,4) ;
INSERT INTO SANDBOX.TEST_SRC ( 5,6,7) ;

INSERT INTO SANDBOX.TEST_SET1
SEL * FROM SANDBOX.TEST_SRC;
-- 3 Rows Inserted sucessfully

INSERT INTO SANDBOX.TEST_SET2
SEL * FROM SANDBOX.TEST_SRC;
-- 3 Rows Inserted sucessfully

INSERT INTO SANDBOX.TEST_MSET1
SEL * FROM SANDBOX.TEST_SRC;
-- 3 Rows Inserted sucessfully

INSERT INTO SANDBOX.TEST_MSET2
SEL * FROM SANDBOX.TEST_SRC;
-- 3 Rows Inserted sucessfully

-- Loading Src table with more data,
SEL COUNT (*) FROM SANDBOX.TEST_SRC; -- 3 Rows
INSERT INTO SANDBOX.TEST_SRC ( 8,9,10) ;
INSERT INTO SANDBOX.TEST_SRC ( 11,12,13) ;
INSERT INTO SANDBOX.TEST_SRC ( 14,15,16) ;
INSERT INTO SANDBOX.TEST_SRC ( 17,18,19) ;
SEL COUNT (*) FROM SANDBOX.TEST_SRC ; -- 7 Rows
-- Now trying Duplicates

INSERT INTO SANDBOX.TEST_SET1
SEL * FROM SANDBOX.TEST_SRC;

-- 4 Rows Processed and remaining 3 rows were silently discarded
-- This happens since there is a whole row duplication, had there been a value change in other attributes (C2 or C3 ) the transaction would have errored for USI or PK

SEL COUNT (*) FROM SANDBOX.TEST_SET1; -- 7

INSERT INTO SANDBOX.TEST_SET2
SEL * FROM SANDBOX.TEST_SRC;

-- 4 Rows Processed and remaining 3 rows were silently discarded
-- This happens since there is a whole row duplication, had there been a value change in other attributes (C1 or C3 ) the transaction would have errored for USI or PK

SEL COUNT (*) FROM SANDBOX.TEST_SET2; -- 7

INSERT INTO SANDBOX.TEST_MSET1
SEL * FROM SANDBOX.TEST_SRC;
--- Errors out due to Duplicate Primary Key Error
-- No rows Inserted

SEL COUNT (*) FROM SANDBOX.TEST_MSET1; -- 3

INSERT INTO SANDBOX.TEST_MSET2
SEL * FROM SANDBOX.TEST_SRC;

-- Secondary Index uniqueness Violation
-- No Rows Inserted

SEL COUNT (*) FROM SANDBOX.TEST_MSET2; -- 3