We have a set table with NUPI and is partitioned with date. My question is when we try inserting a new record based on row hash it will perform duplicate checking or it will go to particular partition and perform duplicate check?
Ex: in one AMP
Primary Index(ex: ID1): 10
Partition 1 with date 2015-01-01: 100 rows
partition 2 with date 2015-01-02: 200 rows
Now if i try to Insert ID1: 10 and date 2015-01-02 then it will perform duplicate check on entire 300 rows or only second partition(i.e 200 rows) based on PPI column?
can somebody please clarify on this and possible point me to the TD document as well?
For two rows to be exact duplicates, they would have to have the same partitioning column(s) values too. That means optimizer would need to check rows with the same rowhash in only one partition.
Thanks for your response.
So Based on PI(ID1: 10) row hash will be generated and it know which amp it has to go,After identifying the Amp according to your answer it will go to particular partition based on partition column and then start comparing the row hash within the partition to generate a uniqueness value?
Step1: row existed with 1234 is row hash and 0002 is unique ness value in partition 1
step 2: row existed with 1234 is row hash and 0001 is unique ness value in partition 2
Now new row comes with 1234 row hash and with partition column of "partiton 2" then it only checks partition2 and it will generate uniqueness value as 0002 again(already existed in partition 1)? partition number,row_id will make the row to be unique ,this is what my understanding. Please correct me in case of any deviation
Yes. The uniqueness values exist just to make ROWKEY unqiue, which is partition+rowhash+unique. So it is possible for the same rowhash value to have same uniqueness value as long as they are in different partitions.