We have a SET table with NUPI defined on it.I understand this is bad, as it needs to check for the duplicates by scannning the entire row set while data is loaded into the table.
The column which is defined as an NUPI is an id column which is sequencly generated, so no way this is going to be duplicated.
We are going to make this as UPI, so that it is righlty distributed across all AMPS.
Apart from the id column we do not have any other column which is to be unique...
Weare performing some performance tweaks, Can we define multi column NUPI on the table? Is this advisable? If yes how to decide which column would be apt to be selected for NUPI?
Also is it possible to define both UPI and NUPI for the table??
Appreciate your comments.
this is not bad unless there are too many rows per PI value. Duplicate row checks will be done only for roews with the same PI (in fact, the same Row Hash).
If this sequence column is not used for joins/where it's totally useless, you'll get a perfectly distributed table with a horrible performance.
Thanks a lot for your response. So you say, the NUPI defined on the id column is ok, as long as there are no too many rows for that NUPI defined correct?
Example below is ok right though its a SET table duplicates values wont be checked unles the ID column was same?
if SEQID is not used it's useless as PI column, in fact you don't need it at all.
What's the actual Primary Key of this table?
I see except the SEQID field there are none other fields which have unique values and hence no primary key.
Except this SEQID being defined as Primary Index, there is no other Index defined on this table.
Is it not suggested to create a PI on the sequence id and a NUSI on -
.This will have a proper disstibution as well as Sec index for access. Of course the queries need to be tested if they use the SI or not and then decide ? Generally what is recomended practice in such situations.
first check the FK columns (used for joins), then the columns used in Where, but defining a useless PI is the last resort.
Every relational table should have a logical Primary Key constraint, simply adding a stupid sequence still doesn't mean there's a real PK and you'll encounter duplicate rows.
And based on John's other thread Non duplicate row selection query he tries to get rid of them :-)