Modeling for NUPI and USI combination

Data Modeling

Modeling for NUPI and USI combination

Hi,

I think that for insert intensive I will use NUPI and USI combination. I am not using UPI. But the columns chosen for the UPI of a table are frequently the same columns identified as the PK during the data modeling process. Can you please suggest me?

Thanks and regards,

Raja

4 REPLIES
N/A

Re: Modeling for NUPI and USI combination

Hi Raja,

choosing the PI is mainly based on joins (plus WHERE-conditions, but these can be covered by a secondary index, too)

Only if the PK is actually the most frequent join path you should implement it as UPI.

If you create a NUPI a USI on the PK is only neccessary when you actuall use it in WHERE-conditions. But if it's only because of the logical correctness of you data you don't need it, this should be maintained during load.

Dieter

Re: Modeling for NUPI and USI combination

As UPI are can not be created on the PK column how to maintain the uniqness of the records as per LDM ??

I guess you are trying to create partition  table with PK column not being the partitioning column.

 USI can be defined for partitioned table to maintain the uniquess of the records.

Creating  NUPI and USI on the PK column would help maintain uniqness of the record within a partition.and may also establish the referential integrity

Teradata Employee

Re: Modeling for NUPI and USI combination

Using a USI to ensure uniqueness is not neccesarily the way to go.  Suppose your model has a PK specified that will give uniqueness, but your actual table has rows that are end dated so a new version of that row can be inserted.  That requires no unique index on the PK. Also, Soft Referential Integrity (Soft RI) is an alternative.

Teradata Employee

Re: Modeling for NUPI and USI combination

When choosing your PI, as Dieter has stated, your primary goal is to optimize access moving forward, keeping in mind both its role in data distribution across the platform and how it may impact load processing.  UPI's on PK's are fairly common on other database platforms but are often poor choices on Teradata since any access by PI would require all columns.  NUPI's are usually the best possible choice, using a column that will be used frequently in access and join processing.  When moving from an LDM to a PDM you should look within your subject area to find shared identifiers across your various tables (like an Account ID) that is a column within the PK of most/all of the tables.  Choosing a column like this means that all joins between tables in that SA will be AMP local (no row redistribution required).  And if you are worried about minor skewing on a table or two remember that when you go to join the tables the data will first be redistributed by that column that caused the skewing (i.e. your spool is skewed) every time, so measure the cost in lost table space due to the skewing against the cost of spook skewing every time a join is performed.

Specifying a USI on an "insert intensive table" may be problematic as secondary indexes immediately eliminate your ability to use some load utilities.  They must also be maintained as a part of the insert, which will impact how long that may take.  As Lee suggests, if the only goal of the USI is to enforce the PK then there are likely better alternatives, including soft RI and enforcing it as a part of the load process.  I almost never suggest, let alone specify a secondary index until I have data and access demographics.  Otherwise it's just a best guess as to whether the Optimizer will ever choose to use it.  Often times it's best to see how the table performs with the PI only and then look to optimize performance with Secondary and/or Join indexes as necessary.