Say Yes to No Primary Index (No PI) Tables

Database
The Teradata Database channel includes discussions around advanced Teradata features such as high-performance parallel database technology, the optimizer, mixed workload management solutions, and other related technologies.
Teradata Employee

Say Yes to No Primary Index (No PI) Tables

The purpose of the new Teradata 13.0 feature that allows you to create tables with no primary index is improved performance of FastLoad and Teradata Parallel Data Pump Array INSERT data loading operations.

Because there is no primary index for the rows of a NoPI table, its rows are not hashed to an AMP based on their primary index value. Instead, Teradata Database either hashes on the Query ID for a row, or it uses a different algorithm to assign the row to its home AMP once it reaches the AMP onto which it has been FastLoaded. The Teradata Database then generates a RowID for each row in a NoPI table by randomly selecting an arbitrary hash bucket that an AMP owns and using it to generate a RowID.

The result is faster and more efficient data loading.

Uses for NoPI Tables

NoPI tables are particularly useful as staging tables for bulk data loads. When a table has no primary index, its rows can be dispatched to any given AMP arbitrarily, so the system can load data into a staging table faster and more efficiently using FastLoad or Teradata Parallel Data Pump Array INSERT operations.

By storing bulk loaded rows on any arbitrary AMP, the performance impact for both CPU and I/O is reduced significantly. After having been received by Teradata Database, all of the rows can be appended to a NoPI table without being checked for duplicates or needing to be redistributed to their hash-owning AMPs.

Because there is no requirement for such tables to maintain their rows in any particular order, the system need not sort them. The performance advantage realized from NoPI tables is achieved optimally for applications that load data into a staging table, which must first undergo a conversion to some other form, and then be redistributed before they are stored in a secondary staging table.

Using a NoPI table as a staging table for such applications avoids the row redistribution and sorting required for primary-indexed staging tables. Another advantage of NoPI tables is that you can quickly load data into them and be finished with the acquisition phase of the utility operation, which frees client resources for other applications.

NoPI tables are also useful as so-called sandbox tables when an appropriate primary index has not yet been defined for the primary-indexed table they will eventually populate. This use of a NoPI table enables you to experiment with several different primary index possibilities before deciding on the most optimal choice for your particular application workloads.

Create Table with NoPI Syntax and Default Settings

The syntax for the CREATE TABLE statement has been changed to permit user data tables to be created without a primary index. Such tables are referred to as NoPI (No Primary Index) Tables. A NoPI table can be created in several different ways:

• When the newly supported NO PRIMARY INDEX clause is explicitly specified in the CREATE TABLE statement as shown in the example below, table Sales_Temp1 will be created as a NoPI table.

CREATE MULTISET TABLE Sales_Temp1, FALLBACK,

NO BEFORE JOURNAL,

NO AFTER JOURNAL,

CHECKSUM = DEFAULT

(

ItemNbr INTEGER NOT NULL,

SaleDate DATE FORMAT 'MM/DD/YYYY' NOT NULL,

ItemCount INTEGER)

NO PRIMARY INDEX;

• When the newly supported NO PRIMARY INDEX clause is explicitly specified in the CREATE TABLE AS statement as shown in the example below, table Sales_Temp2 will be created as a NoPI table. In this example, table Sales acting as the source table can be a PI table or a NoPI table.

CREATE MULTISET TABLE Sales_Temp2 AS (SELECT * FROM Sales)

WITH DATA NO PRIMARY INDEX;

• When none of PRIMARY INDEX, NO PRIMARY INDEX, PRIMARY KEY and UNIQUE Constraint is specified in the CREATE TABLE statement as shown in the example below, table Sales_Temp3 will be created based on the setting of the new DBSControl General Field, PrimaryIndexDefault. If PrimaryIndexDefault is set to N, Sales_Temp3 will be created as a NoPI table.

CREATE MULTISET TABLE Sales_Temp3, FALLBACK,

NO BEFORE JOURNAL,

NO AFTER JOURNAL,

CHECKSUM = DEFAULT

(

ItemNbr INTEGER NOT NULL,

SaleDate DATE FORMAT 'MM/DD/YYYY' NOT NULL,

ItemCount INTEGER);

• When neither PRIMARY INDEX nor NO PRIMARY INDEX is specified in the CREATE TABLE AS statement as shown in the example below, table Sales_Temp4 will be created based on the setting of PrimaryIndexDefault. If PrimaryIndexDefault is set to N, Sales_Temp4 will be created as a NoPI table. In this example, table Sales acting as the source table can be a PI table or a NoPI table.

CREATE MULTISET TABLE Sales_Temp4 AS (SELECT * FROM Sales) WITH DATA;

The new DBSControl General Field, PrimaryIndexDefault, was introduced to control whether a table is created with or without a primary index when none of PRIMARY INDEX, NO PRIMARY INDEX, PRIMARY KEY and UNIQUE Constraint is specified. The setting of PrimaryIndexDefault is as follows:

D - This is the default setting. This setting works the same as the P setting.

P - The first column in the table will be selected as the non-unique primary index. This setting works the same as that in the past when PRIMARY INDEX was not specified.

N – The table will be created without a primary index (NoPI table).

With the NoPI Table feature, the system default setting essentially remains the same as that in previous Teradata releases where the first column was selected as the non-unique primary index when the user did not specify a PRIMARY INDEX or a PRIMARY KEY or a UNIQUE Constraint. Users can change the default setting for PrimaryIndexDefault to P or N and not rely on the system default setting which can possibly be changed in a future release.

To change from a NoPI table to PI table, users can run one of the following:

• CREATE TABLE AS - The example shown below will create table Sales_Temp_PI with primary index ItemNbr and all of the data from NoPI table Sales_Temp1.

CREATE MULTISET TABLE Sales_Temp_PI AS (SELECT * FROM Sales_Temp1) WITH DATA PRIMARY INDEX (ItemNbr);

• CREATE TABLE and INSERT-SELECT - The example shown below will create table Sales_Temp_PI with primary index ItemNbr and then copy over all of the data from NoPI table Sales_Temp1.

CREATE MULTISET TABLE Sales_Temp_PI, FALLBACK,

NO BEFORE JOURNAL,

NO AFTER JOURNAL,

CHECKSUM = DEFAULT

(

ItemNbr INTEGER NOT NULL,

SaleDate DATE FORMAT 'MM/DD/YYYY' NOT NULL,

ItemCount INTEGER)

PRIMARY INDEX (ItemNbr);

INSERT INTO Sales_Temp_PI SELECT * FROM Sales_Temp1;

Users can also change from a PI table to a NoPI table using one of the above approaches.

Rules and Limitations for NoPI Tables

The rules and limitations for NoPI tables are the same as those for primary-indexed tables with the following exceptions:

• You cannot create a NoPI queue table.

• You cannot create a NoPI error table.

• You cannot create a NoPI table as a SET table. The default table type for NoPI tables in both Teradata and ANSI session modes is always MULTISET.

• If neither PRIMARY INDEX (column_list) nor NO PRIMARY INDEX is specified explicitly in a CREATE TABLE request, then whether the table is created with or without a primary index generally depends on the setting of the DBS Control flag DisableNoPI

• NoPI tables cannot specify a permanent journal.

• NoPI tables cannot specify an identity column.

• Hash indexes cannot be defined on NoPI tables because they inherit the primary index of their underlying base table, and NoPI tables have no primary index.

• SQL UPDATE triggers cannot update a NoPI table.

• SQL UPDATE and UPDATE (Upsert Form) requests cannot update a NoPI target table. SQL UPDATE and UPDATE (Upsert Form) requests can update a primary-indexed target table from a NoPI source table.

• SQL MERGE requests cannot update or insert into a NoPI target table.

SQL MERGE requests can update or insert into a primary-indexed target table from a NoPI source table.

• You cannot load rows into a NoPI table using the MultiLoad utility.

You can define all of the following commonly used table features for NoPI tables:

• Fallback

• Secondary indexes

• Join indexes

• UNIQUE column constraints

• CHECK constraints

• PRIMARY KEY and FOREIGN KEY constraints

• Triggers

• BLOB and CLOB columns.

You can define any of the following table types as NoPI tables:

• Ordinary base data tables

• Global temporary tables

• Volatile tables

You cannot define any of the following table types as NoPI tables:

• Queue tables

• Error tables

• Global temporary trace tables6

Manipulating NoPI Table Rows

After a NoPI staging table has been populated, you should run one of the following types of DML request to move the NoPI staging table source rows to a primary-indexed target table:

• INSERT … SELECT

• MERGE

• UPDATE … FROM

For example, you can populate data into a NoPI table via single-statement INSERT or multi-statement INSERT. The latter can be done with TPump Array INSERT which is more efficient and also an alternative to FastLoad for staging a table.

The example below shows a single-statement INSERT into NoPI table Sales_Temp1:

INSERT INTO Sales_Temp1 (100, '11/03/2008', 10);

INSERT-SELECT into a NoPI target table is also allowed. The source table can be a PI table or a NoPI table.

INSERT INTO Sales_Temp1 SELECT * FROM Sales;

You can use the following DML statements to manipulate or retrieve NoPI table rows prior to

moving them to their primary-indexed target table:

• DELETE

• INSERT

• SELECT

You cannot use UPDATE or UPDATE (Upsert Form) requests to change data in a NoPI table.

The following is a list of other operations and utilities and how they operate on a NoPI table:

Feature Impact

UPSERT - Not supported. This is a composite of UPDATE and INSERT operations such that if the UPDATE fails because the target row does not exist, the INSERT is automatically executed. The UPDATE part of an UPSERT is required to fully specify the primary index. Since there is no primary index in a NoPI table, UPSERT is not supported.

MERGE-INTO - Not supported when the target table is NoPI. This is currently composed of three variant forms – UPSERT, UPDATE-only and INSERT-only. UPSERT is not supported for the same reason discussed above. UPDATE is restricted in the initial release and therefore the UPDATE-only form of MERGE-INTO is not supported. INSERT-only is also not supported but it is very much like INSERT-SELECT which is supported.

MultiLoad - Not supported. FastLoad and TPump are utilities that are normally used to stage a table.

Restore/Copy Supported. The Restore/Copy processing on a NoPI table is very similar to that on a PI table. The main difference is that a NoPI table normally has one hash bucket per AMP. This is like a very non-unique NUPI table. So when the data from a NoPI table is restored or copied to a different configuration with more AMPs, there may be some AMPs that will not have any data at all. So, Restore/Copy can cause data skewing for a NoPI table.

FastExport - Supported. Data in a NoPI table can be FastExported the same way as for a PI table.

CheckTable - Supported. The LEVEL HASH check which is done with either an explicit LEVEL HASH command or implicitly in a LEVEL THREE command works differently on a PI table and a NoPI table. On a PI table, this check regenerates the rowhash for each data row based on the primary index and then compares with the rows on disk. On a NoPI table, this check looks at the rowhash for each data row and verifies that the hash bucket that is part of the rowhash correctly belongs to the AMP.

Table Rebuild - Supported. Table Rebuild processing on NoPI table is the same as that on a PI table. The table has to have fallback protection for Table Rebuild to run. Rows in a NoPI table have rowhash and therefore can be rebuilt the same way as those on a PI table.

Reconfig  - Supported. The Reconfig processing on a NoPI table is very similar to that on a PI table. The main difference is that a NoPI table normally has one hash bucket per AMP. This is like a very skewed non-unique NUPI table. So when a system is reconfigured to have more AMPs, there may be some AMPs that will not have any data at all for a NoPI table. So, Reconfig can cause data skewing for NoPI table.

Additional Information on No PI Tables

Refer to the following documention and Teradata Database Orange Book for information about, and usage suggestions for, tables with no primary index.:

Teradata Orange Book: No Primary Index (NoPI) Table

Teradata 13.0 Database Design publication, B035-1094-098A, located at:

http://www.info.teradata.com/DataWarehouse/eTeradata-BrowseBy-Results.cfm?pl=&PID=&title=%25&release...

13 REPLIES
Teradata Employee

Re: Say Yes to No Primary Index (No PI) Tables

Smazingo, thanks for your article. It helped me a lot to understand NoPI tables. However, I still have some questions:

1. NoPI tables are distributed randomly among all AMPs. Should I still expect some skew on them? Would it be a low skew or high one?

2. If I create a table with PRIMARY KEY or UNIQUE constraint, but I don't specify PRIMARY INDEX nor NO PRIMARY INDEX, how is the table created? Would it have a PRIMARY KEY/UNIQUE field as PRIMARY INDEX, or it would depend of the value of PrimaryIndexDefault variable?

3. Does transient journal work for NoPI tables? Can I rollback transactions done on NoPI tables?

4. How are Secondary Indexes hashed on NoPI tables? How is data accessed through Secondary Indexes on this kind of tables?

5. When data need to be accessed/modified in a NoPI table, does optimizer performs a full table scan always?

6. Can partitions be created on a NoPI table?

7. Do you know when demo for Teradata 13 will be available to do some tests?
Teradata Employee

Re: Say Yes to No Primary Index (No PI) Tables

in response to comment by Celia. here are the answers. btw, i am checking on a couple of them to make them accurate and will follow up to make changes if necessary. steve

Smazingo, thanks for your article. It helped me a lot to understand NoPI tables. However, I still have some questions:
1. NoPI tables are distributed randomly among all AMPs. Should I still expect some skew on them? Would it be a low skew or high one?
The random generator that is used to determine where to store a set of rows in a NoPI table in the AMP is designed in such as way that data will be balanced out eventually
2. If I create a table with PRIMARY KEY or UNIQUE constraint, but I don't specify PRIMARY INDEX nor NO PRIMARY INDEX, how is the table created? Would it have a PRIMARY KEY/UNIQUE field as PRIMARY INDEX, or it would depend of the value of PrimaryIndexDefault variable?
if you do not specify either an explicit PRIMARY INDEX clause or an explicit NO PRIMARY INDEX clause, Teradata Database creates the table with a unique primary index if, and only if, it is also defined with either a PRIMARY KEY constraint or a UNIQUE column constraint. The system creates the UPI on the column set defined for the PRIMARY KEY or
UNIQUE constraint according to Stages 1 and 2 of the primary index defaults table as documented in the database design document chapter on primary indexes.
Otherwise, if no column set is defined as the PRIMARY KEY for the table, or if no column is defined with a UNIQUE attribute, then the table is created with no primary index when PrimaryIndexDefault is set to N.
3. Does transient journal work for NoPI tables? Can I rollback transactions done on NoPI tables?
Changes will continue to be transient journaled as done today.

4. How are Secondary Indexes hashed on NoPI tables? How is data accessed through Secondary Indexes on this kind of tables?
NoPI tables, because specifying a USI in a request is the only way to access a single row in a NoPI table. Be aware that secondary indexes can slow the loading of rows into a NoPI table using Teradata Parallel Data Load Array INSERTs.
Note, too, that Teradata Database always explicitly changes the SQL create text for a NoPI table that defines any PRIMARY KEY or UNIQUE constraint specifications to explicit unique,secondary index specifications, as you can see if you submit a SHOW TABLE request for such a table.
5. When data need to be accessed/modified in a NoPI table, does optimizer performs a full table scan always?
secondary indexes are allowed so users can avoid a full-table scan by accessing the data in a NoPI table through a secondary index. Execution plans on the table can be viewed by running an EXPLAIN just like on a regular PI table.
6. Can partitions be created on a NoPI table?
> The following rules apply for a NoPI table
– Allowed
– FALLBACK
– Secondary indexes (unique and non-unique)
– Join and reference indexes
– LOB’s
– SELECT, INSERT and DELETE
– INSERT and DELETE trigger actions
– Primary key and foreign key constraints
– Global Temp or Volatile
– Not allowed
– UPDATE and UPDATE trigger actions (will be supported in a later release)
– UPSERT and MERGE-INTO
– MultiLoad
– Queue table
– Error table
– Permanent journal
– Partitioned primary index
– ID Column
– Hash index
– SET table

7. Do you know when demo for Teradata 13 will be available to do some tests? Yes, I believe august 2009 is the target date.
Teradata Employee

Re: Say Yes to No Primary Index (No PI) Tables

ADDING to question 4 above for Celia. Secondary indexex are hashed on NoPI tables the same way that they are on PI tables. Data is accessed through secondary indexes on NoPI table the same way that it is on PI tables. That can be done because each row in a NoPI table has a system generated hashcode.

Teradata Employee

Re: Say Yes to No Primary Index (No PI) Tables

Thanks Steve.
Teradata Employee

Re: Say Yes to No Primary Index (No PI) Tables

Hmm, how are partitions allowed if sorting of the rows is avoided?
Teradata Employee

Re: Say Yes to No Primary Index (No PI) Tables

Sorry for my last question, I misunderstood something.
Enthusiast

Re: Say Yes to No Primary Index (No PI) Tables

I have a query :- NoPI is Fast as we do not have to Redistribute/sort the data and it save a lot of time during the Fastload , i have a question that in which scenario of staging table we will use NoPI?
The time is saved while loading the staging table but still we need to do the sorting/redistribution of the rows to load the base table according its Primary Index (if we are loading whole data from the staging table to basetable).
I think the time which we saved in the Fastload by eliminating the sorting will be used while redistributing the data while loading the base table.
Please provide your valuable inputs.

Re: Say Yes to No Primary Index (No PI) Tables

Smazingo, Please respond to above question.
Teradata Employee

Re: Say Yes to No Primary Index (No PI) Tables

Hi, what should I do to connect to teradata trough ERwin and retrieve the entire model from there?
Erwin 7.2 does not retrieve any nopi tables.
thanks.