Partitioned primary index

Database
Fan

Partitioned primary index

Hi,

Please let me know, how do we decide on selecting a partitioned primary index.I understand that this is another index created for faster data access. Are there any constraints on selecting PPI with respect to primary index.also what are the cons of PPI ?

Thanks,
6 REPLIES
Enthusiast

Re: Partitioned primary index

Two things come straight to mind:

1) If you table is to use a UPI the PPI must include that column.
2) When a PPI is implemented, in order to avoid Full table scans, you need to limit your SQL on the PPi column.

There is loads of additional information in the Teradata manuals (my guess would be one of the DBA ones).
Also there is a great Teradata training course covering this material.

Good luck.
Enthusiast

Re: Partitioned primary index


> Hash partitioned (that is, distributed) to the AMPs by the hash value of the primary index columns
> Ordered by the hash value of the primary index columns on each AMP

> Hash partitioned to the AMPs by the hash of the primary index columns
> Partitioned on some set of columns on each AMP
> Ordered by the hash of the primary index columns within that partition

• Uses partition elimination to improve the efficiency of range searches when, for example,
the searches are range partitioned
• Provides an access path to the rows in the base table while still providing efficient join
strategies

While a table with a properly defined PPI will allow overall improvement in query
performance, certain individual workloads involving the table, such as primary index
selects, where the partition column criteria is not provided in the WHERE clause, may
become slower.
• There are potential cost increases for certain operations, such as empty table insert-selects.
• You must carefully implement the partitioning environment to gain maximum benefit.
Benefits that are the result of using PPI will vary based on:
• The number of partitions defined
• The number of partitions that can be eliminated given the query workloads, and
• Whether or not you follow an update strategy that takes advantage of partitioning.

Hope this help for now. Pls refer Teradata Manuals for complete info.
Fan

Re: Partitioned primary index

Can we add partitions to a already populated table using alter table command.The original table doesnot have partitions defined on it.I am getting teradata 3707 error on doing so.

in my case partition key is not part of primary index and primary index is non unique.

Please advice.

Thanks,
Major
Enthusiast

Re: Partitioned primary index



Can you go into more detail about what you mean here, please?
Enthusiast

Re: Partitioned primary index

Sorry about the comment above- it did not come out as expected with the preview (it dropped the part I quoted) and as a newbie it won't let me edit or delete.

Rob_Analyst_Master, you said "When a PPI is implemented, in order to avoid Full table scans, you need to limit your SQL on the PPi column."

Could you expand on what you mean here? Thanks.
Enthusiast

Re: Partitioned primary index

Generally tables with PPIs established are large tables so avoiding the full table scan will be the reason you have established the Index in the 1st place. In order to do this, when SELECTING data from the table (or for that matter any time you access the table), in order to take advantage of the PPI you must utilize the PPI's columns within your WHERE/JOIN criteria otherwise a full table scan will be required.

For example:

CREATE TABLE tblA (colA decimal(18,0), colB date, colC integer) unique primary index (colA, colB)

PARTITION BY RANGE_N(colB BETWEEN DATE '2006-01-01' AND DATE '2009-12-31' EACH INTERVAL '1' DAY , NO RANGE);

QUERY EXAMPLE 1 (NO PPI USAGE)

SELECT * FROM tblA WHERE colC = 1;

The Explain details the following in its chosen plan:

1) First, we lock tblA for access.

2) Next, we do an all-AMPs RETRIEVE step from tblA by way

of an all-rows scan with a condition of ("tblA.colC =

1") into Spool 1 (group_amps), which is built locally on the AMPs.

The size of Spool 1 is estimated with no confidence to be 52 rows.

The estimated time for this step is 0.02 seconds.

3) Finally, we send out an END TRANSACTION step to all AMPs involved

in processing the request.

-> The contents of Spool 1 are sent back to the user as the result of

statement 1. The total estimated time is 0.02 seconds.

QUERY EXAMPLE 2 (USAGE OF PPI):

SELECT * FROM tblA WHERE colB =1080101 AND colC = 1;

1) First, we lock tblA for access.

2) Next, we do an all-AMPs RETRIEVE step from a single partition of

tblA with a condition of ("tblA.colB = DATE '2008-01-01'") with a residual condition of (

"(tblA.colC = 1) AND (tblA.colB = DATE '2008-01-01')") into Spool 1 (all_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with low confidence to

be 512 rows. The estimated time for this step is 0.02 seconds.

3) Finally, we send out an END TRANSACTION step to all AMPs involved

in processing the request.

-> The contents of Spool 1 are sent back to the user as the result of

statement 1. The total estimated time is 0.02 seconds.

In both explain outputs, please look at step 2 - the 1st example states it will need to perform a FULL TABLE SCAN whilst the second example (restricting the partitions required to find the result), it tells you the number of PARTITIONS it will need to access in order to find your data values.

Full information on PPIs is available from the manuals and the numerous Teradata training courses/manuals out there.

Good luck.