Indexing - How many AMPS are involved in the join

Database
New Member

Indexing - How many AMPS are involved in the join

Hi There,

 

If I have two tables, A& B, both with primary index CUSTOMER_ID. A.Row1 has CUSTOMER_ID 123 and B.Row1 has a CUSTOMER_ID 123. Will both records be stored in the same AMP?

 

Nair

4 REPLIES
rjg
Supporter

Re: Indexing - How many AMPS are involved in the join

I can't think of an instance where they would not based on the info you have provided.

you can run this on each table to demonstrate.

 

SELECT pi_col,HASHAMP(HASHBUCKET(HASHROW(a))) AS AMPNumber
FROM your_table where pi_col = 123; 
rjg
Supporter

Re: Indexing - How many AMPS are involved in the join

Not sure how to fix last post so correct:

 

SyntaxEditor Code Snippet

SELECT pi_col,HASHAMP(HASHBUCKET(HASHROW(pi_col))) AS AMPNumber
FROM your_table where pi_col = 123;
Senior Supporter

Re: Indexing - How many AMPS are involved in the join

Hi Nair,

Whether two rows are stored on the same AMP or not is dependent on a combination of the data value and data type.

 

In your example, if in both tables the CUSTOMER_ID column has the same (or compatible) data type then the two rows will ALWAYS be stored on the same AMP (at least until TD 16.10 but that's a 'maybe' and another story).

 

By 'compatible' data types I mean 'compatible for hashing'. So the same data value in INTEGER and SMALLINT columns will generate the same row hash, therefore they will end up on the smae AMP.

 

However, the same data value in an INTEGER column and a FLOAT column will generate different row hash values and therefore will typically end up on different AMPs.

 

Cheers,

Dave

Ward Analytics Ltd - information in motion
www: http://www.ward-analytics.com

Re: Indexing - How many AMPS are involved in the join

Hello,

The storage is entirely dependent on the hashing function of the teradata. This in turn is governed primarily by the value of that index and the data type. In your case, we already know the data type is same, so if the data type is same , the join would happen on the same amp.

To summarize : "If the data type and value of 2 tables in same, and the join is carried out on the same PI, it would be performed locally on the amp itself."

 

Cheers!