Hash Index and Single Table Join Index


Hash Index and Single Table Join Index


There may be a discussion already in this forum about Hash and Join Index.

Still I am creating this post because I want to clarify few queries for my own understanding.

I am creating a HASH INDEX: here the PI of the table is (emp_id)

CREATE HASH INDEX emp_hash_indx (emp_id, emp_dept_id, emp_name)

ON tduser.employee_sample

BY (emp_dept_id)

ORDER BY HASH (emp_dept_id) ;

1. Does HASH INDEX create a sub table like SI?

2. Why HASH INDEX is used, I mean -:

   In case of Single Table JOIN INDEX on emp_dept_id with the above columns, Explain shows the similar process method.

   So why shall any one use HI instead of Single table JI?

3. Can we create compressed HASH INDEX?

4. When I am running the HELP :  I can see one extra Unique column.

    What is this column? This column is not present in Single table JI.

   HELP HASH INDEX  emp_hash_indx;

Please let me know these answers as they are really bugging me. :-)




Re: Hash Index and Single Table Join Index

Hi Shantanu,

This is what I conceive:

Hash Index has subtable

Hash index and Single table join indexes can be row-compressed, value-ordered. I am not pretty sure if row compression can be done if UDT is available

Column-partitioned single table join indexes can neither  be row-compressed nor value ordering.

Hash index cannot have PPI whereas STJI can.

Hash Index cannot be column-partitioned, however, in case of STJI it can be done.




Re: Hash Index and Single Table Join Index

Thanks Raja .. for your reply. It is really helpful.


Re: Hash Index and Single Table Join Index

1. What does the complete 32 bit hash number denote?

2.What is meant by : Table is sorted according to row id in the AMP?

3.While retreiving a row from AMP how do we get the physical address if two rows reslult in same row hash vale?If it is accessed through uniqueness number assigned then how do we get it during retrieval?

4.What happens when we are not taking all the PI columns while retreiving data in Where cluse?How is it hashed?

5.Why we cannot have UPI on table that is partitioned by column not included in PI?

6.How assigning a USI helps to handle partition on columns that is not part of PI?


Re: Hash Index and Single Table Join Index

here we go:

1- PI value is passed to a hashing algorithm and a 32 bit hash number is generated, this number is simply a mathematicaly blended value to be used to assign proper amps.

2- in all normal table, rows are sorted according to the row id. Row id is simly a combination of Row hash + uniqueness value. All the rows are sorted according to this combination on the amp.

3- When you go for retrieval, Index values is also passed to access the rows, so if there are multiple values resulted in hash collsions, then index value is used to get the correct row.

4- When we do not take all the PI columns in retrieval, PI is not used, in order to take benifit of PI access, we must use all the PI columns.

5- When you partition a table other than columns in PI, you can not define UPI on it, it is just a rule, you can define UPI to enforce uniqueness.

6- USI serves two purposes, one is to enforce uniquness, second use is to make single value access within an amp.


Re: Hash Index and Single Table Join Index


For your 4th question -

The extra unique column in hashindex is the actual basetable's rowid.

I believe this exist in Hash index to join with the base table fast  in case if the SQL refer any other columns not included in the hash join index definition.  With the help of the primary index row id it can easily use the actualy table and bit faster than STJI.


For your 5th question -

I too not sure why  to have only NUPI if table is partitioned on other columns.  

And I have one Add on question on this - Why UPI is permitted if we partition on UPI columns. Actually if the Partition column is unique in each partition there exist only one record.  then there is no benfit i think.

Can any one clarify this too.



Re: Hash Index and Single Table Join Index

Answer for 5th Question -

If PPI is not part of PI, We can not have UPI. Because, UPI requires duplicate row check for each and every rows being inserted. To do this check, it has to go to each and every partition as it does not have partition number info.

If PPI is part of PI, We can have UPI. Because duplicate row check will become easy as the partition number can be calculated, which is as part of PI.