Teradata : Logic for Hash function (hashrow and hashbucket)



Problem statement

I would like to understand the logic (the mathematics) behind the hash functions HASHROW and HASHBUCKET in Teradata.

Teradata SQL :

SELECT SUM(
         HASHBUCKET(HASHROW(ColumnName) (BYTE(4))) / ((HASHBUCKET() + 1) / 65536) * CAST(65536 AS BIGINT)
       + HASHBUCKET(SUBSTR(HASHROW(ColumnName), 3, 2) (BYTE(4))) / ((HASHBUCKET() + 1) / 65536)
       ) AS SumHash
FROM Table;
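
As far as I can tell, the query does not compute the hash itself. It only turns the 4-byte HASHROW value into an unsigned 32-bit integer (high 16 bits times 65536, plus low 16 bits) and sums that over all rows. Below is a minimal Python sketch of that arithmetic only. It rests on assumptions I have not verified against Teradata: that HASHBUCKET() with no argument returns the highest bucket number (65535 or 1048575), that HASHBUCKET of a BYTE(4) value returns its upper 16 or 20 bits, that byte 1 is the most significant byte, and that casting a 2-byte value to BYTE(4) pads zeros on the right. The function names are hypothetical, and the row hash bytes must still come from Teradata's own HASHROW algorithm, which this sketch does not reproduce.

```python
def hash_bucket(row_hash: bytes, bucket_bits: int) -> int:
    """Assumed behaviour of HASHBUCKET(BYTE(4)): the upper `bucket_bits` bits."""
    if len(row_hash) != 4:
        raise ValueError("row hash must be exactly 4 bytes")
    if bucket_bits not in (16, 20):
        raise ValueError("bucket_bits must be 16 or 20")
    # Assumption: byte 1 of the row hash is the most significant byte.
    return int.from_bytes(row_hash, "big") >> (32 - bucket_bits)


def row_hash_as_int(row_hash: bytes, bucket_bits: int = 20) -> int:
    """Mirror the per-row expression inside SUM(...) from the query."""
    max_bucket = (1 << bucket_bits) - 1      # assumed result of HASHBUCKET()
    scale = (max_bucket + 1) // 65536        # 1 for 16-bit, 16 for 20-bit buckets

    # HASHBUCKET(HASHROW(col) (BYTE(4))) / scale -> upper 16 bits of the hash
    high = hash_bucket(row_hash, bucket_bits) // scale

    # SUBSTR(HASHROW(col), 3, 2) (BYTE(4)) -> bytes 3 and 4, zero padded on the right
    low_padded = row_hash[2:4] + b"\x00\x00"
    low = hash_bucket(low_padded, bucket_bits) // scale

    return high * 65536 + low


def sum_hash(row_hashes, bucket_bits: int = 20) -> int:
    """Equivalent of SUM(...) AS SumHash over an iterable of 4-byte row hashes."""
    return sum(row_hash_as_int(h, bucket_bits) for h in row_hashes)
```

If these assumptions hold, the result is the same for 16-bit and 20-bit bucket systems, which would explain the division by (HASHBUCKET()+1)/65536. The hard part of a Hive UDF would then be reproducing HASHROW itself, not this arithmetic.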

What am I trying to achieve?

I am trying to implement hashing in Hive that behaves like Teradata's HASHROW and HASHBUCKET functions. To do that, I would like to develop a UDF that produces the same results as Teradata's hash functions.

One difference is that Teradata has the concept of AMPs, which Hive does not have.

I just need the calculation logic of HASHROW and HASHBUCKET in Teradata, so that I can judge whether it is feasible to develop such a UDF in Hive.

I would appreciate your advice.

