Supporting SHA-2 hashing using a Java Table Operator

Overview

This article provides an example of implementing a Java table operator. The specific use case is support for the SHA-2 family of hash functions. For more details on SHA-2, see https://en.wikipedia.org/wiki/SHA-2

A Java table operator was chosen as the implementation mechanism for SHA-2 for the following reasons:

    1. The default software installation on the Teradata nodes does not include the Perl or Python support libraries required for SHA-2. Therefore the script table operator is not a good implementation mechanism.
    2. Hive implemented a Java scalar UDF to support SHA-2 with the following interface: string sha2(string/binary, int). The scalar UDF requires a per-row context, and Teradata scalar UDFs do not currently support this capability. Therefore scalar UDFs are not a good implementation mechanism. For more Hive details see https://issues.apache.org/jira/browse/HIVE-10644
    3. The default software installation on the Teradata nodes does not include the C/C++ header files required to use the libcrypto library. Therefore C/C++ is not a good implementation mechanism.

 

Therefore the selected implementation mechanism is a Java table operator with an interface similar to Hive's scalar UDF. The table operator uses the same standard library class as Hive, java.security.MessageDigest.
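To make the role of java.security.MessageDigest concrete, here is a minimal standalone sketch of hashing a byte array with a SHA-2 algorithm and rendering the digest as uppercase hex. The class name Sha2DigestSketch and the helper sha2Hex are hypothetical names used only for illustration; they are not part of the attached operator source. The hex conversion is done by hand so the sketch does not depend on any class outside the standard java.security and java.lang packages.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration class, not part of the attached sha2.java.
class Sha2DigestSketch {

    /** Returns the uppercase hex SHA-2 digest of the input; bits is 256, 384, or 512. */
    static String sha2Hex(byte[] input, int bits) throws NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-" + bits);
        byte[] hash = digest.digest(input);
        StringBuilder hex = new StringBuilder(hash.length * 2);
        for (byte b : hash) {
            // %02X formats a byte as two uppercase hex digits.
            hex.append(String.format("%02X", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        byte[] input = "ABC".getBytes(StandardCharsets.US_ASCII);
        System.out.println(sha2Hex(input, 256));
    }
}
```

Run with the input 'ABC' and a length of 256, this should print the same value as the Shahash column in this article's example result set.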

Example usage and interface:

CREATE TABLE t1 (
pkey INTEGER
,mystr VARCHAR(1024)
,fill1 INTEGER
,fill2 BIGINT
);

INSERT INTO t1 VALUES (1, 'ABC', 2, 1100);

SELECT * FROM sha2 (
ON (SELECT * FROM t1)
USING SHA_LENGTH(256) SHA_COLUMN('mystr')
) AS d;

Result set:

Pkey  mystr  fill1  fill2  Shahash
----  -----  -----  -----  ----------------------------------------------------------------
1     ABC    2      1100   B5D4045C3F466FA91FE2CC6ABE79232A1A57CDF104F7A26E716E0A1E2789DF78

Syntax elements:

    • ON clause: Any supported table operator SELECT statement or table reference. CLOBs and BLOBs are not supported.
    • USING clause name-value pairs:
        • SHA_LENGTH: optional; defines the desired length of the result in bits. Valid values are 256, 384, 512, or 0 (which is equivalent to 256). If this clause is omitted, the default value is 256.
        • SHA_COLUMN: the VARBYTE or VARCHAR column to hash. All input columns, including this one, are passed through the operator unchanged.
    • Output schema: the input schema plus a VARCHAR(512) CHARACTER SET LATIN column containing the SHA hash value as a hexadecimal string.
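The SHA_LENGTH rules above can be sketched as a small validation helper. This is an illustration only, under the assumption that the operator maps the value to a MessageDigest algorithm name; the class ShaLengthSketch and its methods are hypothetical and do not come from the attached source.

```java
// Hypothetical illustration class, not part of the attached sha2.java.
class ShaLengthSketch {

    /** Maps a SHA_LENGTH value to a MessageDigest algorithm name; 0 is treated as 256. */
    static String algorithmFor(int shaLength) {
        switch (shaLength) {
            case 0:
            case 256:
                return "SHA-256";
            case 384:
                return "SHA-384";
            case 512:
                return "SHA-512";
            default:
                throw new IllegalArgumentException(
                    "SHA_LENGTH must be 0, 256, 384, or 512, got " + shaLength);
        }
    }

    /** Number of hex characters needed to hold the digest for a given SHA_LENGTH. */
    static int hexChars(int shaLength) {
        algorithmFor(shaLength);                      // rejects invalid values
        int bits = (shaLength == 0) ? 256 : shaLength;
        return bits / 4;                              // 4 bits per hex character
    }

    public static void main(String[] args) {
        for (int len : new int[] {0, 256, 384, 512}) {
            System.out.println(len + " -> " + algorithmFor(len)
                + ", " + hexChars(len) + " hex characters");
        }
    }
}
```

Even the longest digest (512 bits) needs only 128 hex characters, so it fits comfortably in the VARCHAR(512) output column.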

Implementation

The Java table operator is very simple; it mainly prepares the input parameters for the calls to the MessageDigest methods. The main processing loop is shown below:

    digest = MessageDigest.getInstance("SHA-" + shaLen);
    /* Copy input data to output, adding a SHA hash column. */
    while (rsin[0].next()) {
        /* Pass every input column through unchanged. */
        for (int i = 1; i <= colcount; i++) {
            Object o = rsin[0].getObject(i);
            if (rsin[0].wasNull())
                rsout[0].updateObject(i, null);
            else
                rsout[0].updateObject(i, o);
        }
        /* Add hash value */
        digest.reset();
        if (shatype == TeradataType.VARCHAR_DT) {
            String inStr = (String) rsin[0].getObject(shaColumn);
            if (rsin[0].wasNull())
                throw new SQLException("Error SHA column is NULL.");
            /* Hash all encoded bytes. inStr.length() counts characters, which
               can be fewer than the bytes returned by getBytes(). Note that
               getBytes() uses the platform default charset. */
            byte[] strBytes = inStr.getBytes();
            digest.update(strBytes, 0, strBytes.length);
        } else {
            byte[] b = (byte[]) rsin[0].getObject(shaColumn);
            if (rsin[0].wasNull())
                throw new SQLException("Error SHA column is NULL.");
            digest.update(b, 0, b.length);
        }
        /* Convert the raw digest to an uppercase hex string and emit the row. */
        byte[] hash = digest.digest();
        String result = DatatypeConverter.printHexBinary(hash);
        rsout[0].updateObject(colcount + 1, result);
        rsout[0].insertRow();
    } /* while */
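When hashing VARCHAR data, the length passed to digest.update should be the length of the encoded byte array, not the character count of the String, because the two differ whenever a character encodes to more than one byte. The following minimal sketch illustrates the difference. It assumes UTF-8 encoding for the demonstration, which is an assumption of this example and not something the operator specifies; the class ByteLengthSketch is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical illustration class, not part of the attached sha2.java.
class ByteLengthSketch {

    /** Hashes the first len bytes of data with SHA-256. */
    static byte[] sha256(byte[] data, int len) throws NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        digest.update(data, 0, len);
        return digest.digest();
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        String s = "caf\u00e9";                      // 4 characters; the last one is 2 bytes in UTF-8
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);

        byte[] full = sha256(bytes, bytes.length);   // hashes every byte
        byte[] truncated = sha256(bytes, s.length()); // drops the final byte

        System.out.println("chars=" + s.length() + " bytes=" + bytes.length);
        System.out.println("digests equal: " + Arrays.equals(full, truncated));
    }
}
```

For pure 7-bit ASCII input the two lengths coincide, which is why the distinction is easy to miss in tests that only use ASCII strings.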

Testing:

Six of the data files from http://csrc.nist.gov/groups/STM/cavp/secure-hashing.html#sha-2 were used to test the implementation:

SHA256LongMsg.rsp

SHA256ShortMsg.rsp

SHA384LongMsg.rsp

SHA384ShortMsg.rsp

SHA512LongMsg.rsp

SHA512ShortMsg.rsp
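A check against such test vectors could be sketched as follows. This is a standalone illustration, not the test harness used for the article. It assumes the response files consist of "Len = ", "Msg = " and "MD = " lines, with Len given in bits and a zero-length message written as "Msg = 00". The class RspVerifySketch is hypothetical, and the two sample vectors embedded in main are the widely published SHA-256 digests of the empty message and of "abc", written in that assumed layout rather than copied from the files.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical illustration class; the file layout it parses is an assumption.
class RspVerifySketch {

    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) sb.append(String.format("%02x", b));
        return sb.toString();
    }

    /** Returns the number of vectors whose computed digest differs from the MD line. */
    static int countFailures(BufferedReader in, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance(algorithm);
        int failures = 0;
        int lenBits = 0;
        byte[] msg = new byte[0];
        String line;
        while ((line = in.readLine()) != null) {
            line = line.trim();
            if (line.startsWith("Len = ")) {
                lenBits = Integer.parseInt(line.substring(6));
            } else if (line.startsWith("Msg = ")) {
                // Keep only Len/8 bytes so that "Msg = 00" with Len = 0 is empty.
                msg = Arrays.copyOf(fromHex(line.substring(6)), lenBits / 8);
            } else if (line.startsWith("MD = ")) {
                String expected = line.substring(5).toLowerCase();
                if (!expected.equals(toHex(digest.digest(msg)))) failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) throws Exception {
        String sample =
              "Len = 0\nMsg = 00\n"
            + "MD = e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855\n\n"
            + "Len = 24\nMsg = 616263\n"
            + "MD = ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad\n";
        int failures = countFailures(new BufferedReader(new StringReader(sample)), "SHA-256");
        System.out.println("failures: " + failures);
    }
}
```

To run against a real file, the StringReader could be replaced with a FileReader on one of the files listed above, passing the matching algorithm name.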

Installation:

Dependencies: a JRE 1.5 or higher execution environment on the Teradata nodes, and the required Teradata database permissions. The attached installation JAR was built with a 1.5 target, which should support all Teradata database releases. To install the JAR and create the table operator, use the following DDL:

GRANT EXECUTE PROCEDURE ON SQLJ TO ...;

GRANT CREATE FUNCTION ON ... TO ...;

CALL sqlj.install_jar('CJ!.\sha2.jar','sha2j',0);

CREATE FUNCTION sha2()
RETURNS TABLE VARYING USING FUNCTION sha2_contract
LANGUAGE JAVA
NO SQL
PARAMETER STYLE SQLTABLE
EXTERNAL NAME 'sha2j:sha2.execute';

Command-line compile commands to create the input JAR:

javac -target 1.5 -source 1.5 -classpath .;.\javFnc.jar sha2.java

jar -cf sha2.jar sha2.class sha2$1.class

Attachments:

Contained within the zip installation file:

sha2.java: file containing the table operator java source implementation.

sha2.jar: file containing the table operator runtime implementation.