Hi Teradata experts, I'm trying to write a query that selects a varchar column - which contains a valid hexidecimal character string - and converts it to it's byte representation.
So if I have the character string "FF" it will convert it to a byte value FF.
From the documentation it seems that the TO_BYTES function would serve my needs for this task, the problem is that it for some input values it adds an additional zero byte to the start of the result.
Here's an example: (note: the casting is just for presentation in SQL Assistant)
'7F' as val_a
,cast(to_bytes(val_a, 'base16') as varbyte(2))
,'80' as val_b
,cast(to_bytes(val_b, 'base16') as varbyte(2));
This results in:
7F 7F 80 00-80
So, converting "7F" results in a single byte result of 7F - which is good
However converting "80" results in a 2-byte result, the first byte being 00 and the second byte being 80 - which is not so good.
Why is it that even though a hex value of 80 - which should fit into a single byte - results in a two byte answer where the first byte is 00?
Any guidance would be much appreciated :) Thanks
Any byte with the high bit set (>= '80) will return a leading '00'.
This seems to be based on interpreting the string as a signed value, when you compare it to the TO_BYTE function for integers
SELECT TO_BYTE(127), TO_BYTE(128)
you will see that this returns '7F' and '00-80', too.
You might add a TRIM to get rid of it:
TRIM(LEADING '00'xb FROM CAST(TO_BYTES(val_b, 'base16') AS VARBYTE(2)))
Thanks Dieter. Is there anyway to instruct Teradata to use unsigned values?
The TRIM solution might be workable, the only problem would be in the case where I have a legitimate zero byte at the start which would get trimmed and leave me with an empty value.
If your data is of a fixed size you can use SUBBITSTR(bytes,0,length_in_bits) to select the correct number of bytes from the LSB end and removed the extra byte that teradata adds to the MSB end.
-- Teradata adds a null byte when the Most Significant Bit is 1
-- This makes no sense
-- SUBBITSTR will select the 20bytes from the LSB end to work around this
-- Convert to BYTES as this will take half the storage vs. storing HEX in VARCHAR
--TO_BYTES expects UNICODE string hash_sha1 provides LATIN
TRANSLATE(hexstring_20_bytes_long USING LATIN_TO_UNICODE),