As per one of our requirements, we are defining one column as CHARACTER SET UNICODE CASESPECIFIC.

But as we know, Unicode has more than one encoding, for example UTF-8 and UTF-16.

Is there some way to know which one (UTF-8 or UTF-16) will be used once the column is defined as UNICODE?

You can find reasonable information about these two at the link below, which should help you make a better decision:



UNICODE in Teradata is always stored as UTF-16, i.e. two bytes per character.

In 13.10 there's COMPRESS USING TransUnicodeToUTF8 DECOMPRESS USING TransUTF8ToUnicode, which changes the internal storage to UTF-8 and might reduce storage when most characters are Latin.

Will this function help you? I ran two cases:

SELECT OCTET_LENGTH('Amazon Web services', UTF8) AS UTF8_byte, OCTET_LENGTH('Amazon Web services', UTF16) AS UTF16_byte;

You can substitute your own values to test.
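
If you don't have a Teradata session handy, the same byte counts can be checked outside the database. This is only a rough sketch in Python, not Teradata code: the helper name octet_lengths is made up for illustration, and it assumes the database counts UTF-16 bytes without a byte-order mark.

```python
def octet_lengths(text):
    """Return (UTF-8 byte count, UTF-16 byte count) for a string.

    Hypothetical helper for illustration; it mimics what the
    OCTET_LENGTH query above reports, but runs outside Teradata.
    """
    utf8_bytes = len(text.encode("utf-8"))
    # "utf-16-le" is used so that no 2-byte byte-order mark is prepended,
    # which plain "utf-16" would add (assumption: the database does not count one).
    utf16_bytes = len(text.encode("utf-16-le"))
    return utf8_bytes, utf16_bytes


if __name__ == "__main__":
    utf8_len, utf16_len = octet_lengths("Amazon Web services")
    print("UTF8_byte:", utf8_len, "UTF16_byte:", utf16_len)
```

For a pure ASCII string like this one, the UTF-16 count is exactly twice the UTF-8 count, which is where the potential saving from UTF-8 compression comes from.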



I have seen that when using TPT, we should declare columns that use UTF-8 as size*3 and columns that use UTF-16 as size*2. Can you please explain this?

So, does that mean that in UTF-8 every Latin character takes 1 byte of space and all others take 3 bytes?
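
Not quite: in UTF-8 the width varies per character. Plain ASCII letters take 1 byte, accented Latin letters such as é take 2 bytes, and many other characters (for example € or CJK characters) take 3 bytes, while in UTF-16 all of these take 2 bytes. The size*3 and size*2 figures are consistent with reserving the worst case per character. The Python sketch below illustrates this; the helper names utf8_width, utf16_width and worst_case_bytes are hypothetical, and the per-character maxima of 3 and 2 are simply the multipliers quoted above, not something read from TPT.

```python
def utf8_width(ch):
    """Bytes needed for one character in UTF-8."""
    return len(ch.encode("utf-8"))


def utf16_width(ch):
    """Bytes needed for one character in UTF-16 (no byte-order mark)."""
    return len(ch.encode("utf-16-le"))


def worst_case_bytes(char_count, charset):
    """Hypothetical sizing helper: buffer size for char_count characters.

    Uses the multipliers mentioned in the thread (assumption):
    3 bytes per character for UTF-8, 2 bytes per character for UTF-16.
    """
    per_char = {"utf8": 3, "utf16": 2}
    return char_count * per_char[charset]


if __name__ == "__main__":
    # ASCII letter, accented Latin letter, euro sign, CJK character
    for ch in ["A", "é", "€", "日"]:
        print(ch, "UTF-8:", utf8_width(ch), "UTF-16:", utf16_width(ch))
    # A column of 10 characters would be declared as 30 bytes for UTF-8
    # and 20 bytes for UTF-16 under this rule.
    print(worst_case_bytes(10, "utf8"), worst_case_bytes(10, "utf16"))
```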

