As per one of our requirements, we are defining one column as CHARACTER SET UNICODE CASESPECIFIC.
But as we know, Unicode has more than one encoding, for example UTF-8 and UTF-16.
Is there some way to know which one (UTF-8 or UTF-16) is used once the column is defined as UNICODE?
Thanks in advance!
You can find reasonable information about these two in the link below, which should help you make a better decision:
UNICODE in Teradata is always stored as UTF-16, i.e. two bytes per character.
In 13.10 there's COMPRESS USING TransUnicodeToUTF8 DECOMPRESS USING TransUTF8ToUnicode to change the internal storage to UTF-8, which might reduce storage when most chars are Latin.
Will this function help you? I ran two cases:
SELECT OCTET_LENGTH('Amazon Web services', UTF8) AS UTF8_byte, OCTET_LENGTH('Amazon Web services', UTF16) AS UTF16_byte;
You can substitute your own values to test.
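As a cross-check outside the database, here is a minimal Python sketch of the same comparison. It is not Teradata code. It assumes Python's "utf-8" and "utf-16-be" codecs count bytes the same way as the UTF8 and UTF16 options in the query above, and the function name byte_lengths is made up for illustration.

```python
def byte_lengths(text):
    """Return (utf8_bytes, utf16_bytes) for the given string."""
    # 'utf-16-be' is used so that no 2-byte byte-order mark is added to the count.
    utf8_bytes = len(text.encode("utf-8"))
    utf16_bytes = len(text.encode("utf-16-be"))
    return utf8_bytes, utf16_bytes


# The sample string has 19 characters, all plain ASCII.
print(byte_lengths("Amazon Web services"))  # (19, 38)
```

For a plain ASCII string like this one, UTF-8 needs one byte per character and UTF-16 needs two, so the UTF-16 count is exactly double.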
I have seen that when using TPT, we should declare columns as size*3 for UTF8 and as size*2 for UTF16. Can you please explain this?
So, does that mean that in UTF-8 every Latin character takes 1 byte of space and others take 3 bytes?
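Not quite every Latin character: in UTF-8 only the ASCII range takes 1 byte, accented Latin letters such as é take 2 bytes, and most other characters in the Basic Multilingual Plane (for example € or CJK) take 3 bytes. In UTF-16 each of those characters takes 2 bytes. The size*3 and size*2 figures are therefore worst-case buffer sizes, not the size of every value. A rough Python sketch, assuming the column only holds Basic Multilingual Plane characters; bytes_per_char and worst_case_bytes are hypothetical helper names, not TPT functions:

```python
def bytes_per_char(ch):
    """Return (utf8_bytes, utf16_bytes) for a single character."""
    # 'utf-16-be' avoids counting a byte-order mark.
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-be"))


def worst_case_bytes(n_chars):
    """Worst-case byte sizes for n_chars characters.

    Assumption: only Basic Multilingual Plane characters are stored,
    so UTF-8 needs at most 3 bytes and UTF-16 exactly 2 bytes per character.
    """
    return {"utf8": n_chars * 3, "utf16": n_chars * 2}


# A (ASCII), e-acute (accented Latin), euro sign, and a CJK character.
for ch in ["A", "\u00e9", "\u20ac", "\u65e5"]:
    print(repr(ch), bytes_per_char(ch))

# Buffer sizes to declare for a 100-character column under the assumption above.
print(worst_case_bytes(100))
```

So a column defined with 100 characters would be declared as 300 bytes for a UTF8 session and 200 bytes for a UTF16 session, even though a value made up only of ASCII characters uses just 100 bytes in UTF-8.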
I always follow Steve's suggestions. I do less work in TPT. Previously I used the wizard.