UTF-8 or UTF-16?

Database
Enthusiast

Hi Experts,

As per one of our requirements, we are defining one column as CHARACTER SET UNICODE CASESPECIFIC.

But as we know, Unicode itself has two common encodings, UTF-8 and UTF-16.

Is there some way to find out which one (UTF-8 or UTF-16) is used once the column is defined as UNICODE?

Thanks in advance!

Cheers!

Nishant

5 REPLIES
Enthusiast

Re: UTF-8 or UTF-16?

Hi Nishant,

You can find reasonable information about these two at the link below, which should help you make a better decision:

http://www.differencebetween.net/technology/difference-between-utf-8-and-utf-16/

Khurram
Senior Apprentice

Re: UTF-8 or UTF-16?

Hi Nishant,

UNICODE in Teradata is always stored as UTF-16, i.e. two bytes per character.

In 13.10 there's COMPRESS USING TransUnicodeToUTF8 DECOMPRESS USING TransUTF8ToUnicode to change the internal storage to UTF-8, which might reduce storage when most characters are Latin.
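To see why the UTF-8 option might save space only for mostly Latin data, here is a rough sketch in Python. It is not Teradata code: Python's codecs merely stand in for the two storage formats, and the sample strings and the `sizes` helper are made up for illustration.

```python
# Compare encoded sizes; the sample strings are illustrative only.
samples = {
    "latin": "Teradata",     # ASCII only
    "accented": "Zürich",    # one non-ASCII Latin letter
    "japanese": "日本語",     # CJK characters
}

def sizes(text):
    """Return (UTF-8 bytes, UTF-16 bytes), with no byte-order mark."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-be"))

for name, text in samples.items():
    utf8, utf16 = sizes(text)
    print(f"{name:9} chars={len(text)} utf8={utf8} utf16={utf16}")
```

In this sketch the ASCII string needs half the bytes in UTF-8, while the Japanese string needs more bytes in UTF-8 than in UTF-16, which is why the saving depends on the data.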


Enthusiast

Re: UTF-8 or UTF-16?

Hi Nishant,

Will this function help you? I ran two cases:

select octet_length('Amazon Web services', UTF8) UTF8_byte, octet_length('Amazon Web services', UTF16) UTF16_byte;

You can substitute your own values to test.
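As a cross-check outside the database, the same byte counts can be computed in Python. This is only a sketch: it assumes the UTF16 count is two bytes per character with no byte-order mark, and `octet_lengths` is a hypothetical helper name, not a Teradata function.

```python
def octet_lengths(text):
    """Return the byte counts of text as (UTF-8, UTF-16 without BOM)."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))

utf8_bytes, utf16_bytes = octet_lengths("Amazon Web services")
print(utf8_bytes, utf16_bytes)  # 19 38
```

The literal has 19 characters, all ASCII, so UTF-8 needs 19 bytes and UTF-16 needs 38.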

Cheers,

Raja

Enthusiast

Re: UTF-8 or UTF-16?

Hi All,

I have seen that when using TPT, we should declare columns that use UTF-8 as size*3 and columns that use UTF-16 as size*2. Can you please explain this?

So, does that mean that in UTF-8 every Latin character takes 1 byte of space and the others take 3 bytes?
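A small Python sketch may help with the arithmetic here. The multipliers are the ones quoted above; reading them as worst-case byte widths for characters in the Basic Multilingual Plane is an assumption, not something confirmed in this thread, and `utf8_width` and `buffer_bytes` are hypothetical helper names.

```python
def utf8_width(ch):
    """Number of bytes one character occupies in UTF-8."""
    return len(ch.encode("utf-8"))

# ASCII letter, accented Latin letter, Euro sign, CJK character
for ch in ["A", "é", "€", "語"]:
    print(repr(ch), utf8_width(ch))

def buffer_bytes(n_chars, bytes_per_char):
    """Worst-case byte length for n_chars characters (assumed sizing rule)."""
    return n_chars * bytes_per_char

print(buffer_bytes(10, 3))  # 10 characters sized for UTF-8: 30
print(buffer_bytes(10, 2))  # 10 characters sized for UTF-16: 20
```

So UTF-8 is not simply "1 byte or 3 bytes": ASCII characters take 1 byte, many accented Latin letters take 2, and the rest of the Basic Multilingual Plane takes 3, which would make size*3 a safe upper bound.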

Enthusiast

Re: UTF-8 or UTF-16?

http://forums.teradata.com/forum/tools/charn-char-set-unicodelatin-define-schema-in-14-xx-multiply-b...

I always follow Steve's suggestions. I do less work in TPT. Previously I used the wizard.