UTF-8 as Character Set in SQL Assistant

I'm using UTF-8 as the character set when connecting with SQL Assistant, but it seems to have trouble exporting the en dash: I get an SPA character or a double blank instead. It works if I switch to ASCII, which doesn't make much sense to me because I thought ASCII was a subset of UTF-8.


Has anyone else experienced this or found a solution?



Teradata Employee

Re: UTF-8 as Character Set in SQL Assistant

If Windows-1252 single-byte characters were (incorrectly) loaded by a session that specified, or defaulted to, the ASCII client character set, then the Windows "en dash" would be stored essentially unchanged: as x'96' in a LATIN column or as U+0096 in a UNICODE column, rather than as the "correct" equivalent U+2013 (which could not be stored in a TD LATIN column). If you then query as ASCII, the single-byte characters are again passed through essentially unchanged and you get back x'96', so on the surface everything appears to be correct.
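To make the two readings of that byte concrete, here is a small Python sketch (not Teradata code). It assumes Python's latin-1 codec is a fair stand-in for the "essentially unchanged" pass-through described above, since it maps each byte value to the code point with the same number; the actual Teradata translation tables may differ in other positions.

```python
import unicodedata

raw = b"\x96"  # the single byte Windows-1252 uses for the en dash

# What the client meant: interpret the byte as Windows-1252.
intended = raw.decode("cp1252")

# Assumed stand-in for the ASCII-session pass-through: the byte value is
# kept as the code point (latin-1 maps 0x96 to U+0096).
stored = raw.decode("latin-1")

def describe(ch):
    """Format a single character as U+XXXX."""
    return f"U+{ord(ch):04X}"

print(describe(intended), unicodedata.name(intended))    # U+2013 EN DASH
print(describe(stored), unicodedata.category(stored))    # U+0096 Cc (a control character)
```

The same byte therefore ends up as a control character in the database, even though the user typed a punctuation mark.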


But if you ask for translation to UTF-8, the internal value is interpreted as U+0096, the "Start of Guarded Area" control character (abbreviated SPA), which is why the export shows an SPA character instead of a dash. It's also interpreted as this control character internally in the collation sequence (for sorting and ordering).
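As a rough illustration in Python (again, not Teradata code), the UTF-8 bytes for the stored control character differ from those of a real en dash. The `repair_c1` helper below is hypothetical: it is one possible client-side cleanup, assuming every character in the range U+0080 to U+009F in your data is really a mis-loaded Windows-1252 byte. That assumption may not hold for your tables, so check before relying on it.

```python
def to_utf8_hex(ch: str) -> str:
    """Return the UTF-8 encoding of ch as space-separated hex bytes."""
    return ch.encode("utf-8").hex(" ")

print(to_utf8_hex("\u0096"))  # c2 96     (control character, what the UTF-8 session returns)
print(to_utf8_hex("\u2013"))  # e2 80 93  (a genuine en dash)

def repair_c1(text: str) -> str:
    """Hypothetical helper: remap C1 controls (U+0080..U+009F) to the
    characters Windows-1252 assigns to the same byte values."""
    out = []
    for ch in text:
        if 0x80 <= ord(ch) <= 0x9F:
            try:
                ch = bytes([ord(ch)]).decode("cp1252")
            except UnicodeDecodeError:
                pass  # a few byte values are undefined in cp1252; leave them alone
        out.append(ch)
    return "".join(out)

print(repair_c1("2019\u00962020") == "2019\u20132020")  # True
```

Fixing the data at the source, by reloading it with a client character set that matches the actual input encoding, avoids the need for this kind of patch-up.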