I have been successfully extracting non-ASCII characters from Teradata V2R6 on Windows using the ODBC 220.127.116.11 driver with the session character set UTF8. After upgrading to Teradata 12 and ODBC v12 all non-ASCII characters are getting corrupted even though the character set in the ODBC DSN is set to UTF8. Choosing UTF16 has the same problem. The problem occurs back to at least ODBC 18.104.22.168 and it makes no difference whether I connect to Teradata 12 or V2R6.
The ODBC 12 user guide says that "Teradata UTF8 and UTF16 session character sets on Windows work with the Translation DLL selected by the Translation DLL= keyword". I assume that refers to an entry in the odbc.ini file, but it doesn't describe what the options are. There are also fields in the ODBC DSN for "Translation DLL Name" and "Translation Option", but the user guide doesn't say what DLL to use.
Is the translation DLL included in the ODBC installation? If not, where do I find one to translate to UTF8?
If your session character set is UTF8, all character data the ODBC driver receives from the database will be in UTF8. What happens after that depends on how your application fetches the data from ODBC.
- If the application fetches data into ANSI buffers, i.e., SQL_C_CHAR, the UTF8 data is converted to the code page matching your Windows locale.
- If the application fetches data into Unicode buffers, i.e., SQL_C_WCHAR, the UTF8 data is converted to UTF16.
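To make the difference between the two buffer types concrete, here is a small Python simulation. It does not call the Teradata driver at all: the helper names `to_ansi_buffer` and `to_unicode_buffer` are hypothetical, the code page `cp1252` is just an assumed Windows locale, and the "?" substitution for unmappable characters is an assumption about typical code page conversion rather than documented driver behavior.

```python
# Simulation only: these helpers are hypothetical and merely mimic the two
# conversions described above. They do not use ODBC or the Teradata driver.

def to_ansi_buffer(utf8_data: bytes, code_page: str = "cp1252") -> bytes:
    """Mimic a fetch into SQL_C_CHAR: UTF8 -> Windows locale code page."""
    text = utf8_data.decode("utf-8")
    # Characters that do not exist in the code page cannot be represented;
    # errors="replace" substitutes "?" for them (assumed behavior).
    return text.encode(code_page, errors="replace")


def to_unicode_buffer(utf8_data: bytes) -> bytes:
    """Mimic a fetch into SQL_C_WCHAR: UTF8 -> UTF16 (little-endian)."""
    return utf8_data.decode("utf-8").encode("utf-16-le")


# Bytes as they would arrive from the database with a UTF8 session.
from_database = "Zürich 東京".encode("utf-8")

ansi = to_ansi_buffer(from_database)     # "ü" survives in cp1252, "東京" does not
wide = to_unicode_buffer(from_database)  # lossless

# An application that still treats the ANSI buffer as UTF8 sees corruption.
misread = ansi.decode("utf-8", errors="replace")
```

In this sketch the ANSI buffer holds code page bytes, so an application that keeps interpreting it as UTF8 (as it could under the older pass-through behavior) would see exactly the kind of corruption described in the question, while the Unicode buffer round-trips every character.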
Keep in mind that on MS Windows you cannot get UTF8 data into your application buffers. The UTF8 pass-through (to SQL_C_CHAR buffers) was disabled when the Teradata ANSI driver became a Unicode driver, i.e., starting with version 22.214.171.124. See the "UTF8 Pass Through Functionality" section of the user guide, under "International Character Set Support".
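If the application ultimately needs UTF8 bytes, one possible approach (an assumption on my part, not something taken from the user guide) is to fetch into Unicode buffers and re-encode in the application. The sketch below shows only the re-encoding step in Python; `wide_buffer_to_utf8` is a hypothetical helper and the input stands in for the contents of a SQL_C_WCHAR buffer.

```python
def wide_buffer_to_utf8(wide_buffer: bytes) -> bytes:
    """Re-encode UTF16 (little-endian) buffer contents as UTF8.

    Hypothetical helper: wide_buffer stands in for the data an application
    would receive when fetching a column as SQL_C_WCHAR on Windows.
    """
    return wide_buffer.decode("utf-16-le").encode("utf-8")


# Stand-in for a fetched SQL_C_WCHAR buffer (without the terminating NUL).
fetched = "Zürich".encode("utf-16-le")
utf8_bytes = wide_buffer_to_utf8(fetched)
```

Because UTF16 can represent every Unicode character, this conversion does not lose data the way a fetch through the locale code page can.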