Chinese characters are not displaying properly


I am using the Teradata ODBC driver 13.0 with PHP, and the session character set is UTF16, but Chinese characters do not display properly on the PHP page. Is there an issue with the Teradata ODBC driver when it converts Unicode characters?

The problem persisted even when I used the JDBC-ODBC bridge. It worked fine with the Type 4 JDBC driver.

Please suggest a solution.

5 REPLIES
Teradata Employee

Re: Chinese character are not displaying properly

An ODBC application can retrieve Unicode (SQL_C_WCHAR) and/or non-Unicode (SQL_C_CHAR) characters even when the Session Character Set is UTF16.

My recommendation is to:

1- Always try the latest ODBC Driver for Teradata. It can connect to Teradata Database 13.0, and we support that combination (e.g. ODBC Driver 14.0 with Teradata Database 13.0). This will show whether a bug was fixed in a later release of the ODBC Driver.

2- Turn on ODBC Trace for a very small test application. It will show whether the application is retrieving Unicode (SQL_C_WCHAR) or non-Unicode (SQL_C_CHAR) characters. 

Is this Application running on a Windows or Linux/Unix platform?  

You must set IANAAppCodePage if the application is retrieving non-Unicode/ANSI (SQL_C_CHAR) characters on Linux/Unix platforms (see the ODBC Driver for Teradata User Guide). There is also the concept of Unicode encoding on Linux/Unix platforms. It defaults to UTF8 but can be changed to UTF16; see SQL_ATTR_APP_UNICODE_TYPE in the ODBC Driver for Teradata User Guide.

On Windows platforms the Application-Code-Page must be set to Chinese if the application is retrieving ANSI characters (SQL_C_CHAR).
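To illustrate why the application code page matters for SQL_C_CHAR data, here is a minimal Python sketch (not PHP, and not tied to the Teradata driver). It only uses Python's built-in codecs; the sample text and the helper names `as_hex` and `misread` are my own, chosen for illustration. The same Chinese text becomes a different byte sequence under each encoding, and reading those bytes with a Western code page such as cp1252 yields garbage instead of Chinese.

```python
# Sample text for illustration only: the two characters U+4E2D U+6587.
SAMPLE = "\u4e2d\u6587"


def as_hex(data: bytes) -> str:
    """Format a byte string as space-separated upper-case hex pairs."""
    return data.hex(" ").upper()


def misread(text: str, sent_as: str, read_as: str) -> str:
    """Encode text with one codec and decode it with another.

    This mimics a driver returning bytes in one character set while the
    application interprets them using a different code page.
    """
    return text.encode(sent_as).decode(read_as, errors="replace")


if __name__ == "__main__":
    # The same two characters, as bytes, under several encodings.
    for codec in ("big5", "gb2312", "utf-8", "utf-16-le"):
        print(f"{codec:10} {as_hex(SAMPLE.encode(codec))}")

    # A matching code page round-trips; a mismatched one does not.
    print(misread(SAMPLE, "gb2312", "gb2312") == SAMPLE)
    print(misread(SAMPLE, "gb2312", "cp1252") == SAMPLE)
```

If the page shows accented Latin letters or currency signs where Chinese is expected, a mismatch of this kind between the bytes returned and the code page used to read them is a likely cause.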

Teradata Employee

Re: Chinese character are not displaying properly


A colleague searched the PHP source code for SQL_C_WCHAR and found no hits. It is possible that PHP binds and retrieves the data as SQL_C_CHAR (i.e. non-Unicode). If this assumption is correct, then you must set IANAAppCodePage on Linux/Unix platforms, or set the Windows Application Code Page on Windows platforms.

Teradata Employee

Re: Chinese character are not displaying properly

One more option: set the session character set to TCHBIG5_1R0 or SCHGB2312_1T0. Again, I am assuming PHP binds/retrieves character data as SQL_C_CHAR.

Re: Chinese character are not displaying properly

I am using ODBC 14.0 and tried setting the session character set to TCHBIG5_1R0 and to SCHGB2312_1T0, but I still cannot get the Chinese character data, although English character data comes through. I also tried ODBC 14.0 connected to a Java application, and there I do not get the Chinese data either.

If I use the JDBC Type 4 driver, the Chinese data comes through perfectly. Is there a problem with the ODBC driver 14.0?

Teradata Employee

Re: Chinese character are not displaying properly

We have lots of customers in China. Therefore it is very unlikely that the ODBC Driver for Teradata (13.0 and 14.0) has issues with Chinese characters.

However, there are many variables to consider and configure: for example the platform (Windows vs. Linux/Unix), the locale, the HTML page encoding, and application-specific configuration (PHP or the JDBC-ODBC Bridge).
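As a rough sketch of how the session character set and the HTML page encoding have to line up, the Python snippet below decodes bytes using the codec that corresponds to the session character set and re-encodes them as UTF-8 for the page. The `SESSION_TO_CODEC` table and the `render_page` helper are hypothetical; the pairing of Teradata session character set names with Python codec names is my assumption and may not match the Teradata mappings for every character.

```python
import html
from typing import Tuple

# Hypothetical pairing of Teradata session character sets with Python codecs.
# This is an assumption for illustration, not an official mapping.
SESSION_TO_CODEC = {
    "TCHBIG5_1R0": "big5",
    "SCHGB2312_1T0": "gb2312",
    "UTF8": "utf-8",
}


def render_page(raw: bytes, session_charset: str) -> Tuple[str, bytes]:
    """Decode driver bytes and return an HTTP header plus a UTF-8 body.

    The declared page charset must describe the bytes actually sent,
    so the body is always re-encoded as UTF-8 here.
    """
    text = raw.decode(SESSION_TO_CODEC[session_charset])
    body = "<p>{}</p>".format(html.escape(text)).encode("utf-8")
    return "Content-Type: text/html; charset=utf-8", body
```

The same idea applies in PHP: whatever encoding the driver returns, the bytes written to the page must agree with the charset the page declares.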

I do not have experience with PHP, but I have a lot of experience with the ODBC Driver for Teradata. Therefore I recommend that you start with a known entity such as Teradata SQL Assistant. Teradata SQL Assistant can be configured to display Chinese characters even on a Windows workstation with English locale settings. This test will validate that the ODBC Data Source is configured correctly; then you can focus on PHP and other settings. You can also try to isolate the layers in your application: take a Chinese character from the TCHBIG5_1R0 Session Character Set (http://www.info.teradata.com/templates/eSrchResults.cfm?rdsort=Title&todt=&srtord=Asc&prodline=all&t...) and look at the byte sequence you receive from the ODBC Driver in PHP when the Session Character Set is TCHBIG5_1R0. Here is one entry from the mapping:

0xA2CB        0x3029        # HANGZHOU NUMERAL NINE

The first column is the TCHBIG5 multi-byte character, the second column is the Unicode code point, and the last column is the Unicode character name. You can use a hexadecimal character literal (e.g. SELECT _Unicode'3029'XCF as "HANGZHOU NUMERAL NINE";) to test the character with the TCHBIG5_1R0 session character set. The byte sequence you receive should match the first column. You can also use Teradata SQL Assistant to execute the SELECT statement. I just ran this test with Teradata SQL Assistant 14.0, ODBC 14.0 and Teradata 13.0.
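The byte-sequence comparison described above could be sketched as follows. This is an illustrative Python helper, not part of any Teradata tool: `parse_row` and `identify` are hypothetical names, and the received bytes are assumed to have been captured from the application (in PHP, for instance, as a hex string from bin2hex()). It compares what was received against the session character set bytes from the mapping row, and also against the UTF-8 and UTF-16 forms of the same code point, which helps show which encoding actually arrived.

```python
from typing import NamedTuple


class MappingRow(NamedTuple):
    charset_bytes: bytes  # multi-byte value in the session character set
    code_point: int       # Unicode code point
    name: str             # Unicode character name


def parse_row(line: str) -> MappingRow:
    """Parse a row such as '0xA2CB  0x3029  # HANGZHOU NUMERAL NINE'."""
    fields, _, name = line.partition("#")
    mb_hex, cp_hex = fields.split()
    return MappingRow(bytes.fromhex(mb_hex[2:]), int(cp_hex, 16), name.strip())


def identify(received: bytes, row: MappingRow) -> str:
    """Report which encoding of the character the received bytes match."""
    ch = chr(row.code_point)
    candidates = {
        "session charset": row.charset_bytes,
        "UTF-8": ch.encode("utf-8"),
        "UTF-16LE": ch.encode("utf-16-le"),
        "UTF-16BE": ch.encode("utf-16-be"),
    }
    for label, expected in candidates.items():
        if received == expected:
            return label
    # Anything else (for example a substitution character) is reported as hex.
    return "no match: " + received.hex().upper()


if __name__ == "__main__":
    row = parse_row("0xA2CB        0x3029        # HANGZHOU NUMERAL NINE")
    print(identify(bytes.fromhex("A2CB"), row))
    print(identify(bytes.fromhex("E380A9"), row))
```

A result of "session charset" means the driver delivered the expected TCHBIG5 bytes, so the remaining problem is more likely in how the application or the page interprets them.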