Seeing Junk character when inserting hyphen (-)

Database

Seeing Junk character when inserting hyphen (-)

Hi,

I am trying to insert a string containing hyphen (-) into a column defined as CHARACTER SET LATIN. But when I select, I see some junk character instead of '-'. Pls let me know whats the problem. Is it somthing to do with the character set (UNICODE vs LATIN) or client side character set?. The same insert had this issue in our test environment when inserted for the first time, but I deleted and reinserted it second time to find that it worked fine this time. But again in production, its showing me the junk character. I cant be a random behaviour! Pls help me understand and solve this.

Ex:

INSERT INTO TEST_TABLE ( TEST_COLUMN)

SELECT CAST('Test String - Check hyphen - character' AS VARCHAR(150)) AS  TEST_COLUMN

FROM DBC.TABLES ;

But when I select from the table TEST_TABLE, I see the below string:

'Test String – Check hyphen – character'

2 REPLIES
Teradata Employee

Re: Seeing Junk character when inserting hyphen (-)

That combination of characters is the 3-byte UTF-8 representation of EN DASH U+2013, misinterpreted as Windows-1252.

In short, you are loading UTF-8 data using ASCII as the session character set. Change the load to use UTF-8 session character set.

I would recommend you also use UTF-8 session character set when querying. But for this specific character, Teradata LATIN and Windows-1252 both use the same x'96' code point, so ASCII session character set would still display properly (as long as the data is loaded correctly).

Enthusiast

Re: Seeing Junk character when inserting hyphen (-)

The utf-8 session recommendation I knew, but the detailed explanation why in this case it is going wrong... wow.

Beside that: don't query DBC.TABLES, but query DBC.TABLESV if available on your TD system.