unicode utf-8 latin encoding

Database
Enthusiast

I am doing some JSON shredding in a Java-based client. The program connects to the database with the /CHARSET=UTF8 connection parameter. It reads a VARCHAR column defined with the LATIN character set, parses its contents as JSON, and then writes several tables derived from that JSON model.

When inserting data into one of these output tables, I get a “non translatable character” exception as soon as the program tries to write the Unicode character U+2019 (RIGHT SINGLE QUOTATION MARK).

As a workaround, I currently just replace this one character. I wonder whether there is a more robust solution, such as filtering out all Unicode characters that do not map to the LATIN character set.
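
To illustrate the kind of filter I have in mind, here is a minimal sketch. It assumes that ISO-8859-1 is a close enough stand-in for the LATIN character set, which I have not verified; the class name `LatinFilter`, the method `toLatin`, and the substitution table are made up for this example.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;

class LatinFilter {
    // Hypothetical substitutions for common typographic punctuation.
    private static final Map<Integer, String> SUBSTITUTES = Map.of(
        0x2018, "'",   // left single quotation mark
        0x2019, "'",   // right single quotation mark
        0x201C, "\"",  // left double quotation mark
        0x201D, "\"",  // right double quotation mark
        0x2013, "-",   // en dash
        0x2014, "-");  // em dash

    // Returns a copy of input in which every code point is either substituted,
    // kept because ISO-8859-1 can encode it, or replaced by the fallback.
    // ASSUMPTION: ISO-8859-1 approximates the database's LATIN repertoire.
    static String toLatin(String input, char fallback) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        StringBuilder out = new StringBuilder(input.length());
        input.codePoints().forEach(cp -> {
            String substitute = SUBSTITUTES.get(cp);
            if (substitute != null) {
                out.append(substitute);
            } else if (Character.isBmpCodePoint(cp) && encoder.canEncode((char) cp)) {
                out.append((char) cp);
            } else {
                // Not encodable (including all supplementary code points).
                out.append(fallback);
            }
        });
        return out.toString();
    }
}
```

I would call something like `LatinFilter.toLatin(value, '?')` on every string just before binding it to the insert statement. The drawback is that anything outside the assumed repertoire is silently lost.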

Are there any best practices regarding character encoding?

I’m rather surprised by this, since I read the data from Teradata as UTF8 (which I was able to validate) and then write it back to the same database. It looks like something is happening on the way to the client and back.

1 REPLY
Senior Supporter

Re: unicode utf-8 latin encoding

Have you considered defining the target column with the UNICODE character set?
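
A rough sketch of what that could look like from a Java client, using plain JDBC. The table name `json_output`, the column names, and the VARCHAR length are placeholders, not taken from your schema; only the `CHARACTER SET UNICODE` column attribute is the point here.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

class UnicodeTargetTable {
    // Hypothetical output table; the text column is declared UNICODE so that
    // characters outside the LATIN repertoire can be stored as they are.
    static final String DDL =
        "CREATE TABLE json_output ("
        + "id INTEGER, "
        + "payload VARCHAR(1000) CHARACTER SET UNICODE)";

    // Creates the table on an already opened connection.
    static void create(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.executeUpdate(DDL);
        }
    }
}
```

With a UNICODE target column and a UTF8 session, the client should not need to filter or replace characters before the insert.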