emoji Characters in Teradata

Enthusiast


How do we handle emoji characters in Teradata? We are extracting data using Java JDBC in UTF-8 mode. In the file, the data was written as "King▒<9F><99><87> Pearl▒<9F><92><8B>", whereas the actual data was "King🙇 Pearl💋".

When loading this content into Teradata using TPT Load with the UTF-8 charset, we get the following error: "TPT19003 Delimited Data Parsing error: Invalid multi-byte character".

  • Does Teradata support emoji characters?
  • If yes, what is the best way to load them into Teradata?
  • Do we need to use a charset other than UTF-8?

 

Please suggest a solution.

 

  • emoji
  • Emoji characters in teradata
  • TPT LOAD
5 REPLIES
Teradata Employee

Re: emoji Characters in Teradata

Please start by reading the documentation on the Teradata Database 16.00 Unicode Pass Through feature in chapter 9 of the International Character Set Support reference: http://www.info.teradata.com/download.cfm?ItemID=1007186.

 

Also, the 3-byte UTF-8 code sequences are invalid; the first byte is probably missing. Are you referring to the following Unicode characters?

  U+1F647  PERSON BOWING DEEPLY
  U+1F48B  KISS MARK
 

Please supply the release versions for the client and database software.
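
To make the point about the missing first byte concrete, here is a minimal Python sketch. It is only an illustration: the helper names `utf8_hex` and `is_valid_utf8` are made up for this example, and the "broken" sample assumes the lead byte was simply dropped from the extract file, which the thread does not confirm.

```python
def utf8_hex(ch: str) -> str:
    """Return the UTF-8 encoding of one character as spaced hex bytes."""
    return " ".join(f"{b:02X}" for b in ch.encode("utf-8"))


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Both characters need four bytes in UTF-8, starting with lead byte F0.
print(utf8_hex("\U0001F647"))  # PERSON BOWING DEEPLY
print(utf8_hex("\U0001F48B"))  # KISS MARK

# Assumption: the file holds only the three trailing bytes shown in the question.
broken = b"King" + bytes([0x9F, 0x99, 0x87])
complete = b"King" + bytes([0xF0, 0x9F, 0x99, 0x87])
print(is_valid_utf8(broken), is_valid_utf8(complete))
```

The trailing bytes 9F 99 87 and 9F 92 8B match what appears in the file, so a parser that sees them without a valid F0 lead byte would be expected to reject the sequence.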

 

Thanks,

 

-Dave

Teradata Employee

Re: emoji Characters in Teradata

The emoji characters in your example are supplementary characters (outside the Basic Multilingual Plane). In UTF-16, each one is encoded as a surrogate pair, which takes 4 bytes (e.g. 🙇 = U+D83D, U+DE47). Those characters are not supported by Teradata 15.x. With Teradata 16.0 Unicode Pass Through (UPT), you can store and retrieve any Unicode characters, including those emoji characters. Please refer to the new version of the orange book, which discusses UPT in Section 6.15:

Getting Started: International Character Sets and the Teradata Database (version G01, 6/5/2017). 
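
The surrogate values quoted above can be derived with the standard UTF-16 arithmetic. The following is a small Python sketch of that calculation; the function name `utf16_surrogates` is just a name chosen for this example and is not part of any Teradata tool.

```python
from typing import Tuple


def utf16_surrogates(code_point: int) -> Tuple[int, int]:
    """Split a supplementary code point (U+10000..U+10FFFF) into a surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000           # 20-bit value
    high = 0xD800 + (offset >> 10)          # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)         # bottom 10 bits
    return high, low


for cp in (0x1F647, 0x1F48B):
    high, low = utf16_surrogates(cp)
    print(f"U+{cp:04X} -> {high:04X} {low:04X}")
```

Characters inside the Basic Multilingual Plane fit in a single 16-bit unit, which is why only code points above U+FFFF need this treatment.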

 

With the Unicode Tool Kit (UTK), you can also insert and retrieve any Unicode characters, even on Teradata Database 15.x/14.x. UTK can be downloaded from the Developer Exchange:

http://downloads.teradata.com/download/tools/unicode-tool-kit

After download, go to:

..\utk_release1.6.0.1\04 TranslationUDFs\01 Teradata UDFs\suselinux-x8664\udf_installation\pass-through UDFs

 

The latest version, UTK 1.6.0.x, also includes the Internationalization orange book.

 

TT

Enthusiast

Re: emoji Characters in Teradata

Thank you, TT.

 

So can't we handle these emoji characters in FastLoad on 15.x? Is there any way we can load these values through TPT using Load?

Teradata Employee

Re: emoji Characters in Teradata

 

Q: So can't we handle these emoji characters in FastLoad on 15.x?

A: Not without staging tables. With a two-step loading process, yes. First, load everything into Latin staging tables using an ASCII session. Then run an INSERT ... SELECT with UDF calls to convert the data and move it into the Unicode target tables.
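
The idea behind the two steps can be shown outside the database. The Python sketch below is only an analogy, not Teradata code: `stage_as_latin` stands in for loading raw bytes into a Latin column, and `convert_to_unicode` stands in for the conversion UDF. Both names are hypothetical, and the actual UTK UDF names are not given in this thread.

```python
def stage_as_latin(raw: bytes) -> str:
    """Analogy for step 1: keep every byte as one single-byte character."""
    return raw.decode("latin-1")  # never fails, no multi-byte validation


def convert_to_unicode(staged: str) -> str:
    """Analogy for step 2: recover the bytes and interpret them as UTF-8."""
    return staged.encode("latin-1").decode("utf-8")


original = "King\U0001F647 Pearl\U0001F48B"
raw = original.encode("utf-8")

staged = stage_as_latin(raw)
restored = convert_to_unicode(staged)

print(len(raw), len(staged))   # one staged character per source byte
print(restored == original)
```

Because the first step treats the input as opaque single bytes, nothing is rejected at load time; the interpretation as Unicode happens only in the second step.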

 

Q: Is there any way we can load these values through TPT using Load?

A: No - with DBS 15.x and TTU 15.x. Yes - with DBS 16.0 and TTU 16.0. 

 

Tak

Enthusiast

Re: emoji Characters in Teradata

Thank you, TT.