Need help with loading junk data

Enthusiast


Source: SQL Server. Source field: VARCHAR(MAX)

Character set: Latin1_General_CP1_CL_AS

 

I am loading through Informatica TPT, and the character set defined in the connection is UTF8.

I am trying to load into a column whose character set is defined as UNICODE, but I am hitting an untranslatable character error.

Target column: DESCRIPTION VARCHAR(8000) CHARACTER SET UNICODE NOT CASESPECIFIC,

 

The data I am trying to load is actually mail content, which contains smiley symbols etc. Any idea how to load this kind of data?

Please let me know if you need any more details.

 

Below are the character sets installed on the Teradata system.

ARABIC1256_6A0
CYRILLIC1251_2A0
EBCDIC037_0E
EBCDIC273_0E
EBCDIC277_0E
HANGUL949_7R0
HANGULEBCDIC933_1II
HANGULKSC5601_2R4
HEBREW1255_5A0
KANJI932_1S0
KANJIEBCDIC5026_0I
KANJIEBCDIC5035_0I
KANJIEUC_0U
KANJISJIS_0S
KATAKANAEBCDIC
LATIN1_0A
LATIN1250_1A0
LATIN1252_0A
LATIN1252_3A0
LATIN1254_7A0
LATIN1258_8A0
LATIN9_0A
SCHEBCDIC935_2IJ
SCHGB2312_1T0
SCHINESE936_6R0
TCHBIG5_1R0
TCHEBCDIC937_3IB
TCHINESE950_8R0
THAI874_4A0

Teradata Employee

Re: Need help with loading junk data

Prior to TD16, supplementary plane characters (i.e. code points above U+FFFF, which includes most emoji) cannot be stored in a UNICODE character column. And if you have TD16 / TTU16, the "Unicode Pass Through" feature must be enabled when loading.
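
To check whether a given piece of mail text is affected, one option is to scan it for code points above U+FFFF before loading. The following is only a minimal Python sketch; the function name `supplementary_chars` and the sample string are made up for illustration and are not part of any Teradata or Informatica tooling.

```python
def supplementary_chars(text):
    """Return the characters in text whose code point is above U+FFFF."""
    return [ch for ch in text if ord(ch) > 0xFFFF]


# Hypothetical sample: U+1F600 is a supplementary plane emoji,
# while U+263A (an older smiley symbol) sits in the Basic Multilingual Plane.
sample = "Thanks \U0001F600 and \u263A"
offending = supplementary_chars(sample)
print([f"U+{ord(ch):04X}" for ch in offending])
```

Note that not every smiley is a problem: symbols inside the Basic Multilingual Plane, such as U+263A, are not reported by this check.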

 

For earlier releases, none of the options are ideal; the Unicode Tool Kit may help in some cases:

  • Remove the untranslatable characters or substitute some acceptable character prior to or during loading (assuming these characters are unimportant)
  • Load the UTF8 data to a LATIN column using the ASCII session character set (data can be stored and retrieved successfully, but non-ASCII characters will appear as sequences of two to four other special characters when using standard query tools)
  • Store the data in a BYTE column (doesn't mislead anyone into thinking it's LATIN, but even harder to work with in SQL and most tools)
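
The first two options can be sketched in Python as follows. This is an illustration under assumptions, not a tested load procedure: the function names are hypothetical, the `?` substitute is an arbitrary choice, and decoding the UTF-8 bytes as Latin-1 only approximates how a query tool might display such data from a LATIN column.

```python
def replace_supplementary(text, replacement="?"):
    """Substitute every code point above U+FFFF with a replacement string."""
    return "".join(replacement if ord(ch) > 0xFFFF else ch for ch in text)


def utf8_viewed_as_latin(text):
    """Approximate how UTF-8 bytes look when read back one byte per character."""
    return text.encode("utf-8").decode("latin-1")


mail_text = "See you soon \U0001F600"

# Option 1: clean the text before it reaches the UNICODE column.
cleaned = replace_supplementary(mail_text)
print(cleaned)

# Option 2: one emoji becomes four single-byte characters when viewed as LATIN.
print(len(utf8_viewed_as_latin("\U0001F600")))
```

The substitution is lossy, so it only fits the case where the emoji carry no meaning that needs to be kept.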

 

Enthusiast

Re: Need help with loading junk data

Thanks, Fred, for the insights. We were able to load the data with your recommendations.

 

Thanks a lot.