Cloudera Connector Powered by Teradata - Charset UTF8 Problem with "Special Chars like €"

I first posted this in the Database forum, but it belongs in this forum: http://forums.teradata.com/forum/database/teradataimporttool-charset-problem

---

Hi.


I am importing data from Teradata into Hadoop with "Teradata Connector for Hadoop (Command Line Edition): Cloudera" v1.2.

I have a table like this:

create table testtable (
  id int not null,
  value varchar(50),
  text varchar(200),
  PRIMARY KEY (id)
);

And I have inserted this data:

insert into testtable values (1, '#1€', 'aá');
insert into testtable values (2, '#2€', 'eé');

The import job runs without errors:

export USERLIBTDCH=/usr/lib/tdch/teradata-connector-1.2.jar

hadoop jar $USERLIBTDCH com.teradata.hadoop.tool.TeradataImportTool \
  -classname com.teradata.jdbc.TeraDriver \
  -url jdbc:teradata://teradataServer/DATABASE=test,CHARSET=UTF8 \
  -username dbc -password dbc \
  -jobtype hdfs -fileformat textfile \
  -targetpaths /temp/hdfstable \
  -sourcetable testtable -splitbycolumn id

But the resulting file in HDFS contains question marks in place of the special characters:

1 #1? a?
2 #2? e?
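
One way to narrow this down, as a minimal sketch rather than a confirmed diagnosis: the "?" pattern above is what you get when text is encoded into a charset that cannot represent € or á, so it may help to check whether the file really contains "?" bytes (0x3F) or whether the UTF-8 bytes are intact and only the viewer is mis-decoding them. The function names `lossy_roundtrip` and `diagnose`, the sample byte strings, and the choice of "ascii" as the lossy charset are all assumptions made for illustration; nothing here is taken from the connector itself.

```python
# Sketch: reproduce the "?" symptom and classify raw bytes from the output file.
# The charset "ascii" and the sample lines are assumptions for illustration only.

EURO_UTF8 = "\u20ac".encode("utf-8")  # the three bytes E2 82 AC


def lossy_roundtrip(text: str, charset: str = "ascii") -> str:
    """Encode text into a charset that may lack some characters, then decode it back.

    Characters the charset cannot represent are replaced with "?".
    """
    return text.encode(charset, errors="replace").decode(charset)


def diagnose(raw: bytes) -> str:
    """Classify one raw line read from the HDFS output file."""
    if EURO_UTF8 in raw:
        # The euro sign is stored correctly; the display tool is at fault.
        return "utf8-intact"
    if b"?" in raw:
        # The characters were already replaced before the file was written.
        return "replaced"
    return "unknown"


if __name__ == "__main__":
    print(lossy_roundtrip("#1\u20ac"))             # euro sign is not in ASCII
    print(lossy_roundtrip("a\u00e1"))              # a-acute is not in ASCII either
    print(lossy_roundtrip("a\u00e1", "latin-1"))   # Latin-1 does contain a-acute
    print(diagnose("1\t#1\u20ac".encode("utf-8")))
    print(diagnose(b"1\t#1?\ta?"))
```

If `diagnose` reports "replaced" for the real file bytes, the characters were lost somewhere between the database and the write to HDFS, not in the terminal. Since both € and á were replaced in the output above, a step that only narrows to Latin-1 would not explain it by itself, because Latin-1 can represent á.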

How can I import "special" characters from Teradata into Hadoop as UTF-8? If I use the JDBC driver directly (e.g. from a Java program), it works fine, so the problem seems to be in the connector.