Teradata Connector for Hadoop Now Available

Teradata Employee

Re: Teradata Connector for Hadoop now available

It looks like the parameter "-username" is being treated as a file name. Can you provide the values of $LIB_JARS, Server_name, and DB_name so I can check whether there are any special characters?

# echo $LIB_JARS
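
To surface any non-printing characters, something like this should also work (a quick sketch using standard shell tools):

echo "$LIB_JARS" | cat -A                  # GNU cat: shows tabs as ^I, line ends as $
printf '%s' "$LIB_JARS" | od -c | head     # byte-by-byte dump as a fallback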

Teradata Employee

Re: Teradata Connector for Hadoop now available

Regarding the missing Hive table schema issue: TDCH supports two ways to specify a Hive table's schema. If the Hive table already exists, TDCH gets the schema from the metastore; otherwise, the "-sourcetableschema" parameter should be specified if you want the Hive table to be created automatically.
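
A minimal sketch of the second case, with the schema supplied on the command line (every name and value here is a placeholder, and the exact schema option for your job direction should be confirmed against the TDCH documentation):

hadoop jar $USERLIBTDCH com.teradata.hadoop.tool.TeradataImportTool \
  -url jdbc:teradata://dbshost/DATABASE=mydb \
  -username myuser -password mypass \
  -jobtype hive -fileformat textfile \
  -sourcetable example_td -targettable example_hive \
  -sourcetableschema "c1 int, c2 string"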

Enthusiast

Re: Teradata Connector for Hadoop now available

I'm a Teradata DBA trying to help some Hadoop users with the TDCH tool, and it is new to me. I see there is an internal.fastload option for invoking a distributed FastLoad, which I'm guessing uses the TPT API to load data from Hadoop to Teradata across multiple Hadoop nodes. However, there appears to be no corresponding method for exporting from Teradata to Hadoop. I would expect an "internal.fastexport" option enabling exports distributed across multiple Hadoop nodes, similar to the way the Aster Data Connector for Teradata works. The split.by.value, split.by.hash, and split.by.partition options all seem inefficient on the Teradata side, each in its own way. Are there plans to add an option to TDCH that uses the TPT API for pulling data into Hadoop?
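
For reference, this is the kind of invocation I mean on the load side (a sketch only; the host, paths, and table names are made up):

hadoop jar $USERLIBTDCH com.teradata.hadoop.tool.TeradataExportTool \
  -url jdbc:teradata://dbshost/DATABASE=mydb \
  -username myuser -password mypass \
  -jobtype hdfs -fileformat textfile \
  -method internal.fastload \
  -sourcepaths /user/me/staged_data \
  -targettable my_td_table

What I'm hoping for is a mirror image of this on the TeradataImportTool side, with something like -method internal.fastexport.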

Enthusiast

Re: Teradata Connector for Hadoop now available

I second the request made by @KeithJones; adding an internal.fastexport-like feature to TDCH for moving data out of Teradata would greatly simplify the flow of data in that direction.

Fan

Re: Teradata Connector for Hadoop now available

Hi

This is my first time using the Teradata Import Tool. I downloaded the jar file, put it in the sqoop folder, and tried to import a Teradata table to Hadoop as shown below, but I get a "[SQLState 28000] The UserId, Password or Account is invalid" error even though everything I entered is correct, and I am able to log in to SQL Assistant using the same credentials. Can anyone please advise, as it is urgent for me to fix this issue?

export USERLIBTDCH=/usr/lib/sqoop/teradata-connector-1.1.1-hadoop200.jar

hadoop jar $USERLIBTDCH com.teradata.hadoop.tool.TeradataImportTool -classname com.teradata.jdbc.TeraDriver -url jdbc:teradata://a.b.c.net/DATABASE=XYZ -username ROOT -password XXXX -jobtype hdfs -fileformat textfile -method split.by.hash -separator "," -sourcetable ICDW_REG_QRY -targetpaths /user/ABC/TD_REG_QRY

14/02/13 00:36:28 ERROR tool.TeradataImportTool: com.teradata.hadoop.exception.TeradataHadoopException: com.teradata.jdbc.jdbc_4.util.JDBCException: [Teradata Database] [TeraJDBC 14.00.00.39] [Error 8017] [SQLState 28000] The UserId, Password or Account is invalid.

Thanks

Teradata Employee

Re: Teradata Connector for Hadoop now available

Hi KeithJones and thedba,

 

TDCH does not use the TPT API under the covers; rather, it uses JDBC FastLoad for import to Teradata and the standard JDBC driver for export from Teradata. We have support for JDBC FastExport on our long-term road map, but at this point we don't have a date or release version in which this feature will be available. Thanks
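
For illustration only (this is not TDCH syntax): the Teradata JDBC driver selects these bulk protocols through its TYPE connection parameter, so the two paths look like the following, where dbshost and mydb are placeholders:

jdbc:teradata://dbshost/DATABASE=mydb,TYPE=FASTLOAD    (bulk load into Teradata)
jdbc:teradata://dbshost/DATABASE=mydb,TYPE=FASTEXPORT  (bulk export; the piece not yet used by TDCH)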

 

Hi amala

It looks like your login credentials for the DBS specified by the '-url' option are invalid. Can you verify that you're able to log in to that DBS with your credentials using an application like BTEQ? Thanks
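
A quick way to check from the command line, assuming the same host and username as your failing job (BTEQ prompts for the password):

bteq
.LOGON a.b.c.net/ROOT
SELECT SESSION;
.LOGOFF
.QUIT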

Re: Teradata Connector for Hadoop now available

Hi All,

I am using teradata-connector-1.2.1 with the following versions:

->Teradata 14.0

->HDP 2.0(Hadoop 2.1.0.2.0.5.0-67)

->Hive 0.11

I am trying to import data from Teradata tables into Hive.

/*Created table and inserted a row in Teradata*/

CREATE MULTISET TABLE example3_td ( c1 INT,c2 VARCHAR(100));

INSERT INTO example3_td VALUES (3,'bar');

/*Created similar table in Hive*/ 

CREATE TABLE example3_hive (h1 INT, h2 STRING) STORED AS RCFILE;

Following are the set of commands I used after following the documentation:

export HIVE_HOME=/home/nirmal/hadoop/hive-0.11.0

export USERLIBTDCH=/home/nirmal/hadoop/TeradataConnectors/teradata-connector-1.2.1/lib/teradata-connector-1.2.1.jar

export HADOOP_CLASSPATH=$HIVE_HOME/lib/hive-metastore-0.11.0.jar:$HIVE_HOME/lib/libthrift-0.9.0.jar:$HIVE_HOME/lib/hive-exec-0.11.0.jar:$HIVE_HOME/lib/libfb303-0.9.0.jar:$HIVE_HOME/lib/jdo2-api-2.3-ec.jar:$HIVE_HOME/conf:$HIVE_HOME/lib/slf4j-api-1.6.1.jar:$HIVE_HOME/lib/antlr-runtime-3.4.jar:$HIVE_HOME/lib/datanucleus-core-3.0.9.jar:$HIVE_HOME/lib/datanucleus-rdbms-3.0.8.jar:$HIVE_HOME/lib/datanucleus-api-jdo-3.0.7.jar:$HIVE_HOME/lib/commons-dbcp-1.4.jar:$HIVE_HOME/lib/commons-pool-1.5.4.jar:$HIVE_HOME/lib/hive-cli-0.11.0.jar

export HIVE_LIB_JARS=$HIVE_HOME/lib/hive-metastore-0.11.0.jar,$HIVE_HOME/lib/libthrift-0.9.0.jar,$HIVE_HOME/lib/hive-exec-0.11.0.jar,$HIVE_HOME/lib/libfb303-0.9.0.jar,$HIVE_HOME/lib/jdo2-api-2.3-ec.jar,$HIVE_HOME/lib/slf4j-api-1.6.1.jar,$HIVE_HOME/lib/hive-cli-0.11.0.jar

hadoop jar $USERLIBTDCH com.teradata.hadoop.tool.TeradataImportTool -libjars $HIVE_LIB_JARS  -classname com.teradata.jdbc.TeraDriver -url jdbc:teradata://192.168.199.129/DATABASE=airlinesuser -username airlinesuser -password airlinesuser -jobtype hive -fileformat rcfile -sourcetable example3_td -nummappers 1 -targettable example3_hive



14/04/10 21:14:56 INFO tool.TeradataImportTool: TeradataImportTool starts at 1397144696332

14/04/10 21:14:57 WARN conf.Configuration: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative

14/04/10 21:14:57 WARN conf.Configuration: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize

14/04/10 21:14:57 WARN conf.Configuration: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize

14/04/10 21:14:57 WARN conf.Configuration: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack

14/04/10 21:14:57 WARN conf.Configuration: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node

14/04/10 21:14:57 WARN conf.Configuration: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces

14/04/10 21:14:57 WARN conf.Configuration: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative

14/04/10 21:14:57 WARN conf.Configuration: org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@3eafdb52:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.

14/04/10 21:14:57 WARN conf.Configuration: org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@3eafdb52:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.

14/04/10 21:14:57 ERROR tool.TeradataImportTool: com.teradata.hadoop.exception.TeradataHadoopException: Hive current user directory not exists

at com.teradata.hive.job.TeradataHiveImportJob.beforeJob(TeradataHiveImportJob.java:148)

at com.teradata.hadoop.tool.TeradataJobRunner.runImportJob(TeradataJobRunner.java:118)

at com.teradata.hadoop.tool.TeradataImportTool.run(TeradataImportTool.java:41)

at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)

at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)

at com.teradata.hadoop.tool.TeradataImportTool.main(TeradataImportTool.java:464)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

14/04/10 21:14:57 INFO tool.TeradataImportTool: job completed with exit code 25008

nirmal@nirmal-Vostro-3560 ~/hadoop/hive-0.12.0 $ 

In Hive, I see the table being created in the default database/schema:

hive> show tables;

OK

example3_hive

Time taken: 5.831 seconds, Fetched: 1 row(s)

hive> 
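
One guess on my side: does the "Hive current user directory not exists" message mean TDCH expects my HDFS home directory to be present? This is what I would check (the path is only my guess at what it looks for):

hadoop fs -ls /user/nirmal                 # does my HDFS home directory exist?
hadoop fs -mkdir -p /user/nirmal           # create it if it is missing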

Kindly help me if I am missing something.

Thanks,

-Nirmal

Re: Teradata Connector for Hadoop now available

Teradata DB: 14.10

Sqoop Version: 1.4

CDH4

Cloudera Connector powered by Teradata 1.2c4

I'm using the Sqoop Teradata connector and getting the following error. Can anyone provide details? Thanks.

ERROR tool.ImportTool: Encountered IOException running import job: com.teradata.hadoop.exception.TeradataHadoopException: Malformed \uxxxx encoding

        at com.teradata.hadoop.utils.TeradataUnicodeCharacterConverter.fromEncodedUnicode(TeradataUnicodeCharacterConverter.java:90)

        at com.teradata.hadoop.db.TeradataConfiguration.setInputEscapedByString(TeradataConfiguration.java:540)

        at com.cloudera.connector.teradata.imports.BaseImportJob.configureInputFormat(BaseImportJob.java:74)

        at com.cloudera.connector.teradata.imports.TableImportJob.configureInputFormat(TableImportJob.java:32)

        at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:239)

        at com.cloudera.connector.teradata.TeradataManager.importTable(TeradataManager.java:273)

        at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:413)

        at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:506)

        at org.apache.sqoop.Sqoop.run(Sqoop.java:147)

        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)

        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:222)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:231)

        at org.apache.sqoop.Sqoop.main(Sqoop.java:240)

Re: Teradata Connector for Hadoop now available

 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: Could not load db driver class: com.teradata.jdbc.TeraDriver

java.lang.RuntimeException: Could not load db driver class: com.teradata.jdbc.TeraDriver

        at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:636)

        at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)

        at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:525)

        at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:548)

        at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:191)

        at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:175)

        at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:262)

        at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1236)

        at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1061)

        at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:82)

        at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:390)

        at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:476)

        at org.apache.sqoop.Sqoop.run(Sqoop.java:145)

        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)

        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)

        at org.apache.sqoop.Sqoop.main(Sqoop.java:238)

Teradata Employee

Re: Teradata Connector for Hadoop now available

Hi Hau Nguyen

I have a couple of questions regarding the Hadoop connectors:

  1. What connector do I need to integrate Hadoop and Teradata using an ETL tool like Talend or Informatica PowerCenter?
  2. Can I use all three connectors side by side and deploy them in a sandbox environment?

Regards,

Joseph