Teradata Connector for Hadoop Now Available

Teradata Employee

Re: Teradata Connector for Hadoop now available

There is a new version, the Cloudera Connector Powered by Teradata, that you may want to try out; it takes advantage of TDCH. It's available for download on Cloudera's website.

Whether you use the Sqoop command line with the Cloudera Connector Powered by Teradata or the TDCH command line is a matter of preference. Since the Cloudera Connector Powered by Teradata uses TDCH, the performance is similar to the TDCH command line.
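
For illustration, a Sqoop import through the connector looks roughly like this (the host, database, table, and paths are made-up placeholders; this assumes the Cloudera Connector Powered by Teradata is installed so Sqoop routes jdbc:teradata URLs through it):

sqoop import --connect jdbc:teradata://tdhost/DATABASE=mydb --username myuser --password mypass --table my_table --target-dir /user/hdfs/my_table --num-mappers 4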

-Hau

Teradata Employee

Re: Teradata Connector for Hadoop now available

I would like to know who is using TDCH, and what stage people are in with respect to their deployment.

Please send me an email (hau.nguyen@teradata.com) and let me know. Please indicate customer name if you are a Teradata customer.

Thanks,

-Hau

Teradata Employee

Re: Teradata Connector for Hadoop now available

The Sqoop Integration Edition is for Hadoop distribution vendors to use when integrating TDCH with Sqoop. For example, Hortonworks has used it to create the "Hortonworks Connector for Teradata", and Cloudera has used it to create the "Cloudera Connector Powered by Teradata". Both products use the Sqoop command line.

The Teradata Connector for Hadoop (Command Line Edition) doesn't use the Sqoop command line; it has its own command-line interface.
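
For comparison, a minimal Command Line Edition invocation looks roughly like this (the jar path, host, and table names are made-up placeholders):

hadoop jar teradata-connector-1.0.9.jar com.teradata.hadoop.tool.TeradataImportTool -url jdbc:teradata://tdhost/DATABASE=mydb -username myuser -password mypass -jobtype hdfs -fileformat textfile -sourcetable my_table -targetpaths /user/hdfs/my_table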

Teradata Connector for Hadoop is currently certified with HDP 1.1.0.17 and HDP 1.3.

-Hau

Re: Teradata Connector for Hadoop now available

I tested it on Hadoop 2.1.0, and it failed with an incompatible interface error.

Exception:

java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected

        at com.teradata.hadoop.mapreduce.TeradataInputFormat.getSplits(TeradataInputFormat.java:131)

        at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:476)

        at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:493)

        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:390)

        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)

Could you please share whether there is any timeline for supporting Hadoop 2.x? (In Hadoop 2, org.apache.hadoop.mapreduce.JobContext changed from a class to an interface, which is why code built against Hadoop 1 fails with this error.)

Enthusiast

Re: Teradata Connector for Hadoop now available

Hi, I'm getting the following error message when running the command to import data from Teradata to Hadoop.

13/10/31 15:37:25 INFO tool.TeradataImportTool: TeradataImportTool starts at 1383248245608

13/10/31 15:37:25 INFO mapreduce.TeradataInputProcessor: job setup starts at 1383248245829

13/10/31 15:37:28 INFO db.TeradataConnection: CREATE MULTISET TABLE "_033728", DATABLOCKSIZE = 130048 BYTES, NO FALLBACK, NO BEFORE JOURNAL, NO AFTER JOURNAL, CHECKSUM = DEFAULT ("c1" CHAR(2) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c2" CHAR(4) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL, "c3" VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c4" VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c5" DECIMAL(5, 2) NULL, "c6" DECIMAL(5, 2) NULL, "c7" DECIMAL(5, 2) NULL, "c8" DECIMAL(5, 2) NULL, "c9" DECIMAL(5, 2) NULL, "c10" VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c11" DATE NULL, "c12" VARCHAR(5) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c13" DATE NULL, "c14" INTEGER NULL, "c15" INTEGER NULL, "c16" INTEGER NULL, "c17" VARCHAR(7) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c18" BYTEINT NULL, "c19" VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL, "c20" VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c21" VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c22" VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c23" CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c24" DECIMAL(5, 2) NULL, "c25" DECIMAL(5, 2) NULL, "c26" DECIMAL(5, 2) NULL, "c27" DECIMAL(5, 2) NULL, "c28" DECIMAL(5, 2) NULL, "c29" DATE NULL, "c30" VARCHAR(5) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c31" VARCHAR(8) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c32" VARCHAR(60) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c33" VARCHAR(13) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c34" VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c35" VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c36" VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c37" VARCHAR(4) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c38" CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL, "c39" VARCHAR(9) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c40" VARCHAR(21) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c41" VARCHAR(40) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c42" VARCHAR(30) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c43" VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c44" DATE NULL, "c45" VARCHAR(5) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c46" SMALLINT NULL, "c47" VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c48" DATE NULL, "c49" VARCHAR(5) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c50" VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c51" BYTEINT NULL, "c52" DECIMAL(11, 2) NULL, "c53" DECIMAL(11, 2) NULL, "c54" BYTEINT NULL, "c55" DECIMAL(11, 2) NULL, "c56" DECIMAL(11, 2) NULL, "c57" VARCHAR(15) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c58" VARCHAR(4) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c59" VARCHAR(8) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c60" VARCHAR(40) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c61" VARCHAR(4) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c62" DECIMAL(12, 2) NOT NULL, "c63" DECIMAL(11, 2) NULL, "c64" CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c65" VARCHAR(8) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c66" VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c67" CHAR(2) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c68" VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c69" VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c70" DECIMAL(9, 2) NULL, "c71" VARCHAR(8) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c72" CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c73" VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c74" CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c75" VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c76" VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c77" VARCHAR(7) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c78" DATE NULL, "c79" VARCHAR(5) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c80" DATE NULL, "c81" VARCHAR(5) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c82" VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c83" DATE NULL, "c84" VARCHAR(5) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c85" VARCHAR(30) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c86" VARCHAR(8) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c87" VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c88" CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c89" VARCHAR(4) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c90" VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c91" BYTEINT NULL, "c92" DECIMAL(3, 1) NULL, "c93" DECIMAL(11, 2) NULL, "c94" DECIMAL(11, 2) NULL, "c95" VARCHAR(5) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c96" DATE NULL, "c97" INTEGER NULL, "c98" DATE NULL, "c99" VARCHAR(4) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c100" VARCHAR(8) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c101" VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c102" CHAR(2) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c103" CHAR(2) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c104" VARCHAR(4) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c105" VARCHAR(4) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c106" VARCHAR(30) CHARACTER SET LATIN NOT CASESPECIFIC NULL, "c107" DECIMAL(5, 1) NULL, "TDIN_PARTID" INTEGER NULL) PRIMARY INDEX ("TDIN_PARTID") PARTITION BY "TDIN_PARTID"

13/10/31 15:37:31 INFO mapreduce.TeradataSplitByPartitionInputProcessor: stage table "_033728" is created

13/10/31 15:37:31 INFO db.TeradataConnection: CREATE VIEW "V_033728" ("c1","c2","c3","c4","c5","c6","c7","c8","c9","c10","c11","c12","c13","c14","c15","c16","c17","c18","c19","c20","c21","c22","c23","c24","c25","c26","c27","c28","c29","c30","c31","c32","c33","c34","c35","c36","c37","c38","c39","c40","c41","c42","c43","c44","c45","c46","c47","c48","c49","c50","c51","c52","c53","c54","c55","c56","c57","c58","c59","c60","c61","c62","c63","c64","c65","c66","c67","c68","c69","c70","c71","c72","c73","c74","c75","c76","c77","c78","c79","c80","c81","c82","c83","c84","c85","c86","c87","c88","c89","c90","c91","c92","c93","c94","c95","c96","c97","c98","c99","c100","c101","c102","c103","c104","c105","c106","c107") AS SELECT top 10 * FROM wo_header1

13/10/31 15:37:31 INFO mapreduce.TeradataInputProcessor: job setup ends at 1383248251454

13/10/31 15:37:31 INFO mapreduce.TeradataInputProcessor: job setup time is 5s

13/10/31 15:37:31 ERROR tool.TeradataImportTool: com.teradata.hadoop.exception.TeradataHadoopSQLException: com.teradata.jdbc.jdbc_4.util.JDBCException: [Teradata Database] [TeraJDBC 14.00.00.01] [Error 3524] [SQLState 42000] The user does not have CREATE VIEW access to database tedw.

        at com.teradata.jdbc.jdbc_4.util.ErrorFactory.makeDatabaseSQLException(ErrorFactory.java:307)

        at com.teradata.jdbc.jdbc_4.statemachine.ReceiveInitSubState.action(ReceiveInitSubState.java:102)

        at com.teradata.jdbc.jdbc_4.statemachine.StatementReceiveState.subStateMachine(StatementReceiveState.java:298)

        at com.teradata.jdbc.jdbc_4.statemachine.StatementReceiveState.action(StatementReceiveState.java:179)

        at com.teradata.jdbc.jdbc_4.statemachine.StatementController.runBody(StatementController.java:120)

        at com.teradata.jdbc.jdbc_4.statemachine.StatementController.run(StatementController.java:111)

        at com.teradata.jdbc.jdbc_4.TDStatement.executeStatement(TDStatement.java:372)

        at com.teradata.jdbc.jdbc_4.TDStatement.executeStatement(TDStatement.java:314)

        at com.teradata.jdbc.jdbc_4.TDStatement.doNonPrepExecute(TDStatement.java:277)

        at com.teradata.jdbc.jdbc_4.TDStatement.execute(TDStatement.java:1087)

        at com.teradata.hadoop.db.TeradataConnection.executeDDL(TeradataConnection.java:374)

        at com.teradata.hadoop.db.TeradataConnection.createView(TeradataConnection.java:421)

        at com.teradata.hadoop.mapreduce.TeradataSplitByPartitionInputProcessor.setupDatabaseEnvironment(TeradataSplitByPartitionInputProcessor.java:296)

        at com.teradata.hadoop.mapreduce.TeradataInputProcessor.setup(TeradataInputProcessor.java:57)

        at com.teradata.hadoop.mapreduce.TeradataSplitByPartitionInputProcessor.setup(TeradataSplitByPartitionInputProcessor.java:67)

        at com.teradata.hadoop.job.TeradataImportJob.runJob(TeradataImportJob.java:86)

        at com.teradata.hadoop.tool.TeradataJobRunner.runImportJob(TeradataJobRunner.java:119)

        at com.teradata.hadoop.tool.TeradataImportTool.run(TeradataImportTool.java:41)

        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)

        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)

        at com.teradata.hadoop.tool.TeradataImportTool.main(TeradataImportTool.java:392)

        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)

        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

        at java.lang.reflect.Method.invoke(Method.java:597)

        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

        at com.teradata.hadoop.mapreduce.TeradataSplitByPartitionInputProcessor.setupDatabaseEnvironment(TeradataSplitByPartitionInputProcessor.java:304)

        at com.teradata.hadoop.mapreduce.TeradataInputProcessor.setup(TeradataInputProcessor.java:57)

        at com.teradata.hadoop.mapreduce.TeradataSplitByPartitionInputProcessor.setup(TeradataSplitByPartitionInputProcessor.java:67)

        at com.teradata.hadoop.job.TeradataImportJob.runJob(TeradataImportJob.java:86)

        at com.teradata.hadoop.tool.TeradataJobRunner.runImportJob(TeradataJobRunner.java:119)

        at com.teradata.hadoop.tool.TeradataImportTool.run(TeradataImportTool.java:41)

        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)

        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)

        at com.teradata.hadoop.tool.TeradataImportTool.main(TeradataImportTool.java:392)

        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)

        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

        at java.lang.reflect.Method.invoke(Method.java:597)

        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

13/10/31 15:37:31 INFO tool.TeradataImportTool: job completed with exit code 10000

The command I used is:

hadoop jar $USERLIBPATH com.teradata.hadoop.tool.TeradataImportTool -classname com.teradata.jdbc.TeraDriver -url "jdbc:teradata://XYZ/DATABASE=dbc" -username uname -password pswd -jobtype hdfs -fileformat textfile -sourcequery 'SELECT top 10 * FROM wo_header1'  -method split.by.partition -separator "," -targetpaths /user/hdfs/WO_Header1

Enthusiast

Re: Teradata Connector for Hadoop now available

Ask your DBA to run "grant create view on tedw to uname;".

Enthusiast

Re: Teradata Connector for Hadoop now available

Hi. Is there any workaround to run the command without the user having CREATE VIEW access? I tried using a user ID that had CREATE VIEW access, but it throws the error "The user does not have CREATE TABLE access to database dbc". Does the user ID require both CREATE TABLE and CREATE VIEW access on the database to run this command?

Enthusiast

Re: Teradata Connector for Hadoop now available

Since you are using a source query, you will need both CREATE VIEW and CREATE TABLE access. I would also set the database in the JDBC URL to something other than dbc, probably tedw, if you have CREATE TABLE and CREATE VIEW access there.

Also, if you are just trying to play with the tool, provide a source table instead of a source query; that way you should not need CREATE TABLE or CREATE VIEW access.

Finally I think you should read this document Teradata Connector for Hadoop Tutorial - Teradata Developer ...

and if you are not familiar with how Teradata works, it would be helpful to talk to your DBA and review the document together.
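
For example, here is the earlier command reworked along those lines (keeping the placeholders from the original post; split.by.hash is used here because, to my understanding, it does not need a staging table or view):

hadoop jar $USERLIBPATH com.teradata.hadoop.tool.TeradataImportTool -classname com.teradata.jdbc.TeraDriver -url "jdbc:teradata://XYZ/DATABASE=tedw" -username uname -password pswd -jobtype hdfs -fileformat textfile -sourcetable wo_header1 -method split.by.hash -separator "," -targetpaths /user/hdfs/WO_Header1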

Enthusiast

Re: Teradata Connector for Hadoop now available

Hi,

Thank you thedba.

I got the permissions for my user, and I'm able to import the table to HDFS using the Teradata Connector (version 1.0.9). Now I'm trying to import data from Teradata directly into a Hive table. I'm using the following exports and command.

Export commands

export HADOOP_HOME=/usr/lib/hadoop

export HIVE_HOME=/usr/lib/hive

export USERLIBTDCH=/usr/lib/tdch/teradata-connector-1.0.9.jar

export HADOOP_CLASSPATH=$HIVE_HOME/lib/hive-metastore-0.9.0.15.jar:$HIVE_HOME/lib/libthrift-0.7.0.jar:$HIVE_HOME/lib/hive-exec-0.9.0.15.jar:$HIVE_HOME/lib/libfb303-0.7.0.jar:$HIVE_HOME/lib/jdo2-api-2.3-ec.jar:$HIVE_HOME/conf:$HIVE_HOME/lib/slf4j-api-1.6.1.jar:$HIVE_HOME/lib/antlr-runtime-3.0.1.jar:$HIVE_HOME/lib/datanucleus-core-3.0.9.jar:$HIVE_HOME/lib/datanucleus-rdbms-3.0.8.jar:$HIVE_HOME/lib/datanucleus-connectionpool-2.0.3.jar:$HIVE_HOME/lib/mysql-connector-java.jar:$HIVE_HOME/lib/commons-dbcp-1.4.jar:$HIVE_HOME/lib/commons-pool-1.5.4.jar:$HIVE_HOME/lib/hive-cli-0.9.0.15.jar:$HIVE_HOME/lib/hive-builtins-0.9.0.15.jar

export HIVE_LIB_JARS=$HIVE_HOME/lib/hive-metastore-0.9.0.15.jar,$HIVE_HOME/lib/libthrift-0.7.0.jar,$HIVE_HOME/lib/hive-exec-0.9.0.15.jar,$HIVE_HOME/lib/libfb303-0.7.0.jar,$HIVE_HOME/lib/jdo2-api-2.3-ec.jar,$HIVE_HOME/lib/slf4j-api-1.6.1.jar,$HIVE_HOME/lib/hive-cli-0.9.0.15.jar,$HIVE_HOME/lib/hive-builtins-0.9.0.15.jar

Command for importing to Hive

hadoop jar $USERLIBTDCH com.teradata.hadoop.tool.TeradataImportTool -libjars $HIVE_LIB_JARS -classname com.teradata.jdbc.TeraDriver -url jdbc:teradata://XYZ/DATABASE=dbc -username uname -password pswd -jobtype hive -fileformat textfile -method split.by.hash -sourcetable dsstable -targettable dsstable

Error

13/11/06 15:12:43 ERROR tool.TeradataImportTool: com.teradata.hadoop.exception.TeradataHadoopException: Import Hive table's column schema is missing

        at com.teradata.hive.job.TeradataHiveImportJob.beforeJob(TeradataHiveImportJob.java:149)

        at com.teradata.hadoop.tool.TeradataJobRunner.runImportJob(TeradataJobRunner.java:118)

        at com.teradata.hadoop.tool.TeradataImportTool.run(TeradataImportTool.java:41)

        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)

        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)

        at com.teradata.hadoop.tool.TeradataImportTool.main(TeradataImportTool.java:392)

        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)

        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

        at java.lang.reflect.Method.invoke(Method.java:597)

        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

Do I need to create the table schema in Hive before I import the table into Hive? I read in the user guide for this connector that the command will create the table in Hive if it does not exist. Is there any issue with my command?
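
If I remember the TDCH tutorial correctly, when the target Hive table does not yet exist you have to supply its column schema with the -targettableschema option; a sketch (the two columns here are made-up placeholders, not dsstable's real schema):

hadoop jar $USERLIBTDCH com.teradata.hadoop.tool.TeradataImportTool -libjars $HIVE_LIB_JARS -classname com.teradata.jdbc.TeraDriver -url jdbc:teradata://XYZ/DATABASE=dbc -username uname -password pswd -jobtype hive -fileformat textfile -method split.by.hash -sourcetable dsstable -targettable dsstable -targettableschema "c1 int, c2 string"

Alternatively, creating the table in Hive beforehand should let the import find the schema.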


Re: Teradata Connector for Hadoop now available

Hi,

I am getting an error similar to @thedba's when I try to export data from HDFS to Teradata:

14/01/27 05:46:18 ERROR tool.TeradataExportTool: java.io.FileNotFoundException: File -username does not exist.

        at org.apache.hadoop.util.GenericOptionsParser.validateFiles(GenericOptionsParser.java:397)

        at org.apache.hadoop.util.GenericOptionsParser.processGeneralOptions(GenericOptionsParser.java:288)

        at org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:431)

        at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:170)

        at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:153)

        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:64)

        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)

        at com.teradata.hadoop.tool.TeradataExportTool.main(TeradataExportTool.java:439)

14/01/27 05:46:18 INFO tool.TeradataExportTool: job completed with exit code 10000

Following is my command:

hadoop com.teradata.hadoop.tool.TeradataExportTool -libjars $LIB_JARS -url jdbc:teradata://Server_name/database=DB_name  -username user -password pwd -jobtype hdfs -sourcepaths /user/example2_hdfs/01 -nummappers 1 -separator ',' -targettable test1
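
The "File -username does not exist" message comes from Hadoop's GenericOptionsParser while it validates the -libjars file list, which suggests the argument list got misaligned, most likely because $LIB_JARS is empty or is not a clean comma-separated list of jar paths, so a later flag gets treated as a file name. A quick sanity check (a sketch; the variable name is taken from the command above):

echo "$LIB_JARS"
for j in $(echo "$LIB_JARS" | tr ',' ' '); do ls -l "$j"; done

The first line should print a comma-separated list of jar paths, and each jar should exist. Quoting the variable in the command, i.e. -libjars "$LIB_JARS", also avoids surprises from stray spaces.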