TPT 15.00 Teradata <> HDP Sandbox data movement


Good Evening,

I'm trying to prove out using TPT 15.00 to move data back and forth between Teradata and Hadoop.  I'm hoping to get the benefit of FastExport rather than a salvo of Sqoop/TDCH-generated queries to pull my data.  I've installed the 15.00 TTUs on the Hortonworks sandbox VM, but I'm unable to get the jobs going.  Currently I'm just trying to use the sample scripts that came with TPT 15.00.

Here are my job variables:

[root@sandbox tpt_testing]# cat jobvars2.txt

/********************************************************/

/* TPT LOAD Operator attributes                         */

/********************************************************/

TargetTdpId               = 'TD1410VPCOP1'

,TargetUserName           = 'SYSDBA'

,TargetUserPassword       = 'SYS_2012$'

,TargetTable              = 'PTS00030_TBL'

/********************************************************/

/* TPT Export Operator attributes                       */

/********************************************************/

,SourceTdpId              = 'TD1410VPCOP1'

,SourceUserName           = 'SYSDBA'

,SourceUserPassword       = 'SYS_2012$'

,SelectStmt               = 'select * from PTS00030_TBL'

/********************************************************/

/* TPT DDL Operator attributes                          */

/********************************************************/

,DDLErrorList             = '3807'

/********************************************************/

/* TPT DataConnector Hadoop specific attributes         */

/********************************************************/

,HadoopHost               = '10.0.0.34'

,HadoopJobType            = 'hive'

,HadoopFileFormat         = 'rcfile'

,HadoopTable              = 'PTS00030_TBL'

,HadoopTableSchema        = 'COL1 INT, COL2 STRING, COL3 STRING'

/********************************************************/

/* APPLY STATEMENT parameters                           */

/********************************************************/

,LoadInstances            = 1

[root@sandbox tpt_testing]#
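For reference, here is a minimal sketch of the kind of job script these variables would feed.  This is not the shipped TPT 15.00 sample; the operator names and the three-column schema are illustrative assumptions, with attribute names taken from the job-variables file above.

```
/* Hypothetical sketch only -- not the shipped TPT sample script.     */
/* Operator names and column types are assumptions; the attributes    */
/* map to the job-variables file above.                               */
DEFINE JOB MOVE_HIVE_TO_TERADATA
DESCRIPTION 'Read a Hive table and load it into a Teradata table'
(
  DEFINE SCHEMA PTS00030_SCHEMA
  (
    COL1 INTEGER,
    COL2 VARCHAR(255),   /* Hive STRING columns surfaced as VARCHAR */
    COL3 VARCHAR(255)
  );

  DEFINE OPERATOR HIVE_READER
  TYPE DATACONNECTOR PRODUCER
  SCHEMA PTS00030_SCHEMA
  ATTRIBUTES
  (
    VARCHAR HadoopHost        = @HadoopHost,
    VARCHAR HadoopJobType     = @HadoopJobType,
    VARCHAR HadoopFileFormat  = @HadoopFileFormat,
    VARCHAR HadoopTable       = @HadoopTable,
    VARCHAR HadoopTableSchema = @HadoopTableSchema,
    VARCHAR FileName          = @HadoopTable
  );

  DEFINE OPERATOR LOAD_OPERATOR
  TYPE LOAD
  SCHEMA *
  ATTRIBUTES
  (
    VARCHAR TdpId        = @TargetTdpId,
    VARCHAR UserName     = @TargetUserName,
    VARCHAR UserPassword = @TargetUserPassword,
    VARCHAR TargetTable  = @TargetTable
  );

  APPLY ('INSERT INTO ' || @TargetTable || ' VALUES (:COL1, :COL2, :COL3);')
  TO OPERATOR (LOAD_OPERATOR)
  SELECT * FROM OPERATOR (HIVE_READER);
);
```

A script like this would typically be launched with tbuild -f <script file> -v jobvars2.txt -j <job name>, which is how the job variables get bound to the @-references.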

Here are some of the errors / warnings encountered:

     ===================================================================

     =                                                                 =

     =                      Module Identification                      =

     =                                                                 =

     ===================================================================

     Load Operator for Linux release 2.6.32-431.11.2.el6.x86_64 on sandbox.hortonworks.com

     LoadMain   : 15.00.00.05

     LoadCLI    : 15.00.00.04

     LoadUtil   : 14.10.00.01

     PcomCLI    : 15.00.00.34

     PcomMBCS   : 14.10.00.02

     PcomMsgs   : 15.00.00.01

     PcomNtfy   : 14.10.00.05

     PcomPx     : 15.00.00.08

     PcomUtil   : 15.00.00.08

     PXICU      : 15.00.00.02

Teradata Parallel Transporter Hive_table_reader[1]: TPT19006 Version 15.00.00.02

Hive_table_reader[1]: TPT19206 Attribute 'TraceLevel' value reset to 'Statistics Only'.

Hive_table_reader[1]: TPT19010 Instance 1 directing private log report to 'dtacop-root-7782-1'.

Hive_table_reader[1]: TPT19011 Instance 1 restarting.

Hive_table_reader[1]: TPT19003 NotifyMethod: 'None (default)'

Hive_table_reader[1]: TPT19008 DataConnector Producer operator Instances: 1

     TDICU      : 15.00.00.00

Hive_table_reader[1]: TPT19203 Required attribute 'OpenMode' not found.  Defaulting to 'Read'.

Hive_table_reader[1]: TPT19003 ECI operator ID: 'Hive_table_reader-7782'

     CLIv2      : 15.00.00.03

Hive_table_reader[1]: TPT19222 Operator instance 1 processing file 'PTS00030_TBL'.

Hive_table_reader[1]: TPT19424 pmRepos failed. Request unsupported by Access Module (24)

Hive_table_reader[1]: TPT19308 Fatal error repositioning data.

Hive_table_reader[1]: TPT19015 TPT Exit code set to 12.

TPT_INFRA: TPT02263: Error: Operator restart error, status = Fatal Error

Task(SELECT_2[0001]): restart completed, status = Fatal Error

from the TDCH log:

14/11/06 19:34:38 INFO tool.TeradataExportTool: TPTExportTool starts at 1415331278661

14/11/06 19:34:41 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative

14/11/06 19:34:41 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir

14/11/06 19:34:41 INFO hive.metastore: Trying to connect to metastore with URI thrift://sandbox.hortonworks.com:9083

14/11/06 19:34:42 INFO hive.metastore: Connected to metastore.

java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected

        at com.teradata.hive.mapreduce.TeradataHiveCombineFileInputFormat.getSplits(TeradataHiveCombineFileInputFormat.java:35)

        at com.teradata.hadoop.job.TPTExportJob.runJob(TPTExportJob.java:75)

        at com.teradata.hadoop.tool.TPTJobRunner.runExportJob(TPTJobRunner.java:193)

        at com.teradata.hadoop.tool.TPTExportTool.run(TPTExportTool.java:40)

        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)

        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)

        at com.teradata.hadoop.tool.TPTExportTool.main(TPTExportTool.java:446)

        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

        at java.lang.reflect.Method.invoke(Method.java:606)

        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

14/11/06 19:34:43 INFO tool.TeradataExportTool: job completed with exit code 10000

Looking online, the TDCH error seems to be due to an incompatibility between certain jars.  However, I have the base HDP sandbox (2.1) install plus Loom, so not much has really changed.  Any help in troubleshooting would be greatly appreciated.  I imagine learning how to track that error down to the actual jars that caused it would help...

Thanks!





Re: TPT 15.00 Teradata <> HDP Sandbox data movement

Hey Steve (Feinholz),

I told you at Partners I'd mention you by name in my next post.  Well, this is it :)  Any guidance you have would be great, though this is likely more suited for the TDCH folks than the TPT team.  Let me know if I should just open a ticket on TAYS too.

Thanks!!!

Teradata Employee

Re: TPT 15.00 Teradata <> HDP Sandbox data movement

To better diagnose the issues, I would need to see the entire script.

There are inconsistencies in what is presented.

1. You mention the TPT Export operator, but the Export operator only supports Teradata as a source.

2. Even though you mention the Export operator, the output seems to indicate you are using the DataConnector operator to interface with Hadoop. That would be correct.

3. There is an error message involving an access module, but the DC operator should not be using any access modules when working with Hadoop.

Thus, please supply the entire TPT script, and we should be able to help out.

-- SteveF

Re: TPT 15.00 Teradata <> HDP Sandbox data movement

Hi,

Can anyone please share code examples for exporting from Teradata to Hadoop via TPT?

Teradata Employee

Re: TPT 15.00 Teradata <> HDP Sandbox data movement

The TPT User Guide has sample scripts.

-- SteveF