Loading Hive Table using TDCH TPT

Hi All,

I am trying to load a Hive table using a TDCH-TPT job. The sample TPT control file and job variable file were taken from the "/15.10/tbuild/sample/userguide" directory. In the same environment we are able to export data from Teradata to HDFS.

My source table is in Teradata and has only 2 rows.

TPT Version: 15.10.01.02
OS Version: CentOS release 6.6 (Final)
Teradata DB Version: 15.10
Hadoop Distro: Hortonworks
TDCH Version: 1.4

My target table is defined as follows:

CREATE EXTERNAL TABLE `default.emp_hive`(
  `emp_id` int,
  `emp_fname` string,
  `emp_lname` string,
  `emp_loc` string
  )
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\u0001'
  LINES TERMINATED BY '\n'
STORED AS ORC
LOCATION
  'hdfs://<Full_Path_Of_Hive_Table>/emp'
;

The TPT control file is given below:

DEFINE JOB Load_Hive_table_with_records_from_TD_table
DESCRIPTION 'Load a Hive table with records from a TD table'
(

  /*These SET statements are used to associate  */
  /*the generic Hadoop job variables defined in */
  /*the jobvars2.txt file with the DataConnector*/
  /*Consumer operator template by defining job  */
  /*variables that use the DCC naming scheme    */
  SET DCCHadoopHost = @HadoopHost;
  SET DCCHadoopJobType = @HadoopJobType;
  SET DCCHadoopFileFormat = @HadoopFileFormat;
  SET DCCHadoopTargetTable = @HadoopTable;
  SET DCCHadoopTargetTableSchema = @HadoopTableSchema;

  APPLY TO OPERATOR ( $DATACONNECTOR_CONSUMER )
  SELECT * FROM OPERATOR ( $EXPORT );

);

The job variable file I'm using is given below:

SourceTdpId              = 'TdpId'
,SourceUserName           = 'User'
,SourceUserPassword       = 'Password'
,SelectStmt               = 'SELECT * FROM SourceDB.SourceTbl'

/********************************************************/
/* TPT LOAD Operator attributes                         */
/********************************************************/
,DDLErrorList             = '3807'

/********************************************************/
/* TPT DataConnector Hadoop specific attributes         */
/********************************************************/
,HadoopHost               = 'default'
,HadoopJobType            = 'hive'
,HadoopFileFormat         = 'orcfile'
,HadoopTable              = 'default.emp_hive'
,HadoopTableSchema        = 'EMP_ID INT, EMP_FNAME STRING, EMP_LNAME STRING, EMP_LOC STRING'

/********************************************************/
/* APPLY STATEMENT parameters                           */
/********************************************************/
,LoadInstances            = 1
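
Before submitting, it can save a failed run to sanity-check that the job variable file actually defines all the Hadoop attributes the DataConnector consumer template expects. A minimal sketch (the attribute names are the ones from the file above; the `load_hive_tbl.jobvars` file name is an assumption):

```shell
# Write a trimmed copy of the job variable file (contents copied from the post)
# so the sketch is self-contained; in practice, point at the real file.
cat > load_hive_tbl.jobvars <<'EOF'
,HadoopHost               = 'default'
,HadoopJobType            = 'hive'
,HadoopFileFormat         = 'orcfile'
,HadoopTable              = 'default.emp_hive'
,HadoopTableSchema        = 'EMP_ID INT, EMP_FNAME STRING, EMP_LNAME STRING, EMP_LOC STRING'
EOF

# Flag any expected Hadoop attribute that is missing from the file.
for attr in HadoopHost HadoopJobType HadoopFileFormat HadoopTable HadoopTableSchema; do
    grep -q "^,$attr " load_hive_tbl.jobvars || echo "missing: $attr"
done
```

With the file above, the loop prints nothing; a missing attribute would be reported before tbuild ever runs.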

Below is my tbuild command:

tbuild -f <Path>/load_hive_tbl.tpt.ctl -v <Path>/load_hive_tbl.jobvars -j emp_hive -L <Path>
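
Since the job here ends with status 12, it also helps to capture tbuild's exit status in the submitting script rather than only reading the log afterwards. A hedged sketch (file names are placeholders for the real `<Path>` values; `TBUILD` falls back to a stand-in so the sketch runs even where TPT is not installed):

```shell
# Run the job and surface a non-zero exit status (this job came back with 12).
TBUILD=tbuild
command -v "$TBUILD" >/dev/null 2>&1 || TBUILD=false   # stand-in when TPT is absent

"$TBUILD" -f load_hive_tbl.tpt.ctl -v load_hive_tbl.jobvars -j emp_hive -L ./logs
rc=$?
if [ "$rc" -ne 0 ]; then
    echo "tbuild exited with status $rc - check the TPT log and TDCH-TPT_log_*.txt"
fi
```

The `TDCH-TPT_log_*.txt` name comes from the TPT19603 message in the job log below; checking `$?` lets a scheduler fail the step instead of silently continuing.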

Below is the job log:

Using memory mapped file for IPC

TPT_INFRA: TPT04101: Warning: Teradata PT cannot connect to Unity EcoSystem Manager.
             The job will continue without event messages being sent to Unity EcoSystem Manager. 
TPT_INFRA: TPT04197: Warning: OMD API failed to initialize
Teradata Parallel Transporter Coordinator Version 15.10.01.02
Teradata Parallel Transporter Executor Version 15.10.01.02
Teradata Parallel Transporter Executor Version 15.10.01.02
CheckPoint Resource Manager initialized.
Checking whether a valid CheckPoint exists for restart.
MAIN_STEP            SELECT_2[0001]       $EXPORT              SQL                  24.40.45.134                                                                                                                                  
MAIN_STEP            SELECT_2[0001]       Success              $EXPORT                 1    1 INITIATE-Started     08:53:57     0.0000     0.0000      65000          0                0                0     0     0 N Y
emp_hive-594,17,5,OperatorEnter,MAIN_STEP,$EXPORT,1,2017-05-16,2017-05-16 08:53:57,2,0
emp_hive-594,176,5,OperatorVersion,MAIN_STEP,$EXPORT,1,2017-05-16,15.10.01.02,2,0
emp_hive-594,177,1,OperatorType,MAIN_STEP,$EXPORT,1,2017-05-16,1,2,0
emp_hive-594,178,1,OperatorInstances,MAIN_STEP,$EXPORT,1,2017-05-16,1,2,0
Teradata Parallel Transporter Export Operator Version 15.10.01.02
$EXPORT: private log not specified
 
     ===================================================================
     =                                                                 =
     =              TERADATA PARALLEL TRANSPORTER 64-BIT               =
     =                                                                 =
     =             EXPORT OPERATOR     VERSION 15.10.01.02             =
     =                                                                 =
     =          OPERATOR SUPPORT LIBRARY VERSION 15.10.01.02           =
     =                                                                 =
     =           COPYRIGHT 2001-2016, TERADATA CORPORATION.            =
     =                      ALL RIGHTS RESERVED.                       =
     =                                                                 =
     =                       Process I.D.: 10130                       =
     =                                                                 =
     ===================================================================

**** 08:53:57 Processing starting at: Tue May 16 08:53:57 2017
 
     ===================================================================
     =                                                                 =
     =                      Module Identification                      =
     =                                                                 =
     ===================================================================

     64-bit Export Operator for Linux release 2.6.32-504.30.3.el6.x86_64 on <Edge Node>
     ExportMain : 15.10.01.01
     ExportCLI  : 15.10.00.05
     ExportUtil : 14.10.00.01
     PcomCLI    : 15.10.01.02
     PcomMBCS   : 14.10.00.02
     PcomMsgs   : 15.10.01.01
     PcomNtfy   : 15.10.00.01
     PcomPx     : 15.10.01.01
     PcomUtil   : 15.10.01.04
     PXICU      : 15.10.01.02
     TDICU      : 15.10.01.00
     USelDir    : 15.10.01.00
MAIN_STEP            INSERT_1[0001]       $DATACONNECTOR_CONSU                                                                                                                                                                    
MAIN_STEP            INSERT_1[0001]       Success              $DATACONNECTOR_CONSU    1    1 INITIATE-Started     08:53:57     0.0000     0.0000      65000          0                0                0     0     0 N Y
Teradata Parallel Transporter DataConnector Operator Version 15.10.01.02
$DATACONNECTOR_CONSUMER[1]: TPT19206 Attribute 'TraceLevel' value reset to 'Statistics Only'.
$DATACONNECTOR_CONSUMER[1]: Instance 1 directing private log report to 'dtacop-aroy001c-10129-1'.
     =                  TraceFunction: 'NO (defaulted)' (=0)                  =
     ==========================================================================
     =                                                                        =
     =                  TERADATA PARALLEL TRANSPORTER 64-BIT                  =
     =                                                                        =
     =              DATACONNECTOR OPERATOR VERSION  15.10.01.02               =
     =                                                                        =
     =           DataConnector UTILITY LIBRARY VERSION 15.10.01.04            =
     =                                                                        =
     =    COPYRIGHT 2001-2014, Teradata Corporation.  ALL RIGHTS RESERVED.    =
     =                                                                        =
     ==========================================================================
      
     Operator name: '$DATACONNECTOR_CONSUMER' instance 1 of 1 [Consumer]
      
**** 08:53:57 Processing starting at: Tue May 16 08:53:57 2017

     ==========================================================================
     =                                                                        =
     =                    Operator module static specifics                    =
     =                                                                        =
     =          Operator module name:'dtacop', version:'15.10.01.02'          =
     =                                                                        =
     = pmdcomt_HeaderVersion: 'Common 15.00.00.04' - packing 'pack (push, 1)' =
     = pmddamt_HeaderVersion: 'Common 15.00.00.01' - packing 'pack (push, 1)' =
     =                                                                        =
     ==========================================================================
      
     ==========================================================================
     =                                                                        =
     =                   > General attribute Definitions <                    =
     =                                                                        =
     =                             TraceLevel: ''                             =
     =                   EndianFlip: 'NO (defaulted)' (=0)                    =
     =                  IndicatorMode: 'NO (defaulted)' (=0)                  =
     =                  NullColumns: 'YES (defaulted)' (=1)                   =
     =                       SYSTEM_CharSetId: 'ASCII'                        =
     =                                                                        =
     ==========================================================================
      
     LITTLE ENDIAN platform
     Operator 'dtacop' main source version:'15.10.00.27'
     DirInfo global variable name: 'DirInfo'
     FileNames global variable name: 'FileNames'
     DC_PREAD_SM_TOKENS global variable name: 'DC_PREAD_SM_TOKENS'
      
     ==========================================================================
     =                                                                        =
     =                   > Operator attribute Definitions <                   =
     =                                                                        =
     ==========================================================================
      
$DATACONNECTOR_CONSUMER[1]: DataConnector Consumer operator Instances: 1
     FileList: 'NO (defaulted)' (=0)
     MultipleReaders: 'NO (defaulted)' (=0)
     RecordsPerBuffer: (use default calculation per schema)
     Initializing with CharSet = 'ASCII'.
     Alphabetic CSName=ASCII
     Established character set ASCII
     Single-byte character set in use
      
     ==========================================================================
     =                                                                        =
     =                         Module Identification                          =
     =                                                                        =
     ==========================================================================
      
     64-bit DataConnector operator for Linux release 2.6.32-504.30.3.el6.x86_64 on <Edge Node>
     TDICU................................... 15.10.01.00
     PXICU................................... 15.10.01.02
     PMPROCS................................. 15.10.01.01
     PMRWFMT................................. 15.00.00.02
     PMHADOOP................................ 15.10.01.03
     PMTRCE.................................. 13.00.00.02
     PMMM.................................... 15.10.00.03
     DCUDDI.................................. 15.10.01.02
     PMHEXDMP................................ 15.10.00.02
     PMHDFSDSK............................... 15.10.00.02
     PMUNXDSK................................ 15.10.01.01
      
     >> Enter DC_DataConFileInfo
     Job Type=2
     WARNING!  OpenMode attribute not specified, default 'Write' being used.
$DATACONNECTOR_CONSUMER[1]: TPT19203 Required attribute 'OpenMode' not found.  Defaulting to 'Write'.
     UseGeneralUDDIcase: 'NO (defaulted)' (=0)
     WriteBOM: 'NO (defaulted)' (=0)
     AcceptExcessColumns: 'NO (defaulted)' (=0)
     AcceptMissingColumns: 'NO (defaulted)' (=0)
     TruncateColumnData: 'NO (defaulted)' (=0)
     TruncateColumns: 'NO (defaulted)' (=0)
     TruncateLongCols: 'NO (defaulted)' (=0)
     WARNING!  RecordErrorFilePrefix attribute not specified, there is no default
     RecordErrorVerbosity: OFF (default) (=0)
     FileName: 'default.emp_hive'
     OpenMode: 'WRITE' (2)
     Format: 'FORMATTED' (3)
     IOBufferSize: 131072 (default)
      
     Full File Path: default.emp_hive
     Data Type              Ind  Length  Offset M
              INTEGER (  1)   1       4       0 N
              VARCHAR (  7)   1      20       4 N
              VARCHAR (  7)   1      20      24 N
              VARCHAR (  7)   1      15      44 N
     Schema is not all character data
     Schema is compatible with delimited data
     Validating parsing case: 1200000.
     SBCS (QUOTED DATA: No), Delimiter[0]: ''
     Delimiter: x''
     Escape Delimiter: x''
     Open Quote: x''
     Close Quote: x''
     Escape Quote: x''
     ==========================================================================
     =                                                                        =
     =                    > Log will include stats only <                     =
     =                                                                        =
     ==========================================================================
$DATACONNECTOR_CONSUMER[1]: ECI operator ID: '$DATACONNECTOR_CONSUMER-10129'
     WARNING!  Overwriting existing file 'default.emp_hive' (size 0).
**** 08:53:57 Starting to send rows to file 'default.emp_hive'
$DATACONNECTOR_CONSUMER[1]: Operator instance 1 processing file 'default.emp_hive'.
     CLIv2      : 15.10.01.01   
emp_hive-594,116,5,UtilityName,MAIN_STEP,$EXPORT,1,2017-05-16,TPT Export Operator,2,0
emp_hive-594,8,5,ExportVersionId,MAIN_STEP,$EXPORT,1,2017-05-16,15.10.01.02,2,0
emp_hive-594,115,1,UtilityId,MAIN_STEP,$EXPORT,1,2017-05-16,3,2,0
emp_hive-594,131,5,ExportTdpId,MAIN_STEP,$EXPORT,1,2017-05-16,24.40.45.134,2,0
emp_hive-594,9,5,ExportUserName,MAIN_STEP,$EXPORT,1,2017-05-16,ndw_gen_extracts,2,0
 
     ===================================================================
     =                                                                 =
     =                      Attribute Definitions                      =
     =                                                                 =
     ===================================================================

**** 08:53:57 Options in effect for this job:
              OperatorType:  Producer
              Instances:     1
              Character set: Not specified; will use default
              Checkpoint:    No checkpoint in effect
              Notify:        Not enabled
              Tenacity:      4 hour limit to successfully connect
              Sleep:         6 minute(s) between connect retries
              Date format:   INTEGERDATE
              Blocksize:     Maximum allowable
              OutLimit:      No limit in effect
 
     ===================================================================
     =                                                                 =
     =                     Column/Field Definition                     =
     =                                                                 =
     ===================================================================

     Column Name                    Offset Length Type      
     ============================== ====== ====== ========================
     "EMP_ID"                            0      4 INTEGER
     "EMP_FNAME"                         4     20 VARCHAR
     "EMP_LNAME"                        26     20 VARCHAR
     "EMP_LOC"                          48     15 VARCHAR
     ============================== ====== ====== ========================
     INDICATOR BYTES NEEDED: 1
     EXPECTED RECORD LENGTH: 66
 
     ===================================================================
     =                                                                 =
     =                   Control Session Connection                    =
     =                                                                 =
     ===================================================================

$EXPORT: connecting sessions
**** 08:53:57 Connecting to RDBMS:    '24.40.45.134'
**** 08:53:57 Connecting with UserId: 'ndw_gen_extracts'
 
**** 08:53:58 Number of Query Band data bytes sent to the RDBMS: 118
emp_hive-594,127,5,ExportDbase,MAIN_STEP,$EXPORT,1,2017-05-16
 
     ===================================================================
     =                                                                 =
     =                  Teradata Database Information                  =
     =                                                                 =
     ===================================================================

**** 08:53:58 Teradata Database Version:      '15.10.04.02'
**** 08:53:58 Teradata Database Release:      '15.10.04.02'
**** 08:53:58 Maximum request size supported: 1MB
**** 08:53:58 Session character set:          'ASCII'
**** 08:53:58 Total AMPs available:           144
**** 08:53:58 Data Encryption:                supported
**** 08:53:58 Enhanced Statement Status Level: 1
**** 08:53:58 Blocksize for this job:         64330 bytes
 
     ===================================================================
     =                                                                 =
     =                   Special Session Connection                    =
     =                                                                 =
     ===================================================================

$DATACONNECTOR_CONSUMER[1]: TPT19603 Failed to launch TDCH client. See log TDCH-TPT_log_10129.txt in TPT logs directory for details.
$DATACONNECTOR_CONSUMER[1]: TPT19015 TPT Exit code set to 12.
**** 08:54:01 Number of sessions adjusted due to TASM:      20
 
              Instance Assigned Connected Result                
              ======== ======== ========= ======================
                  1        20       20    Successful
emp_hive-594,180,1,TotalSessAssigned,MAIN_STEP,$EXPORT,1,2017-05-16,20,2,0
emp_hive-594,181,1,TotalSessConnected,MAIN_STEP,$EXPORT,1,2017-05-16,20,2,0
emp_hive-594,182,1,TASMSessionLimit,MAIN_STEP,$EXPORT,1,2017-05-16,20,2,0
              ======== ======== ========= ======================
                Total      20       20    Successful
emp_hive-594,86,0,ExportBegin,MAIN_STEP,$EXPORT,1,2017-05-16,,2,0
     AllowBufferMode: 'YES (defaulted)' (=1)
Job is running in Buffer Mode
MAIN_STEP            SELECT_2[0001]       Success              $EXPORT                 1    1 INITIATE-Ended       08:54:01     4.0000     0.4399      65000          0                0                0     0     0 N Y
MAIN_STEP            INSERT_1[0001]       Success              $DATACONNECTOR_CONSU    1    1 INITIATE-Ended       08:54:01     4.0000     0.0110      65000          0                0                0     0     0 N Y
MAIN_STEP            INSERT_1[0001]       Success              $DATACONNECTOR_CONSU    1    1 EXECUTE-Started      08:54:01     0.0000     0.0000      65000          0                0                0     0     0 N Y
MAIN_STEP            SELECT_2[0001]       Success              $EXPORT                 1    1 EXECUTE-Started      08:54:01     0.0000     0.0000      65000          0                0                0     0     0 N Y
CheckPoint No. 1 started.
MAIN_STEP            INSERT_1[0001]       Success              $DATACONNECTOR_CONSU    1    1 CHECKPOINT-Started   08:54:01     0.0000     0.0010      65000          0                0                0     0     0 N Y
MAIN_STEP            SELECT_2[0001]       Success              $EXPORT                 1    1 CHECKPOINT-Started   08:54:01     0.0000     0.0010      65000          0                0                0     0     0 N Y
$DATACONNECTOR_CONSUMER[1]: TPT19434 pmGetPos failed. General failure (34): 'Unknown Access Module failure'
$DATACONNECTOR_CONSUMER[1]: TPT19307 Fatal error checkpointing data.
TPT_INFRA: TPT02258: Error: Operator checkpointing error, status = Multi Phase Error
Task(INSERT_1[0001]): checkpoint completed, status = Operator Error
MAIN_STEP            INSERT_1[0001]       Operator Error       $DATACONNECTOR_CONSU    1    1 CHECKPOINT-Ended     08:54:01     0.0000     0.0010      65000          0                0                0     0     0 N Y
TPT_INFRA: TPT03720: Error: Checkpoint command failed with 23
TPT_INFRA: TPT02255: Message Buffers Sent/Received = 0, Total Rows Received = 0, Total Rows Sent = 0
MAIN_STEP            INSERT_1[0001]       Success              $DATACONNECTOR_CONSU    1    1 TERMINATE-Started    08:54:01     0.0000     0.0000      65000          0                0                0     0     0 N Y

$DATACONNECTOR_CONSUMER Fatal error closing file.
     Files written by this instance: 0
**** 08:54:01 Total processor time used = '0.00 Seconds(s)'
**** 08:54:01 Total files processed: 0
$DATACONNECTOR_CONSUMER[1]: Total files processed: 0.
MAIN_STEP            INSERT_1[0001]       Success              $DATACONNECTOR_CONSU    1    1 TERMINATE-Ended      08:54:01     0.0000     0.0010      65000          0                0                0     0     0 N Y
Task(SELECT_2[0001]): checkpoint completed, status = Success
MAIN_STEP            SELECT_2[0001]       Success              $EXPORT                 1    1 CHECKPOINT-Ended     08:54:02     1.0000     0.0010      65000          0                0                0     0     0 N Y
TPT_INFRA: TPT02255: Message Buffers Sent/Received = 0, Total Rows Received = 0, Total Rows Sent = 0
MAIN_STEP            SELECT_2[0001]       Success              $EXPORT                 1    1 TERMINATE-Started    08:54:02     0.0000     0.0000      65000          0                0                0     0     0 N Y
emp_hive-594,186,3,CPUTimeByInstance,MAIN_STEP,$EXPORT,1,2017-05-16,0.440933,2,0
 
     ===================================================================
     =                                                                 =
     =                        Logoff/Disconnect                        =
     =                                                                 =
     ===================================================================

$EXPORT: disconnecting sessions
**** 08:54:02 Logging off all sessions
emp_hive-594,138,0,ExportSessEnd,MAIN_STEP,$EXPORT,1,2017-05-16,,2,0
 
              Instance      Cpu Time     
              ========  ================ 
                   1        0.44 Seconds
 
**** 08:54:03 Total processor time used = '0.440933 Second(s)'
     .        Start : Tue May 16 08:53:57 2017
     .        End   : Tue May 16 08:54:03 2017
     .        Highest return code encountered = '0'.
$EXPORT: Total processor time used = '0.440933 Second(s)'
$EXPORT: Start : Tue May 16 08:53:57 2017
$EXPORT: End   : Tue May 16 08:54:03 2017
**** 08:54:03 This job terminated
emp_hive-594,179,5,OperatorEndTS,MAIN_STEP,$EXPORT,1,2017-05-16,2017-05-16 08:54:03,2,0
emp_hive-594,18,1,OperatorExit,MAIN_STEP,$EXPORT,1,2017-05-16,0,2,0
MAIN_STEP            SELECT_2[0001]       Success              $EXPORT                 1    1 TERMINATE-Ended      08:54:03     1.0000     0.0050      65000          0                0                0     0     0 N Y
Job step MAIN_STEP terminated (status 12)
Job emp_hive terminated (status 12)
Job start: Tue May 16 08:53:53 2017
Job end:   Tue May 16 08:54:03 2017
Total available memory:          20000632
Largest allocable area:          20000632
Memory use high water mark:         51016
Free map size:                       1024
Free map use high water mark:          14
Free list use high water mark:          0

Below is the TDCH log that was generated:

WARNING: Use "yarn jar" to launch YARN applications.
17/05/16 08:53:59 INFO tool.ConnectorImportTool: ConnectorImportTool starts at 1494939239719
17/05/16 08:54:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/05/16 08:54:01 INFO tool.ConnectorImportTool: java.io.FileNotFoundException: File /usr/hdp/2.4.3.2-1/hive/lib/ojdbc6.jar does not exist.
        at org.apache.hadoop.util.GenericOptionsParser.validateFiles(GenericOptionsParser.java:405)
        at org.apache.hadoop.util.GenericOptionsParser.processGeneralOptions(GenericOptionsParser.java:299)
        at org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:487)
        at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:170)
        at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:153)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
        at com.teradata.connector.common.tool.ConnectorImportTool.main(ConnectorImportTool.java:745)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

17/05/16 08:54:01 INFO tool.ConnectorImportTool: job completed with exit code 10000
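
The key line in that log is the `FileNotFoundException`. When the trace is long, the missing path can be pulled out mechanically; a small sketch (the `tdch.log` file name is an assumption, with the relevant line copied from above so the sketch is self-contained):

```shell
# Self-contained copy of the failing log line from above.
cat > tdch.log <<'EOF'
17/05/16 08:54:01 INFO tool.ConnectorImportTool: java.io.FileNotFoundException: File /usr/hdp/2.4.3.2-1/hive/lib/ojdbc6.jar does not exist.
EOF

# Extract the path between "File " and " does not exist".
grep -o 'File [^ ]* does not exist' tdch.log | sed 's/^File //; s/ does not exist$//'
# prints /usr/hdp/2.4.3.2-1/hive/lib/ojdbc6.jar
```

That path is what the GenericOptionsParser validated and failed on, which points at a jar missing from the expected Hive lib directory rather than at the TPT job itself.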

Can someone please help us here? Is something wrong in the environment or configuration?

 

Thanks & Regards,

Arpan.

Teradata Employee

Re: Loading Hive Table using TDCH TPT

The response I received from the TDCH folks was:

 

It seems that there are missing Hive jars in the desired path.

It seems to be an issue related to Hadoop, I believe.

-- SteveF
Enthusiast

Re: Loading Hive Table using TDCH TPT

Hi Steve,

Thanks a lot for the response. After we placed "ojdbc6.jar" in the "$HIVE_HOME/lib" directory, the previous issue was resolved. But now we are facing the issue below:

Error: com.teradata.connector.common.exception.ConnectorException: default

Below is the TDCH-TPT log:

WARNING: Use "yarn jar" to launch YARN applications.
17/05/18 06:22:26 INFO tool.ConnectorImportTool: ConnectorImportTool starts at 1495102946984
17/05/18 06:22:28 INFO common.ConnectorPlugin: load plugins in file:/tmp/hadoop-unjar6609653272700125339/teradata.connector.plugins.xml
17/05/18 06:22:28 WARN conf.HiveConf: HiveConf of name hive.semantic.analyzer.factory.impl does not exist
17/05/18 06:22:28 INFO hive.metastore: Trying to connect to metastore with URI thrift://ebdp-ch2-d018s.sys.comcast.net:9083
17/05/18 06:22:28 INFO hive.metastore: Connected to metastore.
17/05/18 06:22:28 INFO processor.IDataStreamInputProcessor: input preprocessor com.teradata.connector.idatastream.processor.IDataStreamInputProcessor starts at:  1495102948588
17/05/18 06:22:28 INFO processor.IDataStreamInputProcessor: the teradata connector for hadoop version is: 1.4.1
17/05/18 06:22:28 INFO processor.IDataStreamInputProcessor: the number of mappers are 2
17/05/18 06:22:28 INFO processor.IDataStreamInputProcessor: input preprocessor com.teradata.connector.idatastream.processor.IDataStreamInputProcessor ends at:  1495102948678
17/05/18 06:22:28 INFO processor.IDataStreamInputProcessor: the total elapsed time of input preprocessor com.teradata.connector.idatastream.processor.IDataStreamInputProcessor is: 0s
17/05/18 06:22:28 WARN conf.HiveConf: HiveConf of name hive.semantic.analyzer.factory.impl does not exist
17/05/18 06:22:29 INFO hive.metastore: Trying to connect to metastore with URI thrift://ebdp-ch2-d018s.sys.comcast.net:9083
17/05/18 06:22:29 INFO hive.metastore: Connected to metastore.
17/05/18 06:22:29 INFO impl.TimelineClientImpl: Timeline service address: http://ebdp-ch2-d019s.sys.comcast.net:8188/ws/v1/timeline/
17/05/18 06:22:43 INFO impl.TimelineClientImpl: Timeline service address: http://ebdp-ch2-d019s.sys.comcast.net:8188/ws/v1/timeline/
17/05/18 06:22:43 WARN mapred.ResourceMgrDelegate: getBlacklistedTrackers - Not implemented yet
17/05/18 06:22:44 INFO mapreduce.JobSubmitter: number of splits:2
17/05/18 06:22:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1492702491981_398553
17/05/18 06:22:45 INFO impl.YarnClientImpl: Submitted application application_1492702491981_398553
17/05/18 06:22:45 INFO mapreduce.Job: The url to track the job: http://ebdp-ch2-d019s.sys.comcast.net:8088/proxy/application_1492702491981_398553/
17/05/18 06:22:45 INFO mapreduce.Job: Running job: job_1492702491981_398553
17/05/18 06:22:55 INFO mapreduce.Job: Job job_1492702491981_398553 running in uber mode : false
17/05/18 06:22:55 INFO mapreduce.Job:  map 0% reduce 0%
17/05/18 06:23:03 INFO mapreduce.Job: Task Id : attempt_1492702491981_398553_m_000001_0, Status : FAILED
Error: com.teradata.connector.common.exception.ConnectorException: default
	at com.teradata.connector.idatastream.IDataStreamConnection.connect(IDataStreamConnection.java:65)
	at com.teradata.connector.idatastream.IDataStreamInputFormat$IDataStreamRecordReader.initialize(IDataStreamInputFormat.java:183)
	at com.teradata.connector.common.ConnectorCombineInputFormat$ConnectorCombinePlugedinInputRecordReader.initialize(ConnectorCombineInputFormat.java:505)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:548)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:786)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)

17/05/18 06:23:03 INFO mapreduce.Job: Task Id : attempt_1492702491981_398553_m_000000_0, Status : FAILED
Error: com.teradata.connector.common.exception.ConnectorException: default
	at com.teradata.connector.idatastream.IDataStreamConnection.connect(IDataStreamConnection.java:65)
	at com.teradata.connector.idatastream.IDataStreamInputFormat$IDataStreamRecordReader.initialize(IDataStreamInputFormat.java:183)
	at com.teradata.connector.common.ConnectorCombineInputFormat$ConnectorCombinePlugedinInputRecordReader.initialize(ConnectorCombineInputFormat.java:505)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:548)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:786)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)

17/05/18 06:23:08 INFO mapreduce.Job: Task Id : attempt_1492702491981_398553_m_000001_1, Status : FAILED
17/05/18 06:23:14 INFO mapreduce.Job: Task Id : attempt_1492702491981_398553_m_000000_1, Status : FAILED
17/05/18 06:23:14 INFO mapreduce.Job: Task Id : attempt_1492702491981_398553_m_000001_2, Status : FAILED
17/05/18 06:23:24 INFO mapreduce.Job: Task Id : attempt_1492702491981_398553_m_000000_2, Status : FAILED
[each of these retries failed with the identical "ConnectorException: default" stack trace shown above]

17/05/18 06:23:26 INFO mapreduce.Job:  map 100% reduce 0%
17/05/18 06:23:26 INFO mapreduce.Job: Job job_1492702491981_398553 failed with state FAILED due to: Task failed task_1492702491981_398553_m_000001
Job failed as tasks failed. failedMaps:1 failedReduces:0

17/05/18 06:23:27 INFO mapreduce.Job: Counters: 10
	Job Counters 
		Failed map tasks=7
		Killed map tasks=1
		Launched map tasks=7
		Other local map tasks=6
		Data-local map tasks=1
		Total time spent by all maps in occupied slots (ms)=47425
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=47425
		Total vcore-seconds taken by all map tasks=47425
		Total megabyte-seconds taken by all map tasks=121408000
17/05/18 06:23:27 WARN tool.ConnectorJobRunner: com.teradata.connector.common.exception.ConnectorException: The output post processor returns 1
17/05/18 06:23:27 INFO tool.ConnectorImportTool: ConnectorImportTool ends at 1495103007013
17/05/18 06:23:27 INFO tool.ConnectorImportTool: ConnectorImportTool time is 60s
17/05/18 06:23:27 INFO tool.ConnectorImportTool: job completed with exit code 1

Can you please help us identify where we are making a mistake and what we should do to resolve the issue?

Thanks & Regards,

Arpan.


Re: Loading Hive Table using TDCH TPT

Hi Steve,

Thanks a lot for the response. After copying the "ojdbc6.jar" file to the $HIVE_HOME/lib directory, that issue was resolved. However, we are now facing the issue below:

Error: com.teradata.connector.common.exception.ConnectorException: default

Below is the TDCH Log:

WARNING: Use "yarn jar" to launch YARN applications.
17/05/18 06:22:26 INFO tool.ConnectorImportTool: ConnectorImportTool starts at 1495102946984
17/05/18 06:22:28 INFO common.ConnectorPlugin: load plugins in file:/tmp/hadoop-unjar6609653272700125339/teradata.connector.plugins.xml
17/05/18 06:22:28 WARN conf.HiveConf: HiveConf of name hive.semantic.analyzer.factory.impl does not exist
17/05/18 06:22:28 INFO hive.metastore: Trying to connect to metastore with URI thrift://ebdp-ch2-d018s.sys.comcast.net:9083
17/05/18 06:22:28 INFO hive.metastore: Connected to metastore.
17/05/18 06:22:28 INFO processor.IDataStreamInputProcessor: input preprocessor com.teradata.connector.idatastream.processor.IDataStreamInputProcessor starts at:  1495102948588
17/05/18 06:22:28 INFO processor.IDataStreamInputProcessor: the teradata connector for hadoop version is: 1.4.1
17/05/18 06:22:28 INFO processor.IDataStreamInputProcessor: the number of mappers are 2
17/05/18 06:22:28 INFO processor.IDataStreamInputProcessor: input preprocessor com.teradata.connector.idatastream.processor.IDataStreamInputProcessor ends at:  1495102948678
17/05/18 06:22:28 INFO processor.IDataStreamInputProcessor: the total elapsed time of input preprocessor com.teradata.connector.idatastream.processor.IDataStreamInputProcessor is: 0s
17/05/18 06:22:28 WARN conf.HiveConf: HiveConf of name hive.semantic.analyzer.factory.impl does not exist
17/05/18 06:22:29 INFO hive.metastore: Trying to connect to metastore with URI thrift://ebdp-ch2-d018s.sys.comcast.net:9083
17/05/18 06:22:29 INFO hive.metastore: Connected to metastore.
17/05/18 06:22:29 INFO impl.TimelineClientImpl: Timeline service address: http://ebdp-ch2-d019s.sys.comcast.net:8188/ws/v1/timeline/
17/05/18 06:22:43 INFO impl.TimelineClientImpl: Timeline service address: http://ebdp-ch2-d019s.sys.comcast.net:8188/ws/v1/timeline/
17/05/18 06:22:43 WARN mapred.ResourceMgrDelegate: getBlacklistedTrackers - Not implemented yet
17/05/18 06:22:44 INFO mapreduce.JobSubmitter: number of splits:2
17/05/18 06:22:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1492702491981_398553
17/05/18 06:22:45 INFO impl.YarnClientImpl: Submitted application application_1492702491981_398553
17/05/18 06:22:45 INFO mapreduce.Job: The url to track the job: http://ebdp-ch2-d019s.sys.comcast.net:8088/proxy/application_1492702491981_398553/
17/05/18 06:22:45 INFO mapreduce.Job: Running job: job_1492702491981_398553
17/05/18 06:22:55 INFO mapreduce.Job: Job job_1492702491981_398553 running in uber mode : false
17/05/18 06:22:55 INFO mapreduce.Job:  map 0% reduce 0%
17/05/18 06:23:03 INFO mapreduce.Job: Task Id : attempt_1492702491981_398553_m_000001_0, Status : FAILED
Error: com.teradata.connector.common.exception.ConnectorException: default
        at com.teradata.connector.idatastream.IDataStreamConnection.connect(IDataStreamConnection.java:65)
        at com.teradata.connector.idatastream.IDataStreamInputFormat$IDataStreamRecordReader.initialize(IDataStreamInputFormat.java:183)
        at com.teradata.connector.common.ConnectorCombineInputFormat$ConnectorCombinePlugedinInputRecordReader.initialize(ConnectorCombineInputFormat.java:505)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:548)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:786)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)

17/05/18 06:23:03 INFO mapreduce.Job: Task Id : attempt_1492702491981_398553_m_000000_0, Status : FAILED
Error: com.teradata.connector.common.exception.ConnectorException: default
        [identical stack trace as in the previous attempt]

Can you please help us identify what we are doing wrong and how we can resolve the issue?
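For completeness, the jar-copy step that resolved the earlier classpath problem amounts to placing the driver jar where Hive's launcher scripts pick it up. The sketch below uses throwaway stand-in paths created with mktemp so it can run anywhere; in a real environment you would copy the actual jar into your real $HIVE_HOME/lib (paths here are assumptions, not the poster's actual locations):

```shell
#!/bin/sh
set -eu

# Stand-in locations (assumption: your real HIVE_HOME and driver jar differ).
HIVE_HOME=$(mktemp -d)
mkdir -p "$HIVE_HOME/lib"
driver_jar=$(mktemp)            # stand-in for the downloaded ojdbc6.jar

# Copy the driver into Hive's lib directory, which the hive launcher
# scripts add to the classpath at startup.
cp "$driver_jar" "$HIVE_HOME/lib/ojdbc6.jar"

# Confirm it is in place before re-running the TDCH/TPT job.
ls -l "$HIVE_HOME/lib/ojdbc6.jar"
```

Note that Hive only sees jars added this way after the client/metastore process is restarted.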

 

Thanks & Regards,

Arpan.