TDCH CLI 1.3 problems with export from hdfs to teradata13.10

Hi,

I am trying to export data from HDFS to Teradata using the Teradata Connector for Hadoop (TDCH) command-line interface, version 1.3. I constantly hit the error below when running in batch.insert mode.

--- Error: com.teradata.hadoop.exception.TeradataHadoopSQLException: java.sql.BatchUpdateException: [Teradata JDBC Driver] [TeraJDBC 14.00.00.39] [Error 1338] [SQLState HY000] A failure occurred while executing a PreparedStatement batch request. Details of the failure can be found in the exception chain that is accessible with getNextException.

Teradata version is 13.10

Hadoop version is HDP2.1

TDCH connector - 1.3

The export command I used is below. There are 1.3 billion records in total in the HDFS file.

hadoop jar $TDCH_JAR com.teradata.hadoop.tool.TeradataExportTool -url jdbc:teradata://1.1.1.1/DATABASE=TESTDB -username user1 -password pwd123 -jobtype hdfs -sourcepaths /apps/hive/warehouse/tab1 -nummappers 100 -separator '|' -targettable td_tab1

echo $TDCH_JAR

/usr/lib/tdch/teradata-connector-1.3.jar

Error: com.teradata.connector.common.exception.ConnectorException: java.sql.BatchUpdateException: [Teradata JDBC Driver] [TeraJDBC 14.00.00.39] [Error 1338] [SQLState HY000] A failure occurred while executing a PreparedStatement batch request. Details of the failure can be found in the exception chain that is accessible with getNextException.
at com.teradata.jdbc.jdbc_4.util.ErrorFactory.makeBatchUpdateException(ErrorFactory.java:147)
at com.teradata.jdbc.jdbc_4.util.ErrorFactory.makeBatchUpdateException(ErrorFactory.java:136)
at com.teradata.jdbc.jdbc_4.TDPreparedStatement.executeBatchDMLArray(TDPreparedStatement.java:253)
at com.teradata.jdbc.jdbc_4.TDPreparedStatement.executeBatch(TDPreparedStatement.java:2352)
at com.teradata.connector.teradata.TeradataBatchInsertOutputFormat$TeradataRecordWriter.write(TeradataBatchInsertOutputFormat.java:143)
at com.teradata.connector.teradata.TeradataBatchInsertOutputFormat$TeradataRecordWriter.write(TeradataBatchInsertOutputFormat.java:110)
at com.teradata.connector.common.ConnectorOutputFormat$ConnectorFileRecordWriter.write(ConnectorOutputFormat.java:107)
at com.teradata.connector.common.ConnectorOutputFormat$ConnectorFileRecordWriter.write(ConnectorOutputFormat.java:65)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:635)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at com.teradata.connector.common.ConnectorMMapper.map(ConnectorMMapper.java:129)
at com.teradata.connector.common.ConnectorMMapper.run(ConnectorMMapper.java:117)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: com.teradata.jdbc.jdbc_4.util.JDBCException: [Teradata JDBC Driver] [TeraJDBC 14.00.00.39] [Error 1339] [SQLState HY000] A failure occurred while executing a PreparedStatement batch request. The parameter set was not executed and should be resubmitted individually using the PreparedStatement executeUpdate method.
at com.teradata.jdbc.jdbc_4.util.ErrorFactory.makeDriverJDBCException(ErrorFactory.java:93)
at com.teradata.jdbc.jdbc_4.util.ErrorFactory.makeDriverJDBCException(ErrorFactory.java:63)
at com.teradata.jdbc.jdbc_4.statemachine.PreparedBatchStatementController.handleRunException(PreparedBatchStatementController.java:95)
at com.teradata.jdbc.jdbc_4.statemachine.StatementController.runBody(StatementController.java:129)
at com.teradata.jdbc.jdbc_4.statemachine.PreparedBatchStatementController.run(PreparedBatchStatementController.java:57)
at com.teradata.jdbc.jdbc_4.TDStatement.executeStatement(TDStatement.java:381)
at com.teradata.jdbc.jdbc_4.TDPreparedStatement.executeBatchDMLArray(TDPreparedStatement.java:233)
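
Both messages above point at the exception chain reachable through getNextException, which is where the driver normally puts the row-level cause. For anyone reproducing the insert in a small standalone JDBC program (outside TDCH), a minimal sketch of walking that chain might look like the following. The class name, the helper name, and the third exception with code 9999 are hypothetical and exist only so the example runs without a database; in real code `head` would be the BatchUpdateException caught around executeBatch().

```java
import java.sql.BatchUpdateException;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class BatchErrorChain {

    // Collects one line per exception in the JDBC chain, starting at the head.
    static List<String> describeChain(SQLException head) {
        List<String> lines = new ArrayList<>();
        for (SQLException e = head; e != null; e = e.getNextException()) {
            lines.add("code=" + e.getErrorCode()
                    + " state=" + e.getSQLState()
                    + " msg=" + e.getMessage());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Synthetic stand-ins for the exceptions in the trace above (assumption:
        // a real chain would come from a failed PreparedStatement.executeBatch()).
        BatchUpdateException head =
                new BatchUpdateException("batch request failed", "HY000", 1338, new int[0]);
        // setNextException appends to the end of the chain.
        head.setNextException(new SQLException("parameter set was not executed", "HY000", 1339));
        head.setNextException(new SQLException("hypothetical row-level cause", "HY000", 9999));

        for (String line : describeChain(head)) {
            System.out.println(line);
        }
    }
}
```

The last entries in the chain are usually the useful ones, since errors 1338 and 1339 only report that the batch failed, not why.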

Can someone please advise on how to get this export working? I have tried it numerous times, and each time a different number of records is inserted before the entire job errors out.

I also get exactly the same error when I use the Hortonworks Connector for Teradata on the HDP 2.0 platform.

Thanks,

Anand

1 REPLY

Re: TDCH CLI 1.3 problems with export from hdfs to teradata13.10

An update on the post above: I have managed to load the data using the internal.fastload method. Using \u0001 as the field separator on HDFS / Hive made a massive difference in performance compared to pipe (|); I am not sure why.

I am still looking for an answer to the issue posted above, namely why batch.insert does not work.

In the meantime, the command I used to load the data with the fastload method is below.

hadoop jar $TDCH_JAR com.teradata.connector.common.tool.ConnectorExportTool -url jdbc:teradata://1.1.1.1/DATABASE=test_db -username user123 -password pwd123 -jobtype hdfs -fileformat textfile -method internal.fastload -separator "\u0001" -sourcepaths /apps/hive/warehouse/tab1 -targettable td_tab1 -nummappers 15