TPT Script - Import from TD into HDFS.

Enthusiast

I've been searching for this in the documentation all day; maybe someone knows this already.

I have read that since version 15.0 it's possible to write from a TPT script directly into HDFS. Right now my script creates the file locally, then uploads it, then deletes it. Any way to do this more efficiently? Maybe a code sample I can reverse engineer?
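
To be concrete, today my job looks something like this (a sketch; the script name and paths are made up):

tbuild -f export_to_file.tpt          # TPT job writes to a local flat file
hdfs dfs -put /tmp/extract.txt /user/myuser/tpt/extract.txt
rm /tmp/extract.txt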

6 REPLIES
Teradata Employee

Re: TPT Script - Import from TD into HDFS.

As with any other TD-to-flat-file TPT job, you can use the Export-operator-to-DataConnector-operator scenario.

This will export data from Teradata and write to HDFS.

Just provide the information for the proper DC operator attributes to talk to HDFS.

It is all documented.

-- SteveF
Enthusiast

Re: TPT Script - Import from TD into HDFS.

May I ask for an example of this code? I've been looking and am not finding it anywhere in the documentation. Please and thank you!

Teradata Employee

Re: TPT Script - Import from TD into HDFS.

TPT provides samples in a "samples" directory where TPT is installed.

Look in the directory called "userguide" inside "samples". 

PTS00029 shows an example of reading from HDFS and loading into Teradata.

Going the other way around is pretty simple and intuitive.

The documentation does provide the information for the needed attributes.

Reading from HDFS and writing to HDFS is exactly the same as reading and writing flat files on a local filesystem, except that with HDFS you also provide the HadoopHost attribute with the Hadoop host name or IP address.

It is that simple.
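
Something along these lines for the consumer side (a sketch; the host, path, and delimiter are placeholders for your system):

DEFINE OPERATOR FILE_WRITER
TYPE DATACONNECTOR CONSUMER
SCHEMA *
ATTRIBUTES
(
    VARCHAR HadoopHost    = 'myhadoophost',  /* Hadoop name node host name or IP */
    VARCHAR FileName      = 'hdfs://myhadoophost/user/myuser/extract.txt',
    VARCHAR Format        = 'Delimited',
    VARCHAR TextDelimiter = '|',
    VARCHAR OpenMode      = 'Write'
);

Everything else in the script stays the same as for a local flat-file job.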

-- SteveF
Enthusiast

Re: TPT Script - Import from TD into HDFS.

Thanks, I got it. Just gotta put HadoopHost = 'default' in the target attributes and use hdfs://server in the FileName!
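
For anyone who finds this later, here is roughly what my working script looks like (a sketch; the TdpId, credentials, table, and paths are made-up placeholders):

DEFINE JOB EXPORT_TO_HDFS
DESCRIPTION 'Export rows from Teradata and write them to HDFS'
(
    DEFINE SCHEMA SOURCE_SCHEMA
    (
        Col1 VARCHAR(10),   /* Delimited format wants VARCHAR columns */
        Col2 VARCHAR(50)
    );

    DEFINE OPERATOR EXPORT_OPERATOR
    TYPE EXPORT
    SCHEMA SOURCE_SCHEMA
    ATTRIBUTES
    (
        VARCHAR TdpId        = 'mytdpid',
        VARCHAR UserName     = 'myuser',
        VARCHAR UserPassword = 'mypassword',
        VARCHAR SelectStmt   = 'SELECT Col1, Col2 FROM mydb.mytable;'
    );

    DEFINE OPERATOR FILE_WRITER
    TYPE DATACONNECTOR CONSUMER
    SCHEMA *
    ATTRIBUTES
    (
        VARCHAR HadoopHost    = 'default',
        VARCHAR FileName      = 'hdfs://myhadoophost/user/myuser/extract.txt',
        VARCHAR Format        = 'Delimited',
        VARCHAR TextDelimiter = '|',
        VARCHAR OpenMode      = 'Write'
    );

    APPLY TO OPERATOR (FILE_WRITER)
    SELECT * FROM OPERATOR (EXPORT_OPERATOR);
);

Run it with tbuild -f as usual; the DC operator writes straight into HDFS, so no more local temp file.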

Enthusiast

Re: TPT Script - Import from TD into HDFS.

I keep hitting a "file not found" error with both absolute and relative paths for the file. I have also tried setting HadoopHost to both 'default' and the host IP address.

I am not running on a sandbox; I am using my application server (Linux).

Is there anything I'm missing here? Could you please suggest?

Enthusiast

Re: TPT Script - Import from TD into HDFS.

Hi,

Did you get a resolution for this? I'm also getting the error below:

java.lang.IllegalArgumentException: Wrong FS: hdfs://<Cluster_Name>/user/<User_Name>/tpt/ExtractFramework, expected: file:///

Below is the console log:

Teradata Parallel Transporter Version 15.10.01.02 64-Bit
Job log: /opt/teradata/client/15.10/tbuild/logs/XXXXXX_06_009-3.out
Job id is XXXXXX_06_009-316_06_009-3, running on <HostName>
Teradata Parallel Transporter DataConnector Operator Version 15.10.01.02
o_FileWritter[1]: Instance 1 directing private log report to 'DataConnector-1'.
Teradata Parallel Transporter Export Operator Version 15.10.01.02
o_ExportOper: private log specified: Export
o_FileWritter[1]: DataConnector Consumer operator Instances: 1
o_ExportOper: connecting sessions
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/jars/avro-tools-1.7.6-cdh5.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/jars/pig-0.12.0-cdh5.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/jars/slf4j-simple-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
17/05/10 20:44:15 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
o_FileWritter[1]: ECI operator ID: 'o_FileWritter-26608'
hdfsExists: invokeMethod((Lorg/apache/hadoop/fs/Path;)Z) error:
java.lang.IllegalArgumentException: Wrong FS: hdfs://<Cluster_Name>/user/<User_Name>/tpt/ExtractFramework/<XXXXXXX>.txt, expected: file:///
	at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:662)
	at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:82)
	at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:593)
	at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:811)
	at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:588)
	at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:425)
	at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1417)
hdfsOpenFile(hdfs://<Cluster_Name>/user/<User_Name>/tpt/ExtractFramework/<XXXXXXX>.txt): FileSystem#create((Lorg/apache/hadoop/fs/Path;ZISJ)Lorg/apache/hadoop/fs/FSDataOutputStream;) error:
java.lang.IllegalArgumentException: Wrong FS: hdfs://<Cluster_Name>/user/<User_Name>/tpt/ExtractFramework, expected: file:///
	at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:662)
	at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:82)
	at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:505)
	at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:491)
	at org.apache.hadoop.fs.ChecksumFileSystem.mkdirs(ChecksumFileSystem.java:687)
	at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:446)
	at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:433)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:925)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:906)
o_FileWritter[1]: TPT19434 pmOpen failed. General failure (34): 'pmHdfsDskOpen: File: 'hdfs://<Cluster_Name>/user/<User_Name>/tpt/ExtractFramework/<XXXXXXX>.txt' (Unknown error 255)'
o_FileWritter[1]: TPT19304 Fatal error opening file.
o_FileWritter[1]: TPT19015 TPT Exit code set to 12.
o_ExportOper: disconnecting sessions
o_FileWritter[1]: Total files processed: 0.
o_ExportOper: Total processor time used = '0.781881 Second(s)'
o_ExportOper: Start : Wed May 10 20:44:14 2017
o_ExportOper: End   : Wed May 10 20:44:31 2017
Job step MAIN_STEP terminated (status 8)
Job MODEL_EBI_N16_06_009 terminated (status 8)
Job start: Wed May 10 20:44:09 2017
Job end:   Wed May 10 20:44:31 2017

Any advice from anyone?

Thanks & Regards,

Arpan.