Export data from Teradata DB on AWS to Hadoop in a remote AWS cluster

Teradata Database on AWS
Enthusiast

We are currently working on a requirement to export data from a Teradata DB on AWS (standalone server) to the Hadoop file system of a remote AWS cluster. We have decided to use FastExport/TPT to export the data.

  1. As I understand it, the FastExport/TPT utility needs to be installed on the name node of the remote AWS cluster in order to dump the data from Teradata on the standalone server to Hadoop on the AWS cluster. Is that correct?
  2. Do we need to have the FastExport/TPT client installed on the standalone TD server?
  3. I could not find the FastExport/TPT clients for any Unix/Linux systems on the Teradata downloads page. Do we need to buy them from Teradata separately?

 

Please let me know if you need any further information.

Regards,

Indranil Roy

2 REPLIES
Teradata Employee

Re: Export data from Teradata DB on AWS to Hadoop in a remote AWS cluster

Question

 

1. As I understand it, the FastExport/TPT utility needs to be installed on the name node of the remote AWS cluster in order to dump the data from Teradata on the standalone server to Hadoop on the AWS cluster. Is that correct?

 [Answer] FastExport does not come pre-installed on the AWS Teradata images. It can be installed on just about any platform for which we offer FastExport: a Windows machine, a TPA node, a non-TPA node, or another Linux environment. You just need to make sure that the Teradata server, the Hadoop server, and any other servers involved can all reach each other, and that the ports and IP addresses allow them to connect.
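As a rough illustration of the connectivity point above, here is a minimal sketch of a TCP reachability check that could be run from the machine where FastExport/TPT is installed. The function names are made up for this example, and the ports are assumptions: 1025 is the usual Teradata Database port and 8020 is a common HDFS NameNode RPC port, but your cluster may be configured differently. It only confirms that a TCP connection can be opened; it does not verify logons or permissions.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_all(targets, timeout=3.0):
    """Map each (host, port) pair to True/False depending on reachability."""
    return {(host, port): can_connect(host, port, timeout) for host, port in targets}
```

For example, calling check_all([("teradata-host", 1025), ("namenode-host", 8020)]) with your own host names (the ones shown are placeholders) returns a dictionary showing which endpoints are reachable. A False entry usually points at a security group, firewall, or routing rule that needs to be opened.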

 

2. Do we need to have FastExport/TPT client installed in the standalone TD server?

[Answer] No.

 

3. I could not find the FastExport/TPT clients for any Unix/Linux systems on the Teradata downloads page. Do we need to buy them from Teradata separately?

[Answer] FastExport does not come pre-installed on the image, but the Linux and Windows client software packages should be available on your image under:

 /var/opt/teradata/TTU_pkgs

You should see the following packages (your versions will probably differ from mine):

 

SMP001-01:/var/opt/teradata/TTU_pkgs # ls -l
total 734492
-rwx------ 1 ec2-user ec2-user 221043137 Dec 21 16:57 DatabaseManagement__windows_i386-x8664.15.10.10.00.zip
-rwx------ 1 ec2-user ec2-user 102429801 Dec 21 17:30 TeradataToolsAndUtilitiesBase__Linux_i386-x8664.15.10.10.00.tar.gz
-rwx------ 1 ec2-user ec2-user  37156769 Dec 21 17:29 TeradataToolsAndUtilitiesBase__MACOSX_x8664.15.10.10.00.tar.gz
-rwx------ 1 ec2-user ec2-user 390724487 Dec 21 17:31 TeradataToolsAndUtilitiesBase__windows_i386-x8664.15.10.10.00.zip

 

The Linux version is for other, non-Teradata systems.
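If you want to script picking the right package for the target machine, here is a small sketch that filters file names by platform. It assumes the naming pattern seen in the listing above (product, a double underscore, then the platform followed by an underscore); that pattern is inferred from this one listing, not from any documented convention, and the function names are hypothetical.

```python
def platform_of(filename):
    """Return the platform token of a TTU package file name, or None if it does not fit."""
    # Assumed layout: <product>__<platform>_<arch>.<version>.<ext>
    product, sep, rest = filename.partition("__")
    if not sep or "_" not in rest:
        return None
    return rest.split("_", 1)[0]


def packages_for(filenames, platform):
    """Return the file names whose platform token matches, ignoring case."""
    wanted = platform.lower()
    return [f for f in filenames if (platform_of(f) or "").lower() == wanted]
```

Passing the names from /var/opt/teradata/TTU_pkgs (for instance from os.listdir) together with "Linux" should leave only the Linux tarball, which is the one to copy to the Hadoop-side machine.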

Teradata Employee

Re: Export data from Teradata DB on AWS to Hadoop in a remote AWS cluster

Another option is to purchase QueryGrid so you can connect Teradata with Hadoop.

It has to be purchased separately.

 

You can request information here:   http://www.teradata.com/products-and-services/aws/products/

 

Click "Contact us" on that page or on the linked “Teradata QueryGrid” page.

 

Teradata QueryGrid

Teradata QueryGrid lets your business work with a seamless data fabric across all of your data and analytical engines for no-hassle analytics. You can get the most value out of all your data by taking advantage of specialized processing engines operating as a cohesive analytic environment. Teradata QueryGrid enables you to minimize data movement and process data where it resides and transparently automate analytic processing and data movement between systems.

NEW! Users may now query data in Amazon S3 by using a Teradata QueryGrid connector. The Teradata-to-Presto connector converts data from file storage in S3 to a Teradata format in order to land the data in Teradata Database tables. This approach enables users to process the S3 data in-memory, thereby potentially saving money by avoiding AWS data egress charges for moving data out of S3. The connector may also be used to move a subset of data to Teradata rather than batch movement of full tables.

Teradata QueryGrid is an optional paid AMI available for use with Teradata Database Enterprise.