Scooping Data from Teradata to Hadoop

Hi,

 

My requirement is to move a set of tables (averaging 2 TB per table) from Teradata to Hadoop.

Sqoop is taking a lot of resources and time.

 

An idea occurred to me (it might not be great).

Can we extract the data to files in parts using TPT or FastExport (fexp), and then load those files into the HDFS cluster using a proper delimiter?

If yes, can I just go AMP-wise?

That is to say, extract one AMP at a time using the query below.


SELECT * FROM databasename.tablename WHERE HASHAMP(HASHBUCKET(HASHROW(indexcolumns))) = ampnumber

(Mine is a 250-AMP architecture.)
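
For illustration, the per-AMP query above could be generated once per AMP so that each extract job (TPT or FastExport) gets its own slice. This is only a minimal sketch: the function name `amp_queries` is hypothetical, the table and column names are the placeholders from the query, and it assumes AMP numbers run from 0 to 249 on a 250-AMP system.

```python
def amp_queries(table, index_columns, amp_count):
    """Yield one SELECT per AMP, assuming AMPs are numbered 0 .. amp_count - 1."""
    if amp_count < 1:
        raise ValueError("amp_count must be at least 1")
    cols = ", ".join(index_columns)
    for amp in range(amp_count):
        # Each query returns only the rows whose primary-index hash maps to this AMP.
        yield (
            f"SELECT * FROM {table} "
            f"WHERE HASHAMP(HASHBUCKET(HASHROW({cols}))) = {amp};"
        )


# Placeholder names taken from the query in the post; 250 AMPs as stated there.
queries = list(amp_queries("databasename.tablename", ["indexcolumns"], 250))
print(len(queries))
print(queries[0])
```

Each generated statement could then be used as the SELECT of a separate export job, with each job writing its own delimited file.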

 

Or, if there is a faster way to move data from Teradata to Hadoop, please share it.

3 REPLIES
Enthusiast

Re: Scooping Data from Teradata to Hadoop

Could you share whether the resource bottleneck is on Teradata?

If it is on Hadoop, who cares :-)

 

Can you please provide additional information, such as:

  • Is this a one time activity?
  • Are Teradata and Hadoop on the same network?
  • What kind of Teradata extraction tool are you using? (any throttles on utilities)
  • What bottlenecks are you seeing?
  • Is it during the extraction phase or load into HDFS?
  • Are you loading into any Hive Schema?
  • Any Hive repository issues?
  • Please share your DBS/HDP versions

 

 

Re: Scooping Data from Teradata to Hadoop

Thanks.

Please find the details below.

Is this a one time activity? -- Yes

Are Teradata and Hadoop on the same network? -- I believe it is on the Hadoop side

What kind of Teradata extraction tool are you using? (any throttles on utilities) -- Normal Sqoop, which hits an export

What bottlenecks are you seeing? -- Slowness in loading; the Teradata end is quick and responsive when we write directly to a file.

Is it during the extraction phase or load into HDFS? -- Mostly at the loading end

Are you loading into any Hive Schema? -- Yes

Any Hive repository issues? -- No, it's an empty system dedicated to us

Please share your DBS/HDP versions -- Teradata is 15.1; HDP is the latest Hortonworks release with the tap2 framework.

 

Agreed, it's smooth at the Teradata end. I just wanted to see if we can move the data faster, as Sqoop is going to be time-consuming.

 

Enthusiast

Re: Scooping Data from Teradata to Hadoop

I suggest you start engaging both the Hadoop DBAs (not sure if they exist) and the Teradata DBAs to troubleshoot this issue.

One thing that comes to mind is to remove replication in Hadoop by setting the number of replicas to 1.
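
As a rough sketch of that suggestion, the replication factor can be overridden for a single upload by passing the generic option `-D dfs.replication=1` to `hdfs dfs -put`. The helper name `build_put_command` and both paths below are hypothetical; the snippet only builds and prints the command and does not run it. Keep in mind that with one replica there is no redundant copy of the data in HDFS.

```python
import shlex


def build_put_command(local_file, hdfs_dir, replication=1):
    """Build an `hdfs dfs -put` command that writes with the given replication factor."""
    if replication < 1:
        raise ValueError("replication must be at least 1")
    return [
        "hdfs", "dfs",
        "-D", f"dfs.replication={replication}",  # per-command override of the cluster default
        "-put", "-f",                            # -f overwrites an existing target file
        local_file,
        hdfs_dir,
    ]


# Hypothetical paths for one per-AMP extract file and its HDFS staging directory.
cmd = build_put_command("/data/export/tablename_amp000.txt", "/staging/tablename")
print(shlex.join(cmd))
```

The resulting list could be handed to `subprocess.run(cmd, check=True)` on a node with the Hadoop client installed. If redundancy is wanted again after the load, `hdfs dfs -setrep` can raise the replication factor of the staged files.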