We have a 24-node 2650 model running on a 1 Gbit/s link.
We may have Teradata Hadoop by the first quarter of 2015.
Our concern is whether there is any confirmed benchmark for how long it takes to copy data, say 100 GB, from Hadoop to Teradata or from Teradata to Hadoop using the different tools.
Hadoop has a 10 Gbit/s link, but our older 2650 model has only a 1 Gbit/s link; can it accommodate large data transfers, and vice versa?
Can you send us any sample figures showing how long a 100 GB data set took to load in each direction during peak hours?
We need to answer our client's questions.
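There is no confirmed benchmark in this thread, but a wire-speed floor can be computed before any test run. This is idealized arithmetic only; real TDCH throughput will be lower because of protocol overhead, mapper/AMP skew, and concurrent workload:

```shell
# Lower bound for moving 100 GB over a 1 Gbit/s link (ideal conditions).
GB=100
LINK_GBITS=1
SECS=$(( GB * 8 / LINK_GBITS ))   # 100 GB = 800 gigabits
MINS=$(( SECS / 60 ))
echo "wire-speed floor: ${SECS}s (~${MINS} min)"   # 800s, ~13 min
```

On the 10 Gbit/s Hadoop side the same 100 GB floor drops to 80 seconds, so the 1 Gbit/s link on the 2650 is the bottleneck in either direction.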
We have a use case for bidirectional data transfer between Teradata Aster and Hortonworks Data Platform. Please guide me on the following.
1) Will TDCH be sufficient? I had a quick glance through the attached tutorial but did not find any mention of Aster.
2) If TDCH is not the answer, what is the best way to achieve this use case?
Bidirectional exchange between Teradata Aster and Hadoop can be done using "QueryGrid Aster-to-Hadoop" (formerly known as "Aster SQL-H"). The Aster SQL-MR functions load_from_hcatalog and load_to_hcatalog are an integral part of the Aster platform.
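For reference, an invocation of load_from_hcatalog looks like the following sketch; the server, username, database, and table names here are placeholders, not values from this thread:

```sql
-- Pull a Hive/HCatalog table into Aster; all identifiers are examples.
SELECT *
FROM load_from_hcatalog(
  ON mr_driver
  SERVER('hadoop-namenode.example.com')
  USERNAME('hive')
  DBNAME('default')
  TABLENAME('web_clicks')
);
```

load_to_hcatalog takes the mirror-image role for writing Aster data back into a Hive table.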
Hope this helps
I am using teradata-connector-1.3.3 with the following versions:
I am trying to import data from teradata tables to HDFS.
I create a table in the Teradata database.
CREATE MULTISET TABLE martinpoc.example3_td3 ,NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT
     (c1 VARCHAR(10),  -- c1's definition was missing from the original post; VARCHAR(10) is assumed here so the DDL parses
      c2 VARCHAR(100) CHARACTER SET UNICODE NOT CASESPECIFIC)
PRIMARY INDEX ( c1 );
I insert a row containing Chinese characters into this table.
INSERT INTO martinpoc.example3_td3(c1,C2) VALUES ('1','蔡先生');
I run the following command, which calls the Teradata connector, to export data from Teradata to HDFS:
hadoop com.teradata.hadoop.tool.TeradataImportTool -libjars $LIB_JARS -url jdbc:teradata://192.168.65.132/CHARSET=UTF8,database=martinpoc -username martinpoc -password martin -jobtype hdfs -sourcetable example3_td3 -separator ',' -targetpaths /user/martin/example3_td3 -method split.by.hash -splitbycolumn c1
The export from Teradata to HDFS succeeds, but the Chinese characters in the HDFS output are garbled, even though I am already using CHARSET=UTF8 in the JDBC URL.
Have you seen the same behavior, and how can it be resolved? Thanks.
Thanks and best regards.
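A quick local check (not from the thread) that separates mojibake from real data loss: re-decode the UTF-8 bytes under a one-byte charset such as Latin-1. If the bytes in HDFS are valid UTF-8 and only look wrong in a viewer or tool that assumes another charset, the export itself was correct:

```shell
# '蔡先生' is 9 bytes of valid UTF-8.
printf '蔡先生' | wc -c
# Reading those same bytes as Latin-1 (as a mis-configured viewer would)
# turns every byte >= 0x80 into a 2-byte UTF-8 sequence: classic mojibake.
printf '蔡先生' | iconv -f ISO-8859-1 -t UTF-8
```

Piping the HDFS file through `hexdump -C` and checking for the expected UTF-8 byte sequence (E8 94 A1 for 蔡) tells a broken export apart from a display-side charset mismatch.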
Can anyone assist on this error?
java.lang.Exception: Cannot connect to WebHDFS, check host name and port.
Previously, teradata-connector 1.3.3 was working, but after a few days it stopped working. We cannot connect the Hadoop file system and Teradata.
I have updated the connector to 1.3.4 (TeradataConnectorForHadoopVersion: "1.3.4"), but it is still not working.
Here is the log from the edge node where the teradata-connector and Studio are installed:
Server returned HTTP response code: 403 for URL: http://gvlhdmpap02:50070/webhdfs/v1/?op=LISTSTATUS&user.name=hdfs
15:14:03.045 Teradata Datatools [Worker-6] ERROR hadoop - Could not read HDFS
This is resolved.
There was failover testing, and the standby NameNode became active.
Since there is a new version of the teradata-connector, 1.3.4, I updated it, reran the Oozie script pointing at the new active NameNode, and it now works.
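If the cluster runs HDFS high availability, one way to avoid re-pointing scripts after every failover is to address HDFS by its HA nameservice ID instead of a fixed NameNode host. The property names below are the standard Hadoop HA client settings; the nameservice and NameNode IDs are invented for illustration:

```xml
<!-- hdfs-site.xml: clients then use hdfs://mycluster/...
     "mycluster", "nn1", "nn2" are hypothetical names -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```

With this in place the client library fails over automatically, so a NameNode switch does not require editing the Oozie script.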
The tutorial says the Teradata JDBC URL should point to an MPP system name (and not a single node).
1) How do I specify an MPP system name in the url?
2) What are the corresponding entries in the Hadoop hosts files I need to reference the MPP nodes?
3) Which Hadoop nodes need these entries in their hosts files? Master nodes? Data nodes? Edge nodes? All three?
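For illustration (the system name "tdprod" and the addresses below are invented): the Teradata JDBC driver discovers MPP nodes by resolving COP aliases, systemnamecop1 through systemnamecopN, so the URL uses the bare system name and the hosts file (or DNS) supplies one entry per Teradata node. Any Hadoop host that runs TDCH mappers, i.e. the data/worker nodes plus the edge node that launches the job, needs to resolve these names:

```
# /etc/hosts — hypothetical Teradata system "tdprod" with 4 nodes
10.10.1.11  tdprodcop1
10.10.1.12  tdprodcop2
10.10.1.13  tdprodcop3
10.10.1.14  tdprodcop4
```

The JDBC URL is then jdbc:teradata://tdprod/... and the driver spreads sessions across the copN entries.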
Hadoop to Teradata data movement using TDCH:
While using TDCH (JDBC FastLoad), if the operation is interrupted for any reason, it must be restarted from the beginning. I have hit this issue quite frequently when loading large record sets with JDBC FastLoad, since the "Checkpoint" feature is not available over JDBC connectivity. This feature has not been introduced even in the recent JDBC 15.x releases.
1. TDCH (Hadoop-to-Teradata) is a scalable extract-load solution for moving data at regular intervals into a large IDW system.
2. What is the internal latency introduced inside Teradata when using TDCH/QueryGrid, specifically around AWT (AMP worker task) usage?