Import data from Teradata 14.0 to Hive Table


Hi All,

I am using teradata-connector-1.2.1 with the following versions:

- Teradata 14.0

- HDP 2.0 (Hadoop 2.1.0.2.0.5.0-67)

- Hive 0.11


I am trying to import data from Teradata tables to Hive.

/*Created table and inserted a row in teradata*/

CREATE MULTISET TABLE example3_td ( c1 INT,c2 VARCHAR(100));

INSERT INTO example3_td VALUES (3,'bar');

/*Created similar table in Hive*/ 

CREATE TABLE example3_hive (h1 INT, h2 STRING) STORED AS RCFILE;

 

Following are the set of commands I used after following the documentation:

export HIVE_HOME=/home/nirmal/hadoop/hive-0.11.0

export USERLIBTDCH=/home/nirmal/hadoop/TeradataConnectors/teradata-connector-1.2.1/lib/teradata-connector-1.2.1.jar

 

export HADOOP_CLASSPATH=$HIVE_HOME/lib/hive-metastore-0.11.0.jar:$HIVE_HOME/lib/libthrift-0.9.0.jar:$HIVE_HOME/lib/hive-exec-0.11.0.jar:$HIVE_HOME/lib/libfb303-0.9.0.jar:$HIVE_HOME/lib/jdo2-api-2.3-ec.jar:$HIVE_HOME/conf:$HIVE_HOME/lib/slf4j-api-1.6.1.jar:$HIVE_HOME/lib/antlr-runtime-3.4.jar:$HIVE_HOME/lib/datanucleus-core-3.0.9.jar:$HIVE_HOME/lib/datanucleus-rdbms-3.0.8.jar:$HIVE_HOME/lib/datanucleus-api-jdo-3.0.7.jar:$HIVE_HOME/lib/commons-dbcp-1.4.jar:$HIVE_HOME/lib/commons-pool-1.5.4.jar:$HIVE_HOME/lib/hive-cli-0.11.0.jar

 

export HIVE_LIB_JARS=$HIVE_HOME/lib/hive-metastore-0.11.0.jar,$HIVE_HOME/lib/libthrift-0.9.0.jar,$HIVE_HOME/lib/hive-exec-0.11.0.jar,$HIVE_HOME/lib/libfb303-0.9.0.jar,$HIVE_HOME/lib/jdo2-api-2.3-ec.jar,$HIVE_HOME/lib/slf4j-api-1.6.1.jar,$HIVE_HOME/lib/hive-cli-0.11.0.jar
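One detail worth noting in the two exports above: HADOOP_CLASSPATH is colon-separated (it feeds the client JVM's classpath), while the list passed to -libjars must be comma-separated (it feeds the distributed cache). A small sketch of converting one format to the other, using hypothetical jar paths rather than the full list above:

```shell
# Colon-separated list, as used for HADOOP_CLASSPATH (paths are examples only)
CLASSPATH_STYLE="/opt/hive/lib/a.jar:/opt/hive/lib/b.jar"

# Convert colons to commas to produce a value suitable for -libjars
LIBJARS_STYLE=$(printf '%s' "$CLASSPATH_STYLE" | tr ':' ',')

echo "$LIBJARS_STYLE"
```

Keeping the two variables in sync this way avoids ClassNotFoundException surprises where a jar is visible to the client but never shipped to the tasks.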

 

hadoop jar $USERLIBTDCH com.teradata.hadoop.tool.TeradataImportTool -libjars $HIVE_LIB_JARS  -classname com.teradata.jdbc.TeraDriver -url jdbc:teradata://192.168.199.129/DATABASE=airlinesuser -username airlinesuser -password airlinesuser -jobtype hive -fileformat rcfile -sourcetable example3_td -nummappers 1 -targettable example3_hive

 


14/04/10 21:14:56 INFO tool.TeradataImportTool: TeradataImportTool starts at 1397144696332

14/04/10 21:14:57 WARN conf.Configuration: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative

14/04/10 21:14:57 WARN conf.Configuration: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize

14/04/10 21:14:57 WARN conf.Configuration: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize

14/04/10 21:14:57 WARN conf.Configuration: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack

14/04/10 21:14:57 WARN conf.Configuration: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node

14/04/10 21:14:57 WARN conf.Configuration: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces

14/04/10 21:14:57 WARN conf.Configuration: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative

14/04/10 21:14:57 WARN conf.Configuration: org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@3eafdb52:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.

14/04/10 21:14:57 WARN conf.Configuration: org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@3eafdb52:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.

14/04/10 21:14:57 ERROR tool.TeradataImportTool: com.teradata.hadoop.exception.TeradataHadoopException: Hive current user directory not exists

at com.teradata.hive.job.TeradataHiveImportJob.beforeJob(TeradataHiveImportJob.java:148)

at com.teradata.hadoop.tool.TeradataJobRunner.runImportJob(TeradataJobRunner.java:118)

at com.teradata.hadoop.tool.TeradataImportTool.run(TeradataImportTool.java:41)

at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)

at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)

at com.teradata.hadoop.tool.TeradataImportTool.main(TeradataImportTool.java:464)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

 

14/04/10 21:14:57 INFO tool.TeradataImportTool: job completed with exit code 25008

nirmal@nirmal-Vostro-3560 ~/hadoop/hive-0.12.0 $ 

In Hive, I see the table being created in the default database/schema:

hive> show tables;

OK

example3_hive

Time taken: 5.831 seconds, Fetched: 1 row(s)

hive> 

Kindly help me if I am missing something.

Thanks,

-Nirmal

1 REPLY

Re: Import data from Teradata 14.0 to Hive Table

Hi Nirmal,

This error is resolved by creating a folder named after your login user under the '/user' directory on HDFS.

e.g. if your username is 'ntera', then create the folders '/user/ntera/warehouse' and '/user/ntera-hive/warehouse'.
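A minimal sketch of that fix, assuming the login user from the original post ('nirmal') and that the hadoop client is on PATH; the connector's beforeJob step fails with "Hive current user directory not exists" when the current user's HDFS home directory is missing:

```shell
# Create the HDFS home and warehouse directory for the current login user
# (whoami resolves to e.g. 'nirmal' here)
hadoop fs -mkdir -p /user/$(whoami)/warehouse

# Make sure the login user owns its HDFS home directory
hadoop fs -chown -R $(whoami) /user/$(whoami)
```

After the directories exist, re-running the same TeradataImportTool command should get past the beforeJob check.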