Connecting Aster Express 5.0.0 and Hadoop 2.0 VMWare and loading data with load_from_hcatalog

Aster
Teradata Employee

I am trying to load data with the following statement:

SELECT * FROM load_from_hcatalog (on mr_driver server ('hadoop') username ('hue') dbname ('default') tablename ('sample_07') columns ('code', 'description', 'total_emp', 'salary' )) limit 5;

I get the following error:

ERROR:  SQL-MR function LOAD_FROM_HCATALOG failed: Failed to read data from hcatalog. TaskIndex : 0. Details : Server IPC version 9 cannot communicate with client version 4

Does anyone know what to do next? I have googled this error, and it seems that the Aster VMware image is using an out-of-date package, but I am not sure which package to install on Linux, or where and how to install it.
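For anyone reading along, the two numbers in the error message can be pulled out and interpreted with a small script. This is only a sketch: the mapping of IPC version 4 to Hadoop 1.x clients and version 9 to Hadoop 2.x servers is my understanding of Hadoop's RPC versioning and not something stated in the Aster documentation, and the names explain_ipc_mismatch and IPC_RELEASE_LINE are made up for this example.

```python
import re

# Assumed mapping from Hadoop IPC protocol version to release line.
# Other versions are deliberately left unmapped and reported as unknown.
IPC_RELEASE_LINE = {4: "Hadoop 1.x", 9: "Hadoop 2.x"}

IPC_ERROR = re.compile(
    r"Server IPC version (\d+) cannot communicate with client version (\d+)"
)


def explain_ipc_mismatch(message):
    """Return (server_line, client_line) for an IPC mismatch error, or None."""
    match = IPC_ERROR.search(message)
    if match is None:
        return None
    server, client = (int(value) for value in match.groups())
    return (
        IPC_RELEASE_LINE.get(server, "unknown (IPC %d)" % server),
        IPC_RELEASE_LINE.get(client, "unknown (IPC %d)" % client),
    )


# The error text reported by load_from_hcatalog in this post.
error = (
    "ERROR:  SQL-MR function LOAD_FROM_HCATALOG failed: Failed to read data "
    "from hcatalog. TaskIndex : 0. Details : Server IPC version 9 cannot "
    "communicate with client version 4"
)

print(explain_ipc_mismatch(error))
```

If the mapping holds, the result says that the Hadoop server speaks the 2.x protocol while the client libraries used by the Aster function speak the 1.x protocol, which would point at the client jars on the Aster side and not at the network.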

Other information about my setup:

I use a computer with 8 GB of RAM, capable of running the Teradata 14.1 SLES 11 VMware image, the Aster Queen, the Aster Worker, and the Hadoop 2.0 VMware image at the same time. It works fine, and I can do all the tutorials. Also, on a side note, I was able to successfully load data from Teradata with the load_from_teradata function in Aster.

I use the Aster Express 5.0.0 VMware image and the Hortonworks Sandbox 2.0 VMware option. I added an extra network adapter and configured it so that it is in the same address range as the Aster Queen and Aster Worker. I have also changed the hosts files on both the queen and the worker so they can reach the Hadoop server.

I have checked the connectivity between Hadoop and Aster with the command \extd host=hadoop in ACT, which showed me the list of tables on Hadoop. I can ping the Hadoop server from Aster, and I can ping the Aster Worker and Queen from Hadoop, so there is a connection.
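Ping only proves that the hosts see each other, so a TCP-level check of the individual service ports can be a useful extra step. Below is a rough sketch of such a check. The host name hadoop and the metastore port 9083 come from the log in this post; port 8020 for the NameNode is an assumption on my part and may differ on your sandbox. The function names can_connect and check_all are invented for this example.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_all(targets, timeout=3.0):
    """Map each (host, port) pair to whether it accepted a TCP connection."""
    return {target: can_connect(target[0], target[1], timeout) for target in targets}


# 9083: Hive metastore port seen in the log. 8020: assumed NameNode RPC port.
TARGETS = [("hadoop", 9083), ("hadoop", 8020)]
```

Running check_all(TARGETS) on the queen and on the worker would show whether both ports are reachable from each node. Note that an open port does not rule out a protocol mismatch: the connection can succeed and the RPC handshake can still fail.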

Full log of the process:

Output from 192.168.100.150

=====================================

[SQL-MR] Start of stdout

time(msec) to get hcat schema 1389

[SQL-MR] End of stdout

[SQL-MR] Start of stderr

[SQL-MR] Starting construction of SQL-MR function LOAD_FROM_HCATALOG at Wed Jan 15 10:59:29 EST 2014...

14/01/15 10:59:30 WARN conf.HiveConf: hive-site.xml not found on CLASSPATH

14/01/15 10:59:30 INFO hive.metastore: Trying to connect to metastore with URI thrift://hadoop:9083

14/01/15 10:59:30 INFO hive.metastore: Connected to metastore.

[SQL-MR] Construction of function ended in SUCCESS after 1.526 seconds.

[SQL-MR] End of stderr

Output from 192.168.100.150

=====================================

[SQL-MR] Start of stdout

time(msec) to get hcat schema 3432

taskCount 2

taskIndex 1

Exception in drainOutputRows : java.io.IOException: Error in RPC response header: AppError

java.io.IOException: Error in RPC response header: AppError

at com.asterdata.aster.net.RpcClientOnHttp.readResponse(RpcClientOnHttp.java:179)

at com.asterdata.aster.net.RpcClientOnHttp.invokeRequest(RpcClientOnHttp.java:106)

at com.asterdata.aster.net.RpcClientOnHttp.invokeRequest(RpcClientOnHttp.java:91)

at com.asterdata.aster.mailman.MailmanClient.readMessage(MailmanClient.java:244)

at load_from_hcatalog.getHCatSplitsFromMaster(load_from_hcatalog.java:913)

at load_from_hcatalog.getHCatSplitsForTask(load_from_hcatalog.java:974)

at load_from_hcatalog.drainOutputRows(load_from_hcatalog.java:470)

at com.asterdata.ncluster.sqlmr.internal.SwigRunner.runOperatingTask(SwigRunner.java:373)

at com.asterdata.ncluster.sqlmr.internal.SwigRunner.runTask(SwigRunner.java:295)

at com.asterdata.ncluster.sqlmr.internal.SwigRunner.run(SwigRunner.java:125)

at com.asterdata.ncluster.sqlmr.internal.FunctionThread.runFunction(FunctionThread.java:137)

at com.asterdata.ncluster.sqlmr.internal.FunctionThread.run(FunctionThread.java:64)

at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)

at java.lang.Thread.run(Thread.java:619)

task 1 time for entire execution 582

[SQL-MR] End of stdout

[SQL-MR] Start of stderr

[SQL-MR] Starting construction of SQL-MR function LOAD_FROM_HCATALOG at Wed Jan 15 10:59:34 EST 2014...

14/01/15 10:59:36 WARN conf.HiveConf: hive-site.xml not found on CLASSPATH

14/01/15 10:59:36 INFO hive.metastore: Trying to connect to metastore with URI thrift://hadoop:9083

14/01/15 10:59:37 INFO hive.metastore: Connected to metastore.

[SQL-MR] Construction of function ended in SUCCESS after 3.535 seconds.

[SQL-MR] Starting execution of SQL-MR function at Wed Jan 15 10:59:37 EST 2014...

[SQL-MR] Execution ended in FAILURE after 0.644 seconds.

SQL-MR function failed: com.asterdata.ncluster.sqlmr.ClientVisibleException: Failed to read data from hcatalog. TaskIndex : 1. Details : Error in RPC response header: AppError

at load_from_hcatalog.drainOutputRows(load_from_hcatalog.java:579)

at com.asterdata.ncluster.sqlmr.internal.SwigRunner.runOperatingTask(SwigRunner.java:373)

at com.asterdata.ncluster.sqlmr.internal.SwigRunner.runTask(SwigRunner.java:295)

at com.asterdata.ncluster.sqlmr.internal.SwigRunner.run(SwigRunner.java:125)

at com.asterdata.ncluster.sqlmr.internal.FunctionThread.runFunction(FunctionThread.java:137)

at com.asterdata.ncluster.sqlmr.internal.FunctionThread.run(FunctionThread.java:64)

at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)

at java.lang.Thread.run(Thread.java:619)

[SQL-MR] End of stderr

Output from 192.168.100.150

=====================================

[SQL-MR] Start of stdout

time(msec) to get hcat schema 3025

taskCount 2

taskIndex 0

Exception in drainOutputRows : org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 4

org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 4

at org.apache.hadoop.ipc.Client.call(Client.java:1092)

at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)

at $Proxy9.getProtocolVersion(Unknown Source)

at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:411)

at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:120)

at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:321)

at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:286)

at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:100)

at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)

at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)

at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)

at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)

at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)

at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:176)

at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:208)

at org.apache.hcatalog.mapreduce.HCatBaseInputFormat.getSplits(HCatBaseInputFormat.java:152)

at load_from_hcatalog.getAndDistributeHCatSplits(load_from_hcatalog.java:749)

at load_from_hcatalog.getHCatSplitsForTask(load_from_hcatalog.java:972)

at load_from_hcatalog.drainOutputRows(load_from_hcatalog.java:470)

at com.asterdata.ncluster.sqlmr.internal.SwigRunner.runOperatingTask(SwigRunner.java:373)

at com.asterdata.ncluster.sqlmr.internal.SwigRunner.runTask(SwigRunner.java:295)

at com.asterdata.ncluster.sqlmr.internal.SwigRunner.run(SwigRunner.java:125)

at com.asterdata.ncluster.sqlmr.internal.FunctionThread.runFunction(FunctionThread.java:137)

at com.asterdata.ncluster.sqlmr.internal.FunctionThread.run(FunctionThread.java:64)

at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)

at java.lang.Thread.run(Thread.java:619)

task 0 time for entire execution 578

Destroying the bus

[SQL-MR] End of stdout

[SQL-MR] Start of stderr

[SQL-MR] Starting construction of SQL-MR function LOAD_FROM_HCATALOG at Wed Jan 15 10:59:34 EST 2014...

14/01/15 10:59:36 WARN conf.HiveConf: hive-site.xml not found on CLASSPATH

14/01/15 10:59:37 INFO hive.metastore: Trying to connect to metastore with URI thrift://hadoop:9083

14/01/15 10:59:37 INFO hive.metastore: Connected to metastore.

[SQL-MR] Construction of function ended in SUCCESS after 3.199 seconds.

[SQL-MR] Starting execution of SQL-MR function at Wed Jan 15 10:59:37 EST 2014...

14/01/15 10:59:37 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

14/01/15 10:59:37 WARN snappy.LoadSnappy: Snappy native library not loaded

[SQL-MR] Execution ended in FAILURE after 0.609 seconds.

SQL-MR function failed: com.asterdata.ncluster.sqlmr.ClientVisibleException: Failed to read data from hcatalog. TaskIndex : 0. Details : Server IPC version 9 cannot communicate with client version 4

at load_from_hcatalog.drainOutputRows(load_from_hcatalog.java:579)

at com.asterdata.ncluster.sqlmr.internal.SwigRunner.runOperatingTask(SwigRunner.java:373)

at com.asterdata.ncluster.sqlmr.internal.SwigRunner.runTask(SwigRunner.java:295)

at com.asterdata.ncluster.sqlmr.internal.SwigRunner.run(SwigRunner.java:125)

at com.asterdata.ncluster.sqlmr.internal.FunctionThread.runFunction(FunctionThread.java:137)

at com.asterdata.ncluster.sqlmr.internal.FunctionThread.run(FunctionThread.java:64)

at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)

at java.lang.Thread.run(Thread.java:619)

[SQL-MR] End of stderr


4 REPLIES
Teradata Employee

Re: Connecting Aster Express 5.0.0 and Hadoop 2.0 VMWare and loading data with load_from_hcatalog

Martin, I'm going to send you an email - let's take this offline.

Re: Connecting Aster Express 5.0.0 and Hadoop 2.0 VMWare and loading data with load_from_hcatalog

Why take it offline? The reason is that load_from_hcatalog only works with Hadoop 1.3 for the moment.

Teradata Employee

Re: Connecting Aster Express 5.0.0 and Hadoop 2.0 VMWare and loading data with load_from_hcatalog

You can replace all the Hadoop client jar files under /home/beehive/partner/1.3.2 with the Hadoop jars from your target Hadoop environment.

After that, it works fine for me.
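Before and after swapping jars, it may help to list which Hadoop jar versions are actually sitting in that directory, since a leftover 1.x jar next to the 2.x ones could plausibly bring the same error back. Here is a small sketch for that. It assumes the jars follow the usual name-version.jar naming (for example hadoop-core-1.2.0.jar), which I have not verified for the Aster image, and the function names are made up for this example.

```python
import re
from pathlib import Path

# Assumed naming convention: hadoop-<artifact>-<version>[...].jar
JAR_NAME = re.compile(r"(hadoop-[a-z-]+?)-(\d+(?:\.\d+)+)")


def hadoop_jar_versions(directory):
    """Map each Hadoop jar's artifact name to the version in its file name."""
    versions = {}
    for jar in sorted(Path(directory).glob("hadoop-*.jar")):
        match = JAR_NAME.match(jar.name)
        if match:
            versions[match.group(1)] = match.group(2)
    return versions


def major_lines(versions):
    """Return the set of major version numbers present, e.g. {"1", "2"}."""
    return {version.split(".")[0] for version in versions.values()}
```

For example, major_lines(hadoop_jar_versions("/home/beehive/partner/1.3.2")) returning more than one value would suggest that jars from different Hadoop release lines are mixed in the same directory. The check would need to be run on both the queen and the worker.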

Enthusiast

Re: Connecting Aster Express 5.0.0 and Hadoop 2.0 VMWare and loading data with load_from_hcatalog

Copying jars to the /beehive/partner/hadoop directories on both the Aster 6.0 worker and queen does not seem to resolve the issue. Are any additional modifications required to use the Aster 6.0 VM with HDP 2.2 and above?

When can we expect Aster Express VM support for Hadoop 2.x distributions?