I tried to set up a connection to Hive (HDP) using Teradata Studio. While setting up the connection profile, TD Studio asks for both JDBC and WebHCat information and forces the connection info for both to be filled in.
I am curious to know the purpose of the two connection types. If I can connect using JDBC, why is WebHCat information required, and vice versa? Would it be better if the user could choose just one, or disable one of the modes of connectivity?
Currently, the DSE and Navigator are populated by WebHCat, and JDBC is used for running queries. Removing the WebHCat dependency when JDBC credentials are provided is on our roadmap for this year.
Studio uses WebHCat to gather the catalog information on the Hadoop database and table objects for the Data Source Explorer. We have a JIRA issue open to use HiveQL to gather that information instead. We refer to the HiveQL connection as JDBC.
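To make the difference between the two paths concrete, here is a rough sketch (not Studio's actual code) of how the same catalog information can be requested either way. It assumes WebHCat's default port 50111 and its standard templeton/v1/ddl REST resource; the function names, host name, and user name are made up for illustration.

```python
from urllib.parse import quote, urlencode


def webhcat_tables_url(host, database, user, port=50111):
    """Build the WebHCat (Templeton) REST URL that lists tables in a database.

    Assumption: WebHCat listens on its default port 50111 over plain HTTP.
    """
    query = urlencode({"user.name": user})
    return (
        f"http://{host}:{port}/templeton/v1/ddl/database/"
        f"{quote(database)}/table?{query}"
    )


def hiveql_catalog_statements(database):
    """Return HiveQL statements that fetch similar catalog info over JDBC."""
    return ["SHOW DATABASES", f"SHOW TABLES IN {database}"]


if __name__ == "__main__":
    # Hypothetical host and user, for illustration only.
    print(webhcat_tables_url("hdp-host.example.com", "default", "studio"))
    for statement in hiveql_catalog_statements("default"):
        print(statement)
```

The first function needs a reachable WebHCat service, while the second only needs the HiveServer2 connection that is already used for running queries.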
Hi, I'm just wondering if the removal of the dependency on WebHCat is still on the agenda? It's unfortunately proving to be somewhat of a show-stopper in the environment I am working in :(
Yes, we've removed the WebHCat dependency in Studio 16.20, which is targeted for release in the next week.
I've performed a fresh installation of Teradata Studio 16.20 (on a Mac), and am attempting to establish a Hortonworks / Knox Gateway connection. It's still trying to connect to WebHCat during the connection to the Hive server. It seems the dependency on WebHCat is still there?
Edit: I changed the configuration of the connection to not select Knox Gateway (so, just Hortonworks). Then I selected the Hive connection service, supplied the connection details for Knox in the JDBC parameters... and BAM! Everything is working!!!!! :D
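The post above does not list the exact JDBC parameters that were used. For anyone trying the same workaround, here is a hedged sketch of how a Hive JDBC URL through Knox is commonly put together, based on the usual Knox conventions (gateway port 8443, gateway path "gateway", topology "default", HTTP transport over SSL). The function name, host, and defaults are assumptions; check them against your own Knox topology, and note that Studio may take these values as separate JDBC properties rather than as one URL.

```python
def knox_hive_jdbc_url(
    gateway_host,
    gateway_port=8443,
    gateway_path="gateway",
    topology="default",
    truststore=None,
    truststore_password=None,
):
    """Assemble a HiveServer2 JDBC URL that goes through a Knox gateway.

    Assumption: Knox exposes Hive at <gateway_path>/<topology>/hive and
    requires HTTP transport mode over SSL.
    """
    params = ["ssl=true"]
    if truststore:
        params.append(f"sslTrustStore={truststore}")
    if truststore_password:
        params.append(f"trustStorePassword={truststore_password}")
    params.append("transportMode=http")
    params.append(f"httpPath={gateway_path}/{topology}/hive")
    return f"jdbc:hive2://{gateway_host}:{gateway_port}/;" + ";".join(params)


if __name__ == "__main__":
    # Hypothetical gateway host, for illustration only.
    print(knox_hive_jdbc_url("knox.example.com"))
```

The pieces that matter are transportMode=http and the httpPath pointing at the Knox topology, since Knox proxies HiveServer2 over HTTP rather than the binary Thrift transport.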
This is awesome, it's like the holiday season has arrived early! ;) Thank you!