Using Simba's JDBC driver with kerberized Presto co-located on a kerberized Cloudera cluster

Teradata Employee

Hi all,

 

We’re working with a Presto cluster that is co-located with a Kerberized Cloudera cluster. The coordinator has Kerberos authentication in place. Until now, queries have only been issued through the CLI, so the JDBC driver has never been tried. Now we need to use Simba’s JDBC driver to connect to the coordinator.

 

We changed config.properties to use a principal named “HTTP” (http://cdn.simba.com/products/Presto/doc/JDBC_InstallGuide/content/jdbc/pr/authenticating/intro.htm) and tried to use a headless keytab. This failed because Presto apparently appends the hostname to the service name when making requests to the KDC. That is, even though we specify the service-name as HTTP (not HTTP/master_hostname or HTTP/_HOST), the request to the KDC is made as the principal “HTTP/master_hostname”.

 

In Kerberized Cloudera clusters, Oozie and WebHCat also use a principal named HTTP on the master node, so keytabs for HTTP/master_hostname are needed there as well. This becomes an issue when restarts occur, because Cloudera Manager automatically regenerates the keytabs: the keytab for HTTP/master_hostname is regenerated, so Presto’s copy of the keytab is no longer valid. In addition, the filename of the keytab changes each time because it is prefixed with the PID of the service, so a symlink to a fixed filename won’t work.

 

Our options as we see them, in order of preference:

 

  1. Make the Kerberos request use a headless principal (i.e. just “HTTP”) purely via configuration, with no code change
  2. Watch the Hadoop service keytab directory and, whenever a new keytab appears, either copy it to a static path specified in config.properties or repoint a symlink at that static path
  3. Move the Presto coordinator to a different node, so that it has a different hostname
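
For what it's worth, option 2 could be sketched roughly as below. This is only a minimal sketch under assumptions: the glob pattern and destination path in the usage note are hypothetical placeholders (we have not confirmed the exact directory layout Cloudera Manager uses), the function names are made up for illustration, and we have not verified whether the coordinator picks up a replaced keytab without a restart. The idea is to pick the most recently modified matching keytab and copy it atomically to the static path that config.properties points at.

```python
import glob
import os
import shutil
import tempfile


def find_latest_keytab(pattern):
    """Return the most recently modified file matching the glob pattern, or None."""
    candidates = glob.glob(pattern)
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)


def publish_keytab(src, dest):
    """Copy src to dest via a temp file and rename, so readers never see a partial file."""
    dest_dir = os.path.dirname(os.path.abspath(dest))
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".keytab-")
    os.close(fd)
    try:
        shutil.copyfile(src, tmp_path)
        os.chmod(tmp_path, 0o600)      # keytabs are secrets: owner read/write only
        os.replace(tmp_path, dest)     # atomic when both paths are on one filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def sync_keytab(pattern, dest):
    """Publish the newest matching keytab if dest is missing or older.

    Returns True if a copy was made, False otherwise.
    """
    latest = find_latest_keytab(pattern)
    if latest is None:
        return False
    if os.path.exists(dest) and os.path.getmtime(dest) >= os.path.getmtime(latest):
        return False
    publish_keytab(latest, dest)
    return True
```

This could be run from cron or a small polling loop, e.g. `sync_keytab("/path/to/process/dirs/*/HTTP.keytab", "/etc/presto/http.keytab")`, where both paths are placeholders to be replaced with the real locations on the master node. The script would need read access to the regenerated keytabs, and the copy should stay readable only by the user Presto runs as.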

 

Is #1 possible? Are there other options that are simpler?

 

Thanks,

Jason

1 REPLY
Teradata Employee

Re: Using Simba's JDBC driver with kerberized Presto co-located on a kerberized Cloudera cluster

I am not aware of any configuration properties that would facilitate #1. Putting the Presto coordinator on a different node seems like the simplest solution to me, unless that creates major headaches around firewall configuration.