Presto Workers unable to connect to Coordinator's Discovery UI

Teradata Employee


Well, after resolving the last GSS issue and attempting to use the Presto CLI to connect to Hive, I get the following:

+ java -Dsun.security.krb5.debug=true -jar presto-cli-0.167-t.0.2-executable.jar --server https://master1:7778 --enable-authentication --krb5-config-path /etc/krb5.conf --krb5-principal AdmUser@EXAMPLE.COM --krb5-keytab-path ./my.keytab --krb5-remote-service-name presto --keystore-path /opt/cloudera/security/jks/hdp.keystore --keystore-password password --catalog hive --schema default
presto:default> show tables;
Error running command:
javax.net.ssl.SSLHandshakeException: General SSLEngine problem
 
presto:default>
 
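One way to narrow down a "General SSLEngine problem" before digging into Presto itself is to look at the certificate the coordinator actually presents on port 7778. This is just a diagnostic sketch using standard openssl flags; the hostname and port come from the CLI command above:

```shell
# Show the subject and any SAN entries of the certificate served on the
# coordinator's HTTPS port. The CN or a SAN entry must match the name
# used in --server (master1), or the Java client aborts the handshake.
openssl s_client -connect master1:7778 -servername master1 </dev/null 2>/dev/null \
  | openssl x509 -noout -subject -text \
  | grep -E 'Subject:|DNS:'
```

If the handshake fails here as well, the problem is server-side; if it succeeds, the CLI's keystore/truststore is the more likely culprit.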
Now my first thought was to verify that the config.properties files for both the coordinator and the workers were properly set up for SSL:
Master 1’s config.properties
coordinator=true
discovery-server.enabled=true
discovery.uri=https://master1:7778
http-server.authentication.type=KERBEROS
http-server.http.enabled=false
http-server.https.enabled=true
http-server.https.keystore.key=password
http-server.https.keystore.path=/opt/cloudera/security/jks/hdp.keystore
http-server.https.port=7778
http.authentication.krb5.config=/etc/krb5.conf
http.server.authentication.krb5.keytab=/etc/presto/presto.keytab
http.server.authentication.krb5.service-name=presto
internal-communication.authentication.kerberos.enabled=true
internal-communication.authentication.krb5.config=/etc/krb5.conf
internal-communication.authentication.krb5.keytab=/etc/presto/presto.keytab
internal-communication.authentication.krb5.principal=presto@EXAMPLE.COM
internal-communication.authentication.krb5.service-name=presto
internal-communication.https.keystore.key=password
internal-communication.https.keystore.path=/opt/cloudera/security/jks/hdp.keystore
internal-communication.https.required=true
node-scheduler.include-coordinator=false
node.internal-address=master1.fqdn
query.max-memory-per-node=8GB
query.max-memory=50GB
 
Worker1..3 config.properties
coordinator=false
discovery.uri=https://master1:7778
http-server.https.port=7778
query.max-memory-per-node=8GB
query.max-memory=50GB
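Comparing the two files side by side, the workers carry none of the HTTPS keystore or internal-communication settings that the coordinator does. My understanding of secure internal communication is that every node, not just the coordinator, needs its HTTPS server and Kerberos internal-communication properties configured. A sketch of what the worker files might additionally need, reusing the keystore path, password, keytab, and principal from the coordinator's file above (I can't confirm this is the cause of the announcement failures, but the asymmetry stands out):

```properties
coordinator=false
discovery.uri=https://master1:7778
http-server.http.enabled=false
http-server.https.enabled=true
http-server.https.port=7778
http-server.https.keystore.path=/opt/cloudera/security/jks/hdp.keystore
http-server.https.keystore.key=password
internal-communication.https.required=true
internal-communication.https.keystore.path=/opt/cloudera/security/jks/hdp.keystore
internal-communication.https.keystore.key=password
internal-communication.authentication.kerberos.enabled=true
internal-communication.authentication.krb5.config=/etc/krb5.conf
internal-communication.authentication.krb5.keytab=/etc/presto/presto.keytab
internal-communication.authentication.krb5.principal=presto@EXAMPLE.COM
internal-communication.authentication.krb5.service-name=presto
query.max-memory-per-node=8GB
query.max-memory=50GB
```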
 
Check Presto Status:
[AdmUser@master1 ~]$ ~/presto_server_pkg.0.167-t.0.2/prestoadmin/presto-admin server status -p password
Server Status:
master1(IP: Unknown, Roles: coordinator): Running
No information available: unable to query coordinator
Server Status:
worker1(IP: Unknown, Roles: worker): Running
No information available: unable to query coordinator
Server Status:
worker2(IP: Unknown, Roles: worker): Running
No information available: unable to query coordinator
Server Status:
worker3(IP: Unknown, Roles: worker): Running
No information available: unable to query coordinator
 
Then I attempted to see what was happening when Presto started up:
 
Error and warning messages on Worker1 during Presto startup:
2017-06-13T14:46:14.982Z WARN main io.airlift.jmx.JmxAgent Cannot determine if JMX agent is already running (not an Oracle JVM?). Will try to start it manually.
2017-06-13T14:46:15.868Z ERROR Discovery-0 io.airlift.discovery.client.CachingServiceSelector Cannot connect to discovery server for refresh (presto/general): Lookup of presto failed for https://master1:7778/v1/service/presto/general
2017-06-13T14:46:16.154Z ERROR Discovery-1 io.airlift.discovery.client.CachingServiceSelector Cannot connect to discovery server for refresh (collector/general): Lookup of collector failed for https://master1:7778/v1/service/collector/general
2017-06-13T14:46:16.313Z WARN main org.eclipse.jetty.server.Server THIS IS NOT A STABLE RELEASE! DO NOT USE IN PRODUCTION!
2017-06-13T14:46:16.313Z WARN main org.eclipse.jetty.server.Server Download a stable release from http://download.eclipse.org/jetty/
2017-06-13T14:46:16.319Z WARN main org.eclipse.jetty.server.handler.AbstractHandler No Server set for org.eclipse.jetty.server.handler.ErrorHandler@6abdec0e
2017-06-13T14:46:20.066Z ERROR Announcer-0 io.airlift.discovery.client.Announcer Cannot connect to discovery server for announce: Announcement failed for https://master1:7778
2017-06-13T14:46:20.067Z ERROR Announcer-0 io.airlift.discovery.client.Announcer Service announcement failed after 61.66ms. Next request will happen within 0.00s
2017-06-13T14:46:20.071Z ERROR Announcer-1 io.airlift.discovery.client.Announcer Service announcement failed after 1.66ms. Next request will happen within 1.00ms
2017-06-13T14:46:20.075Z ERROR Announcer-2 io.airlift.discovery.client.Announcer Service announcement failed after 1.35ms. Next request will happen within 2.00ms
2017-06-13T14:46:20.081Z ERROR Announcer-3 io.airlift.discovery.client.Announcer Service announcement failed after 1.52ms. Next request will happen within 4.00ms
2017-06-13T14:46:20.091Z ERROR Announcer-4 io.airlift.discovery.client.Announcer Service announcement failed after 1.33ms. Next request will happen within 8.00ms
2017-06-13T14:46:20.110Z ERROR Announcer-2 io.airlift.discovery.client.Announcer Service announcement failed after 2.80ms. Next request will happen within 16.00ms
2017-06-13T14:46:20.144Z ERROR Announcer-3 io.airlift.discovery.client.Announcer Service announcement failed after 1.47ms. Next request will happen within 32.00ms
2017-06-13T14:46:20.210Z ERROR Announcer-4 io.airlift.discovery.client.Announcer Service announcement failed after 1.55ms. Next request will happen within 64.00ms
2017-06-13T14:46:20.340Z ERROR Announcer-2 io.airlift.discovery.client.Announcer Service announcement failed after 1.81ms. Next request will happen within 128.00ms
2017-06-13T14:46:20.599Z ERROR Announcer-3 io.airlift.discovery.client.Announcer Service announcement failed after 2.10ms. Next request will happen within 256.00ms
2017-06-13T14:46:21.116Z ERROR Announcer-4 io.airlift.discovery.client.Announcer Service announcement failed after 5.37ms. Next request will happen within 512.00ms
2017-06-13T14:46:22.121Z ERROR Announcer-2 io.airlift.discovery.client.Announcer Service announcement failed after 4.33ms. Next request will happen within 1000.00ms
...
 
Error and warning messages on Master1 during Presto startup:
2017-06-13T14:46:18.598Z WARN main io.airlift.jmx.JmxAgent Cannot determine if JMX agent is already running (not an Oracle JVM?). Will try to start it manually.
2017-06-13T14:46:18.727Z ERROR Discovery-0 io.airlift.discovery.client.CachingServiceSelector Cannot connect to discovery server for refresh (presto/general): Lookup of presto failed for https://master1:7778/v1/service/presto/general
2017-06-13T14:46:19.758Z WARN http-client-shared-27 com.facebook.presto.metadata.RemoteNodeState Error fetching node state from https://master1:7778/v1/info/state: Server refused connection: https://master1:7778/v1/info/state
2017-06-13T14:46:19.962Z ERROR Discovery-1 io.airlift.discovery.client.CachingServiceSelector Cannot connect to discovery server for refresh (collector/general): Lookup of collector failed for https://master1:7778/v1/service/collector/general
2017-06-13T14:46:20.084Z WARN main org.eclipse.jetty.server.Server THIS IS NOT A STABLE RELEASE! DO NOT USE IN PRODUCTION!
2017-06-13T14:46:20.084Z WARN main org.eclipse.jetty.server.Server Download a stable release from http://download.eclipse.org/jetty/
2017-06-13T14:46:20.090Z WARN main org.eclipse.jetty.server.handler.AbstractHandler No Server set for org.eclipse.jetty.server.handler.ErrorHandler@77f4c040
 
I was able to access the Presto web interface via https://master1:7778 (but it was obvious that the coordinator was not able to communicate with the workers).
 
Next I tested the connection from a worker node to the coordinator's discovery UI, and verified the port and listening process on Master1.
Connection from worker1 to master1:7778
[AdmUser@worker1 ~]$ telnet master1 7778
Trying master1...
Connected to master1.
Escape character is '^]'.
^]
 
telnet> close
Connection closed.
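The telnet test only proves that the TCP port is reachable; the announcement failures are happening a layer up, in TLS. A quick check of the discovery endpoint from the worker, again just a sketch with standard curl flags (-k skips certificate validation, to separate trust problems from connectivity problems):

```shell
# From worker1: the verbose output shows whether the TLS handshake completes.
# With -k, certificate-trust failures are bypassed; if this call reaches the
# HTTP layer (even just a 401, since Kerberos auth is enabled) while a
# validating client fails, the issue is the certificate or truststore,
# not the network.
curl -vk https://master1:7778/v1/service/presto/general
```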
 
Checking Master1 for port 7778 and the process listening on that port:
[AdmUser@master1 ~]$ sudo netstat -ntlp | grep 7778
tcp        0      0 0.0.0.0:7778            0.0.0.0:*               LISTEN      32156/java
[AdmUser@master1 ~]$ ps aux | grep 32156
presto   32156  4.1  1.4 27902440 1718384 ?    Ssl  Jun12  55:08 java -cp /usr/lib/presto/lib/* -server -Xmx16G -XX:-UseBiasedLocking -XX:+UseG1GC -XX:G1HeapRegionSize=32M -XX:+ExplicitGCInvokesConcurrent -XX:+HeapDumpOnOutOfMemoryError -XX:+UseGCOverheadLimit -XX:OnOutOfMemoryError=kill -9 %p -XX:ReservedCodeCacheSize=512M -DHADOOP_USER_NAME=hive -Dsun.security.krb5.debug=true -Dlog.enable-console=true -Dnode.launcher-log-file=/var/log/presto/launcher.log -Dnode.environment=presto -Dlog.enable-console=false -Dlog.output-file=/var/log/presto/server.log -Dplugin.dir=/usr/lib/presto/lib/plugin -Dnode.server-log-file=/var/log/presto/server.log -Dcatalog.config-dir=/etc/presto/catalog -Dconfig=/etc/presto/config.properties -Dnode.data-dir=/var/lib/presto/data -Dnode.id=9324c4b8-c4ef-4629-9a1d-25354e1c7220 com.facebook.presto.server.PrestoServer
 
I've now run out of ideas, and I could use some additional insight into the underlying issue. If you need any additional information about the Presto configuration, let me know.