When installing Presto on EMR I've run into two small issues that combine into a slightly larger problem. First, Presto requires Java 1.8 build 60 or higher in order to be deployed. This used to be an issue with older EMR versions, but the most recent version of EMR (5.3.1) includes Java 1.8 build 120. Despite this, Presto still requires the Java 1.8 RPM to be deployed in order to be installed successfully. Including the Java RPM would be fine if it were not for the limited root volume storage on EMR's EC2 nodes. Of the default 10GB root volume, ~4GB is already used by what AWS installs automatically. The entire Presto package (including the Java RPM) is ~6GB. Not having to include the Java RPM would bring us under the 10GB threshold. This is only an installation issue, as it appears the temporary Python log tables are removed after the server is successfully installed. Does anyone have any insight on how to work around this issue (preferably without having to resize the default root volume for EMR on AWS)?
Presto requires the Oracle Java 1.8u92+ RPM, whereas EMR comes with OpenJDK. You could try installing Java first, deleting the Java RPM, and then downloading and installing Presto.
Also, how did you arrive at 6GB for the package size? You only need the server package on EMR, which should be much smaller than 6GB.
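To see why installing Java first and deleting its RPM before fetching Presto might help, here is a rough Python sketch of the disk budget. The package sizes are made-up placeholders, not measured values, and `peak_usage_gb` and `free_gb` are hypothetical helper names. The only library call used is `shutil.disk_usage`.

```python
import shutil

GB = 1024 ** 3


def free_gb(path="/"):
    """Return the free space, in GB, on the filesystem containing `path`."""
    return shutil.disk_usage(path).free / GB


def peak_usage_gb(base_gb, packages, delete_installers=True):
    """Estimate peak disk usage while installing `packages` in order.

    `packages` is a list of (installer_gb, installed_gb) pairs.
    If `delete_installers` is True, each installer file is removed
    before the next one is downloaded.
    """
    used = base_gb
    peak = used
    for installer_gb, installed_gb in packages:
        # The installer file and the installed files coexist during install.
        used += installer_gb + installed_gb
        peak = max(peak, used)
        if delete_installers:
            used -= installer_gb
    return peak


# Hypothetical sizes (assumptions for illustration only):
# ~4 GB already used on the root volume, then a Java RPM and a Presto package.
base = 4
packages = [(1.0, 0.5), (2.0, 1.0)]

print(peak_usage_gb(base, packages))                           # 7.5
print(peak_usage_gb(base, packages, delete_installers=False))  # 8.5
print(f"free on /: {free_gb('/'):.1f} GB")
```

With these placeholder numbers, removing each installer as soon as it has been used lowers the peak by the size of the Java RPM. Whether that is enough to stay under 10GB depends on the real package sizes, which is why it is worth checking how the 6GB figure was obtained.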