We're integrating data from Teradata and Hadoop using Presto.
As per the open-source Presto flow, to run a unified query we connect a Presto client to the Presto coordinator, which then manages reading data from Data Source 1, Data Source 2, and so on.
Is the flow the same in Teradata's Presto version if one of my data sources is Teradata?
Is there any way we can issue the query from the TD box and have it collect data from both TD tables and Hadoop files?
Please ignore this if it is something silly or I have not explained it well; I'm very new to Teradata, and to Presto as well.
HN, I am trying to understand exactly what you are asking. Are you attempting to query from Presto to Teradata? If so, you will need the QueryGrid connector for Presto. The QG connector allows you to query from TD to Presto and from Presto to TD. I have a description of the QG connector in this brochure: http://www.teradata.com/Resources/Datasheets/QueryGrid-and-Presto-Enabling-faster-more-scalable-interactive... With the QG connector for Presto you can execute a single query that joins data from Hadoop and Teradata. You can initiate that query from either Teradata or Presto within the QG architecture. Teradata supports connectivity between TD and Presto only through QueryGrid, since it is optimized and fully parallel. Does that answer your question?
I want to run queries from Teradata to Presto so that analytical applications that previously used only TD data can now fetch data from Hadoop and join it with TD tables, giving more insight and meaning to the data.
Also, if I can query from TD, then the applications that were using TD views earlier will keep functioning the same way. I would only have to create a new (updated) view that has joins/unions between TD and Hadoop data.
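To make the view idea concrete, here is a rough, self-contained sketch of the pattern. It does not use Teradata, Presto, or QueryGrid syntax: two in-memory SQLite databases are stand-ins for the TD system and the Hadoop side, and all table, column, and view names are made up for illustration. It only shows that an application reading a view does not need to know the view now joins two sources.

```python
import sqlite3

# Stand-ins only (assumption): "main" plays the Teradata system and
# "hadoop" plays the Hadoop side. Neither is a real TD or Presto connection.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS hadoop")

# Hypothetical tables: customers live on the "TD" side, clicks on the "Hadoop" side.
conn.execute("CREATE TABLE main.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE hadoop.clicks (customer_id INTEGER, page TEXT)")
conn.executemany("INSERT INTO main.customers VALUES (?, ?)",
                 [(1, "Asha"), (2, "Ben")])
conn.executemany("INSERT INTO hadoop.clicks VALUES (?, ?)",
                 [(1, "/home"), (1, "/cart"), (2, "/home")])

# The "updated view": it joins both sources behind a single name.
# In SQLite only a TEMP view may reference tables in several attached databases.
conn.execute("""
    CREATE TEMP VIEW customer_activity AS
    SELECT c.id, c.name, COUNT(k.page) AS page_views
    FROM main.customers AS c
    LEFT JOIN hadoop.clicks AS k ON k.customer_id = c.id
    GROUP BY c.id, c.name
""")

def fetch_activity(connection):
    """Application-side query: it only knows the view name, not the sources."""
    return connection.execute(
        "SELECT id, name, page_views FROM customer_activity ORDER BY id"
    ).fetchall()

rows = fetch_activity(conn)
print(rows)
```

In a real setup the view definition would live in Teradata and reference the remote data through whatever mechanism the QueryGrid connector provides; the sketch only mirrors the shape of that arrangement.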
I read the above link and I believe QG with Presto will solve this problem. Please let me know if my understanding is correct.
Would it be possible for you to share some installation-related information for the whole setup? For example:
> Where should the Presto server (Presto coordinator) be installed: on the TD box or in the Hadoop cluster?
> What criteria decide how many workers need to be installed? Is it the same as the total number of Hadoop data nodes?
> Where is the best place to install a worker? Should it be on all data nodes so that we achieve data locality, or can we install it on any machine?
> Is there any performance degradation on the current TD box after installing QG on it?
Yes Harihar, you are correct: you would need to purchase the QueryGrid connector for Presto to be able to do what you want. If you would like to have a deeper discussion about QueryGrid and Presto specifics and configuration, I can set up a call so we can discuss. Please e-mail me at firstname.lastname@example.org