Query needs to run faster

UDA
Enthusiast

Query needs to run faster

Hello ..

I am trying to get one of my queries to run faster.

Table 1 : Contains Key 1, which will be unique, about 800,000 rows. The table contains 2 other columns, which are not being used in the join below.

Table 2 : Contains 6 columns, and is keyed on Key 1 (different key from Table1.Key1) and a date. This table contains Key 3 (which is the same as Table1.Key1), but is not the primary key. I need to extract 5 years of data from Table 2, where Table2.Key3 = Table1.Key1.

To make life easier, I extract 5 years of data from the production table, which contains many more columns) into a temp table. This brought in about 40M rows in about 20 seconds. I created Indexes on Table 1 and table 2 on matching keys and collected stats before and after loading the data, before the join.

The next step is to join Table 1 and Table 2, and extract only those rows that contain Table1.Key1 in Table2.Key3.

INSERT
INTO TABLE 3 (which is an empty table)
SELECT columns from Table 2
FROM Table 2
WHERE Table 2.Key1 = Table2.Key1
AND Table 2.Date = Table2.Date
AND Table 2.Key3 IN (
SELECT Key1
FROM Table 1
GROUP BY 1 )
GROUP BY 01, 02, 03, 04

The above query runs for about 50-60 minutes, and the Explains says it should run in about 35 seconds.
1. Is it the best way to write the query?
2. Is it not executing the sub-query for each row from table 2?
3. Is there not a way to write the query in such a way that the subquery executes just once for all rows in table 2?
4. Any other suggestions?

Thanks ..

2 REPLIES
Senior Apprentice

Re: Query needs to run faster

Hi John,
your narrative and your query are a bit confusing to me.
Could you please post the actual query, explain and some info about PIs and Statistics.

There should be no need for a temp table. And Secondary Indexes will not be used for that kind of query query. You praobably worked with OLTP databases before (Oracle/MS?)

"1. Is it the best way to write the query?"
No, because table1.key1 is already unique and you force Teradata to to an aggregation.

"2. Is it not executing the sub-query for each row from table 2?"
No, Teradata rewrites every query to some kind of join.

"3. Is there not a way to write the query in such a way that the subquery executes just once for all rows in table 2?"
No, but why would you want that, it would be extremely slow?

"4. Any other suggestions?"
It looks like a wrong PI of the target table or an unwanted crossjoin.

Dieter
Enthusiast

Re: Query needs to run faster

Does your explain show a product join/cross join?
Also the actual SQL Stmt and the actual explain plan would be beneficial.
Have any of the tables in the select statement got no data in them?
Are your stats correct and will the optimiser see the best explain plan given the stats provided?