Optimization of a Join Query

Database
Enthusiast

Optimization of a Join Query

Hi,

I've a join query statement in Teradata which I'm trying to optimize inorder to create a report. This query runs on a daily basis. There are 2 parts in the query (by two parts, I means two subqueries). Both subqueries are the same except for one date condition which is different in them. During business hours, this query as a whole takes about ~15 mins to run (I arrived at this time by calculating the individual run times of both the subqueries separately) and it is captured & aborted by the admin / SQL team. During off-business hours, this query runs faster within 2 or 3 mins.

Is there a way to optimize this query to run it during business hours ? Can the subqueries be combined to avoid the Left Outer Join? I'm using Teradata 13.11. The structure of the query is as follows :

Select X.field1, X.field2, X.field3, X.field4, …
Case when Z.field1 is null then ‘Y’
Else ‘N’
End as Flag

from
(
Select A.field1, A.field2, B.field3, C.field4, …
from table1 A
left join
(select field1, field2, … from Table_B
where date_field = case when day_of_week(date) = 2 then date-3 else date -1 end
and <condition_1>
and <condition_2>
) as B
on A.field1 = B.field1 and A.date_field = B.date_field

left join
(select field1, field2, … from Table_C
where date_field = case when day_of_week(date) = 2 then date-3 else date -1 end
and <condition_1>
and <condition_2>
) as C
on A.field1 = C.field1 and A.date_field = C.date_field

left join
(select field1, field2, … from Table_D
where date_field = case when day_of_week(date) = 2 then date-3 else date -1 end
and <condition_1>
and <condition_2>
qualify over ….
) as D
on A.field1 = D.field1 and A.date_field = D.date_field

where a.date_field = case when day_of_week(date) = 2 then date-3 else date -1 end
and <condition_1>
and <condition_2>
and <condition_3>
and <condition_4>
and <condition_5>
) as X

left join
(
Select A.field1
From table1 A
left join
(select field1, field2, … from Table_B
where date_field = case when day_of_week(date) = 2 then date-10 else date -8 end
and <condition_1>
and <condition_2>
) as B
on A.field1 = B.field1 and A.date_field = B.date_field

left join
(select field1, field2, … from Table_C
where date_field = case when day_of_week(date) = 2 then date-10 else date -8 end
and <condition_1>
and <condition_2>
) as C
on A.field1 = C.field1 and A.date_field = C.date_field

left join
(select field1, field2, … from Table_D
where date_field = case when day_of_week(date) = 2 then date-10 else date -8 end
and <condition_1>
and <condition_2>
qualify over ….
) as D
on A.field1 = D.field1 and A.date_field = D.date_field

where a.date_field = case when day_of_week(date) = 2 then date-10 else date -8 end
and <condition_1>
and <condition_2>
and <condition_3>
and <condition_4>
and <condition_5>
) as Z

on X.field1 = Z.field1
6 REPLIES
Enthusiast

Re: Optimization of a Join Query

It 's difficult to say without explain plan. Is this just a query or a JI exists.... stats collected ....partitioning...PIs info.... and many more to look into.

Enthusiast

Re: Optimization of a Join Query

I'll try to post the explain plan information. All I can say now is that the data is fetched from views which fetches huge volumes of data if the date field is not used.

How can this query be modified? Can someone help me out?

Junior Contributor

Re: Optimization of a Join Query

Hi Sarang,

seems like you only want a flag to detemine if data from the previous week exists.

You could extract rows for both dates in a single query and then use an OLAP function to check if there's a row a week ago.

SELECT
...
-- check if rows from both weeks exist
CASE WHEN COUNT(*)
OVER (PARTITION BY field1) = 1
THEN 'N'
ELSE 'Y'
end
FROM
( SELECT -- returns two rows if both dates exist
date_field
...
where date_field = case when day_of_week(date) = 2 then date-10 else date -8 end
or date_field = case when day_of_week(date) = 2 then date-3 else date -1 end
) AS dt
QUALIFY -- only rows from the current week
date_field = case when day_of_week(date) = 2 then date-3 else date -1 end

As Raja said DDL plus Explain and/or data from QryLogStepsV might help.

Enthusiast

Re: Optimization of a Join Query

Hi Dieter,

Thanks for your logic. In the tables that are used in my original query, the fields from the subquery Z are extracted based on the below date condition along with other conditions which are applied before the flag value is calculated.

where date_field = case when day_of_week(date) = 2 then date-10 else date -8 end

Since the same set of tables are used in both the subqueries X and Z, field1 (which is applied in the join condition) will exist in both subqueries and the data extracted for the previous week may or may not occur based on the other conditions applied.

This being the case, how can the flag be calculated without the join being applied ?

Enthusiast

Re: Optimization of a Join Query

Here's the original query structure that is use, but for privacy reason, I've masked the real field names...

SELECT 
Current.date_field, Current.field1, Current. field2, Current. field3, Current. field4,
Current. field5, Current. field6, Current. field7, Current. field8, Current. field9, Current. field10,
Current. field11, Current. field12, Current. field13, Current. field14, Current. field15, Current. field16,
CASE
WHEN Previous.field1 IS NULL THEN 'Y'
WHEN Previous.field1 IS NOT NULL THEN 'N'
ELSE NULL
END AS flag,
(Current.date_field -Current. date_field2) AS calc_field2,
CASE
WHEN flag = 'N' THEN
CASE WHEN calc_field2 > 90 THEN ‘Red’
ELSE calc_field2 between 85 AND 90 THEN 'Yellow'
ELSE ‘Green’
END
WHEN flag = 'Y' THEN
CASE
WHEN calc_field2 >= 85 THEN ‘Yellow’
ELSE 'Green'
END
END AS calc_field3

FROM
(
SELECT A.date_field, A.field1, A.date_field2, A.field3, A. field4, A. field5, G.field6, H.field7, I.field1, I.field2, I.field3, I.field4, J.field1, F.field2, F.field3, F. field4, F. field5, F. field6, F. field7, D.field2 FROM
A1 AS A
LEFT JOIN (
SELECT field1, field2, field3, field4, field5, field6, field7 FROM B1
QUALIFY ROW_NUMBER() OVER (PARTITION BY field1, field2, field3, field4, field5, field6 ORDER BY field7) = 1 ) AS F
ON A. field7 = F. field1
LEFT JOIN (
SELECT * FROM C1
QUALIFY ROW_NUMBER() OVER (PARTITION BY field1, date_field ORDER BY field3) =1 ) AS D
ON A.field1 = D.field1 AND A.date_field = D.date_field
LEFT JOIN
G1 AS G
ON A.field1 = G.field1 AND A.date_field = G.date_field
LEFT JOIN
H1 AS H
ON A.field1 = H.field1
LEFT JOIN
I1 AS I
ON A.field1 = I.field1
LEFT JOIN
J1 AS J
ON A.field1 = J.field1
WHERE
A.date_field =
CASE
WHEN day_of_week(DATE) = 2 THEN DATE - 3
ELSE DATE - 1
END
AND field4 > 0
AND COALESCE(A.field10,'') NOT IN ('A')
AND F.field2 IN ('Arnold Schwarzenegger')
AND (A.date_field – A.date_field2) <= 60 --Not over 60 days
AND A.field1 IN (
SELECT field1 FROM (
SELECT * FROM (
SELECT field1, field2, field3, field4, date_field FROM X WHERE
date_field = ( CASE WHEN day_of_week(DATE) = 2 THEN DATE - 3 ELSE DATE - 1 END )
AND field2 IN ('RECRCV', 'RECFCL', 'RECRTN')
QUALIFY ROW_NUMBER() OVER (PARTITION BY field1, field2 ORDER BY field3 DESC, field4 DESC) =1
) AS BA
QUALIFY ROW_NUMBER() OVER (PARTITION BY field1 ORDER BY field3 DESC, field4 DESC) = 1 ) AS BAA
WHERE field 2IN ('RECRCV') )
) AS Current

LEFT JOIN

(
SELECT A.date_field, A.field1, A.date_field2, A.field3, A. field4, A. field5, G.field6, H.field7, I.field1, I.field2, I.field3, I.field4, J.field1, F.field2, F.field3, F. field4, F. field5, F. field6, F. field7, D.field2 FROM
A1 AS A
LEFT JOIN (
SELECT field1, field2, field3, field4, field5, field6, field7 FROM B1
QUALIFY ROW_NUMBER() OVER (PARTITION BY field1, field2, field3, field4, field5, field6 ORDER BY field7) = 1 ) AS F
ON A. field7 = F. field1
LEFT JOIN (
SELECT * FROM C1
QUALIFY ROW_NUMBER() OVER (PARTITION BY field1, date_field ORDER BY field3) =1 ) AS D
ON A.field1 = D.field1 AND A.date_field = D.date_field
LEFT JOIN
G1 AS G
ON A.field1 = G.field1 AND A.date_field = G.date_field
LEFT JOIN
H1 AS H
ON A.field1 = H.field1
LEFT JOIN
I1 AS I
ON A.field1 = I.field1
LEFT JOIN
J1 AS J
ON A.field1 = J.field1
WHERE
A.date_field =
CASE
WHEN day_of_week(DATE) = 2 THEN DATE - 10
ELSE DATE - 8
END
AND field4 > 0
AND COALESCE(A.field10,'') NOT IN ('A')
AND F.field2 IN ('Arnold Schwarzenegger ')
AND (A.date_field – A.date_field2) <= 60 --Not over 60 days
AND A.field1 IN (
SELECT field1 FROM (
SELECT * FROM (
SELECT field1, field2, field3, field4, date_field FROM X WHERE
date_field = ( CASE WHEN day_of_week(DATE) = 2 THEN DATE - 10 ELSE DATE - 8 END )
AND field2 IN ('RECRCV', 'RECFCL', 'RECRTN')
QUALIFY ROW_NUMBER() OVER (PARTITION BY field1, field2 ORDER BY field3 DESC, field4 DESC) =1
) AS BA
QUALIFY ROW_NUMBER() OVER (PARTITION BY field1 ORDER BY field3 DESC, field4 DESC) = 1 ) AS BAA
WHERE field 2IN ('RECRCV') )
) AS Previous

ON Current.field1 = Previous.field1 ;
Junior Contributor

Re: Optimization of a Join Query

As both Derived Table use exactly the same conditions (besides dat_col) you can use the approach I showed. Fold both Derived Tables into one using an or-ed condition on date_col and add the date_col to your QUALIFY to get both rows:

QUALIFY ROW_NUMBER()
OVER (PARTITION BY field1, date_col
ORDER BY field3 DESC, field4 DESC) = 1 ) AS BAA