Query running for 3 hours to get the result

Database
The Teradata Database channel includes discussions around advanced Teradata features such as high-performance parallel database technology, the optimizer, mixed workload management solutions, and other related technologies.
Enthusiast

Query running for 3 hours to get the result

Hi Team,

 

Can anyone please check the below query,  to prove me the better solution... as the query is taking 3 hours to get the result (its a history table.. has huge data)

 

 

SELECT C1
|| ','
|| C2
|| ','
|| C3
|| ','
|| C4
|| ','
|| C5
|| ','
|| CASE
WHEN grp_id IN (3)
THEN prod_strength
WHEN grp_id IN (4)
THEN prod_strength
ELSE '0'
END
|| ','
|| src_id
|| ','
|| filename
|| ','
|| ','
|| ','
|| CASE
WHEN TO_NUMBER (qty) < 0
THEN '-'
ELSE '+'
END AS col_values,
land_id AS unq_val
FROM HIST_TABLE
WHERE (NVL (C1, '0'),
NVL (C2, '0'),
NVL (C3, '0'),
NVL (C4, '0'),
NVL (C5, '0'),
NVL (CASE
WHEN grp_id IN (3)
THEN prod_strength
WHEN grp_id IN (4)
THEN prod_strength
ELSE '0'
END,
'0'
),
NVL (src_id, '0'),
NVL (filename, '0'),
NVL (CASE
WHEN TO_NUMBER (qty) < 0
THEN '-'
ELSE '+'
END, 0)
) IN (
SELECT NVL (C1, '0'), NVL (C2, '0'),
NVL (C3, '0'), NVL (C4, '0'),
NVL (C5, '0'),
NVL (CASE
WHEN grp_id IN (3)
THEN prod_strength
WHEN grp_id IN (4)
THEN prod_strength
ELSE '0'
END,
'0'
),
NVL (src_id, '0'), NVL (filename, '0'),
NVL (CASE
WHEN TO_NUMBER (qty) < 0
THEN '-'
ELSE '+'
END, 0)
FROM HIST_TABLE
GROUP BY NVL (C1, '0'),
NVL (C2, '0'),
NVL (C3, '0'),
NVL (C4, '0'),
NVL (C5, '0'),
NVL (CASE
WHEN grp_id IN (3)
THEN prod_strength
WHEN grp_id IN (4)
THEN prod_strength
ELSE '0'
END,
'0'
),
NVL (src_id, '0'),
NVL (filename, '0'),
NVL (CASE
WHEN TO_NUMBER (qty) < 0
THEN '-'
ELSE '+'
END, 0)
HAVING COUNT (*) > 1)

2 REPLIES
Teradata Employee

Re: Query running for 3 hours to get the result

First thought: instead of the IN condition, do INNER JOIN to a derived table subquery, e.g. instead of

SELECT ... FROM tbl WHERE (A, ..., Z) IN (SELECT A, ..., Z FROM tbl GROUP BY A, ..., Z HAVING COUNT(*) > 1)

try

SELECT ... FROM tbl t1 INNER JOIN (SELECT A, ..., Z FROM tbl GROUP BY A, ..., Z HAVING COUNT(*) > 1) AS t2

ON t1.A = t2.A AND ... AND t1.Z = t2.Z

 

In your case, A, ..., Z would sometimes be CASE or NVL/COALESCE expressions instead of simple column references, e.g. NVL(t1.A,'0') = NVL(t2.A,'0').

Junior Contributor

Re: Query running for 3 hours to get the result

Who wrote that?

This is checking for duplicate across those columns (count(*) > 1) and then returns those duplicates n times.

 

And all those NVLs (probably due to the IN), if you actually got NULLs the concatenated col_values will be NULL, too.

 

Anyway, you should be able to simplify it using a GROUP COUNT like this:

SELECT C1
|| ','
|| C2
|| ','
|| C3
|| ','
|| C4
|| ','
|| C5
|| ','
|| CASE
WHEN grp_id IN (3)
THEN prod_strength
WHEN grp_id IN (4)
THEN prod_strength
ELSE '0'
END
|| ','
|| src_id
|| ','
|| filename
|| ','
|| ','
|| ','
|| CASE
WHEN TO_NUMBER (qty) < 0
THEN '-'
ELSE '+'
END AS col_values,
land_id AS unq_val
FROM HIST_TABLE
QUALIFY Count(*) Over (PARTITION BY col_values) > 1

 

It's probably more efficient to nest the COUNT OVER on the base columns (without NVL) in a Derived Table and concatenate only the resulting rows.