Volatile Table In View ??

Database
Enthusiast

Volatile Table In View ??

Hi,

I am writing a view within which i have some restriction to apply.

Following is my current working query :

REPLACE VIEW person_view
AS LOCKING ROW FOR ACCESS

SELECT distinct
date_time ,
person_id ,
person_key ,
name ,
address

FROM _person a

inner join

(select max(date_time) as max_date_time , person_id as max_person_id , person_key as max_person_key
from _person group by max_person_id , max_person_key) b

on a.date_time = b.max_date_time
and a.person_id = b.max_person_id
and a.person_key = b.max_person_key;

The above query works but i am not sure about:

  1. Performance of the query ?
  2. Is there a possibility to create a volatile table. It should be better as its indexed.
  3. How does TERADATA handle the inner table ?

Appreciate your inputs.

2 REPLIES
Senior Apprentice

Re: Volatile Table In View ??

#1: Those MAX/MIN queries can usually be rewritten using RANK/ROW_NUMBER:

SELECT *
FROM _person
QUALIFY
RANK()
OVER (PARTITION BY person_id, person_key
ORDER BY date_time DESC) = 1

In most cases this is (much) faster, only in some specific cases when the GROUP BY/JOIN is the PI and you rewrite it as a Correlated Subquery it might be a bit worse. I always prefer the OLAP version as it's also much easier to write.

#2: You can't create a Volatile Table within a View. You could something similar in a MACRO: INSERT/SELECT into a Global Temporary Table and then use it.

#3: See Explain. It's materializing the result of the Derived Table. This is similar to a CREATE VOLATILE TABLE, that's why you usually don't care about it. Just let the optimizer do it's work.

Dieter

Enthusiast

Re: Volatile Table In View ??

Hi,

Thanks for the inputs. It helped me resolve the issue. However, i would like to add the following:

I tried the 

SELECT *
FROM _person
QUALIFY
RANK()
OVER (PARTITION BY person_id, person_key
ORDER BY date_time DESC) = 1

option as well as the following :

where 
a.date_time = ( select max(b.date_time)
from _person b
where a.person_id = b.person_id
and a.person_key = b.person_key )

Though both are giving response times that are acceptable for me; the IO & Spool Usage by the former query is higher.

Just wanted to know the reason.