Help required on the query using analytic function

Database
Enthusiast

Help required on the query using analytic function

Hi All,

I am trying to create an query using Analytic functions.I have 3 columns where A & B

are primary key columns for grouping and based on the start date time,i need to derive the CALCULATED END TIME.

Scenario 1:

Start time less than (Start time + 23.59 Minutes) then (Start time + 23.59),then the remaining records should have the

same end time until the start time records become greater than the previos record of CALCULATED END TIME.

Scenario 2:

if Start time greater than the previos record of CALCULATED END TIME,then Start time + 23.59 then the remaining records should have the same end time until the start time records become greater than the previos record of CALCULATED END TIME.

  A      B    START TIME        CALCULATED END TIME

101    1    1/1/2010 9:00        (+23:59) 2/1/2010 8:59

101    1    1/1/2010 14:00            2/1/2010 8:59

101    1    1/1/2010 22:00          2/1/2010 8:59

101    1    2/1/2010 7:00           2/1/2010 8:59

101    1    2/1/2010 8:00           2/1/2010 8:59

101    1    2/1/2010 18:00          (+23:59) 3/1/2010 17:59

101    1    2/1/2010 23:30        3/1/2010 17:59

101    1    3/1/2010 11:00        3/1/2010 17:59

101    1    3/1/2010 19:00          (+23:59) 4/1/2010 18:59

101    1    5/1/2010 5:00        (+23:59) 6/1/2010 4:59

101    1    5/1/2012 8:00        (+23:59) 6/1/2010 7:59

I have created some query like this for CALCULATED END TIME,but not getting the desired output:

CASE WHEN START_TIME <= min(START TIME + interval '23:59:59' HOUR TO SECOND) OVER(PARTITION BY A,B ORDER BY START_TIME)

THEN min(START_TIME + interval '23:59:59' HOUR TO SECOND) OVER(PARTITION BY A,B ORDER BY START_TIME

RESET WHEN START_TIME >= min(START TIME + interval '23:59:59' HOUR TO SECOND) OVER(PARTITION BY A,B

ORDER BY START_TIME ROWS BETWEEN 1 FOLLOWING   AND 1 FOLLOWING))

ELSE CASE WHEN START_TIME >= MIN(START TIME + interval '23:59:59' HOUR TO SECOND) OVER (PARTITION BY A,B ORDER BY START_TIME)

THEN MIN(START_TIME + interval '23:59:59' HOUR TO SECOND) OVER (PARTITION BY A,B ORDER BY START_TIME

RESET WHEN START_TIME > min(START_TIME + interval '23:59:59' HOUR TO SECOND) OVER(PARTITION BY A,B

ORDER BY START_TIME ROWS BETWEEN   1 PRECEDING AND CURRENT ROW) ROWS UNBOUNDED PRECEDING)

END

END AS CALCULATED END TIME

Thanks

Bala

7 REPLIES
Senior Apprentice

Re: Help required on the query using analytic function

Hi Bala,

OLAP functions are probably not the right approach (i tried them on a similar problem a few years ago and didn't find an efficient solution), but this can be easily solved (if the number of rows per a/b combination is not too large) using recursion:

CREATE VOLATILE TABLE vt (a NOT NULL, b NOT NULL, rn NOT NULL, start_time NOT NULL) AS 
(
SELECT
a,b, ROW_NUMBER() OVER (PARTITION BY a,b ORDER BY start_time) AS rn, start_time
FROM tab
) WITH DATA UNIQUE PRIMARY INDEX (a,b, rn) ON COMMIT PRESERVE ROWS
;
WITH RECURSIVE cte (rn, a, b, start_time, end_time) AS
(
SELECT rn, a, b, start_time,
start_time + INTERVAL '23:59:59' HOUR TO SECOND AS end_time
FROM vt
WHERE rn=1

UNION ALL

SELECT
vt.rn, vt.a, vt.b, vt.start_time,
CASE WHEN vt.start_time > cte.end_time THEN vt.start_time + INTERVAL '23:59:59' HOUR TO SECOND ELSE cte.end_time END
FROM vt JOIN cte
ON vt.a = cte.a AND vt.b = cte.b AND vt.rn = cte.rn+1
)
SELECT * FROM cte
ORDER BY start_time
;
DROP TABLE vt;

Dieter

Enthusiast

Re: Help required on the query using analytic function

Hi Dieter,

Thanks for your response.I usually follow up your posts.Also will check out your SQL.

Please clarify my queries regading the OLAP functions:

1. As per my knowledge,the reset command within the OLAP function should create the sub partition when evalutes to true.so in my case statement the below reset will take the second min (ie., (+23:59) 3/1/2010 17:59) when it evalutes to true?

RESET WHEN START_TIME >= min(START TIME + interval '23:59:59' HOUR TO SECOND) OVER(PARTITION BY A,B

ORDER BY START_TIME ROWS ROWS BETWEEN 1 FOLLOWING   AND 1 FOLLOWING

2. how does the rows ,rows between (preceding,following,unbounded preceding / following) works?

Thanks,

Bala

Enthusiast

Re: Help required on the query using analytic function

Hi Dieter,

Thanks for the query.Your query works amazing as per the requirement.

I have few questions on that:

1) Recrsion an multiple AMP operation as per the PI we create or an Single AMP Operation like cursor?

2) Also to avoid the infinite looping in the recursive view,can we add an condition to the query or use an recursive hint ?

   ex: OPTION (MAXRECURSION 20)

Query with condition:

----------------------------

WITH RECURSIVE cte (rn, a, b, start_time, end_time)

AS

(

SELECT rn, a, b, start_time,start_time + INTERVAL '23:59:59' HOUR TO SECOND AS end_time

FROM vt

WHERE rn=1  

UNION ALL 

SELECT  vt.rn, vt.a, vt.b,vt.start_time, CASE WHEN vt.start_time > cte.end_time THEN vt.start_time + INTERVAL '23:59:59' HOUR TO SECOND ELSE cte.end_time END FROM

vt

JOIN

cte    

ON vt.a = cte.a

AND vt.b = cte.b

AND vt.rn = cte.rn+1

AND cte.end_time < max(vt.start_time + INTERVAL '24' HOUR)

SELECT * FROM cte ORDER BY start_time ;

Thanks,

Bala

Senior Apprentice

Re: Help required on the query using analytic function

Hi Bala,

#1: recursion is a parallel operation, so it's much faster than a cursor which is processed serially

#2: There's no built-in recursion level limit and there's no option like MAXRECURSION. You have to do it on your own and add a column calculating the level:

WITH RECURSIVE cte (rn, a, b, start_time, end_time, lvl) AS
(
SELECT rn, a, b, start_time,
start_time + INTERVAL '23:59:59' HOUR TO SECOND AS end_time,
CAST(1 AS SMALLINT) AS lvl
FROM vt
WHERE rn=1

UNION ALL

SELECT
vt.rn, vt.a, vt.b, vt.start_time,
CASE WHEN vt.start_time > cte.end_time THEN vt.start_time + INTERVAL '23:59:59' HOUR TO SECOND ELSE cte.end_time END,
lvl + 1
FROM vt JOIN cte
ON vt.a = cte.a AND vt.b = cte.b AND vt.rn = cte.rn+1
WHERE lvl < 20
)
SELECT * FROM cte
ORDER BY start_time;

Dieter

Enthusiast

Re: Help required on the query using analytic function

Thanks a lot Dieter.Your SQL worked as expected.

I have few questions regarding my old query with analytic functions:

1. As per my knowledge,the reset command within the OLAP function should create the sub partition when evalutes to true.so in my case statement the below reset will take the second min (ie., (+23:59) 3/1/2010 17:59) when it evalutes to true?

RESET WHEN START_TIME >= min(START TIME + interval '23:59:59' HOUR TO SECOND) OVER(PARTITION BY A,B

ORDER BY START_TIME ROWS ROWS BETWEEN 1 FOLLOWING   AND 1 FOLLOWING

2. how does the rows ,rows between (preceding,following,unbounded preceding / following) works

 Thanks,

 Bala

Senior Apprentice

Re: Help required on the query using analytic function

Hi Bala, 

#1: This is not going to reset START_TIME when the first row evaluates to true. It's simply creating a new partition *after* the first row evaluates to true. This would require a kind of recursive approach, that's why there's recursion :-)

#2: You should already know how ROWS works when you try to write queries based on that ;-)

Logically it's similar to a Correclated Subquery:

For each row within the answer set the aggregate will be recalculated based on all rows

between a starting row and an ending row.

Starting and ending row might be fixed or relative to the current row based on the following

keywords:

UNBOUNDED PRECEDING, all rows before the current row -> fixed

UNBOUNDED FOLLOWING, all rows after the current row -> fixed

x PRECEDING, x rows before the current row -> relative

y FOLLOWING, y rows after the current row -> relative

Check the Functions & Operator manual for more details.

Dieter

Enthusiast

Re: Help required on the query using analytic function

Thanks a lot Dieter...

Your answers are simple & clear....

Thanks,

Bala