Calculating time differences in days - cumulative sums by IDs - Partition over

Database

Calculating time differences in days - cumulative sums by IDs - Partition over

Hi all,

 

I have the following data scenario.

 

MSG_IDLINESTATUS_CHG_TIMELAG_STATUS_CHG_TIMEC TIME_SPENT_SEC  TIME_SPENT_DAYS CUMULATIVE_TIMELAST_EMAIL
14023441248311/1/17 1:09 AM                                          -  0
14023441248331/5/17 9:26 AM1/1/17 1:09 AM4 08:17                 375,420.00                                 4.35                                  4.350
14023441248342/1/17 11:16 AM1/5/17 9:26 AM27 01:50             2,339,400.00                               27.08                               31.421
14023441297911/1/17 3:25 AM                                          -  0
14023441297931/3/17 8:11 AM1/1/17 3:25 AM2 04:46                 189,960.00                                 2.20                                  2.200
14023441297941/3/17 8:11 AM1/3/17 8:11 AM0 00:00                                   -                                        -                                    2.201
14023441299111/1/17 3:30 AM                                          -  0
14023441299131/1/17 8:52 AM1/1/17 3:30 AM0 05:22                   19,320.00                                 0.22                                  0.220
14023441299141/1/17 8:52 AM1/1/17 8:52 AM0 00:00                                   -                                        -                                    0.221
14023441302211/1/17 3:40 AM                                          -  0
14023441302231/4/17 8:24 AM1/1/17 3:40 AM3 04:44                 276,240.00                                 3.20                                  3.200
14023441302242/6/17 6:06 PM1/4/17 8:24 AM33 09:42             2,886,120.00                               33.40                               36.601

 

 

- The original data includes the columns MSG_ID, LINE, and STATUS_CHG_TIME

- I managed to calculate:  LAG_STATUS_CHG_TIME, C, TIME_SPENT_SEC, and TIME_SPENT_DAYS
- I would like to calculate:  CUMULATIVE_TIME and LAST_EMAIL

 

What I managed to calculate are as follows:

 

LAG_STATUS_CHG_TIME:

MIN(STATUS_CHG_TIME) OVER (PARTITION BY MSG_ID  ORDER BY.MSG_ID, LINE, STATUS_CHG_TIME   ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS LAG_STATUS_CHG_TIME

 

C:

CAST(STATUS_CHG_TIME AS TIMESTAMP) - CAST(LAG_STATUS_CHG_TIME AS TIMESTAMP) DAY(4) TO MINUTE AS C

 

TIME_SPENT_SEC

(EXTRACT(DAY FROM C) *24*60 *60 + EXTRACT(HOUR FROM C) * 60 *60 + EXTRACT(MINUTE FROM C)*60) AS TIME_SPENT_SEC

 

 TIME_SPENT_DAYS

CAST(TIME_SPENT_SEC as Decimal (18,8))/86400 AS TIME_SPENT_DAYS

 

The TIME_SPENT_DAYS column could also be calculated as below (as seen on https://stackoverflow.com/questions/35633152/time-difference-in-hours-and-seconds-over-a-partition-w...), but I am missing the decimal precision:

, STATUS_CHG_TIME - MIN(STATUS_CHG_TIME) OVER (PARTITION BY MSG_ID ORDER BY MSG_ID, LINE, STATUS_CHG_TIME ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) DAY(4) AS TIME_DAYS_A

 

 *******************************

Given the above, I’d like to calculate the following two columns CUMULATIVE_TIME and LAST_EMAIL. How can I go about doing that? What I am getting is “Ordered analytical functions can’t be nested” as I try to calculate the CUMULATIVE_TIME column. Can this be done?

 

Your help you'd be greatly appreciated.

 

Thanks,

 

Alex


Accepted Solutions
Teradata Employee

Re: Calculating time differences in days - cumulative sums by IDs - Partition over

Ordered Analytic functions can't be "nested" directly, but you can put your existing query in a derived table expression FROM (your-query) AS alias or common table expression WITH cte AS (your-query) SELECT and then apply another layer of Ordered Analytic functions in an outer query.

1 ACCEPTED SOLUTION
4 REPLIES
Teradata Employee

Re: Calculating time differences in days - cumulative sums by IDs - Partition over

Ordered Analytic functions can't be "nested" directly, but you can put your existing query in a derived table expression FROM (your-query) AS alias or common table expression WITH cte AS (your-query) SELECT and then apply another layer of Ordered Analytic functions in an outer query.

Re: Calculating time differences in days - cumulative sums by IDs - Partition over

Hi Fred. Thanks for that. It was actually pretty easy to do after your instructions. The cumulative sum became:

 

select SUM(TIME_SPENT_DAYS) OVER (PARTITION BY DERIVED_TABLE.MSG_ID
ORDER BY DERIVED_TABLE.MSG_ID, DERIVED_TABLE.LINE, DERIVED_TABLE.STATUS_CHG_TIME
ROWS UNBOUNDED PRECEDING ) AS cumul_sum,

from (BIG QUERY) AS DERIVED_TABLE

 

////////////

 

Do you know how I can get the other column calculated? That is, within each MSG_ID, set the max record to 1 and the ealier records to zero. Would that be a case statement with a partition over in it?

 

MSG_ID      LAST_MSG_IS

**bleep**            0

**bleep**            0

**bleep**            1

YYY            0

YYY            0

YYY            1

 

 Your help is again appreciated. Thanks!

 

Alex

 

 

Teradata Employee

Re: Calculating time differences in days - cumulative sums by IDs - Partition over

Yes. Lots of options for the CASE expression.

 

Could use something like "LEAD_STATUS_CHG_TIME" IS NULL /* like LAG but with 1 FOLLOWING instead of 1 PRECEDING */

Or ROW_NUMBER partitioned on MSG_ID ordered timestamp descending = 1

Or don't partition, ordered MSG_ID/timestamp and check "calculated LEAD MSG_ID" IS NULL or <> current MSG_ID

Junior Contributor

Re: Calculating time differences in days - cumulative sums by IDs - Partition over

You should try to use a function with the same PARTITION/ORDER BY as the existing to get a single STATS step in Explain, i.e. Fred's LEAD.

 

 

But you don't need to nest at all, simply calculate the difference between the current row and the first row

FIRST_VALUE(STATUS_CHG_TIME) 
OVER (PARTITION BY MSG_ID 
     ORDER BY LINE, STATUS_CHG_TIME -- no need to repeat MSG_ID as it's already partitioned by it
     ROWS UNOUNDED PRECEDING) 

Btw, is it really possible that LINEs are not based on STATUS_CHG_TIME?

Otherwise you can remove one of them from ORDER.

 

current row - MIN OVER UNOUNDED PRECEDING to get the start time