overlapping minutes between different rows

Database
Highlighted
Enthusiast

overlapping minutes between different rows

If I had a table of customers doing overlapping activities but stored at different session ids, would it be possible to count a pro-rated number of the overlapping time in some way? Would like the "pro-rated" field to be calculated. In reality I don't think we'll see more than 3-4 concurrent sessions at any given time, but would need to generalize this. The partition would be on customer_id and date. 

 

session_idcustomer_iddatestart_timeend_timedurationpro-rated
112019-02-0210:1510:30159.17
212019-02-0210:2010:402014.17
312019-02-0210:2510:3051.67
412019-02-031:151:301515.00
522019-02-031:151:301515.00
622019-02-031:301:451515.00


Session 1: 5 minutes * 1 session + 5 minutes * 1/2 sessions + 5 minutes * 1/3 sessions = 9.17 minutes

Session 2: 5 minutes * 1/2 sessions + 5 minutes * 1/3 sessions + 10 minutes * 1 session = 14.17 minutes

Session 3: 5 minutes * 1/3 sessions = 1.67 minutes

Session 4: 15 minutes * 1 session

Session 5: 15 minutes * 1 session

Session 6: 15 minutes * 1 session


Accepted Solutions
Teradata Employee

Re: overlapping minutes between different rows

Hi phoenix373,

 

I've used the powerfull TD_SEQUENCED_COUNT function here to achieve your need.

 

Datas

create multiset volatile table mvt_datas, no log
( session_id    byteint
, customer_id   byteint
, session_dt    date    format 'yyyy-mm-dd'
, start_time    time(0)
, end_time      time(0)
)
primary index (customer_id, session_dt)
on commit preserve rows;

insert into mvt_datas values (1, 1, date '2019-02-02', time '10:15:00', time '10:30:00');
insert into mvt_datas values (2, 1, date '2019-02-02', time '10:20:00', time '10:40:00');
insert into mvt_datas values (3, 1, date '2019-02-02', time '10:25:00', time '10:30:00');
insert into mvt_datas values (4, 1, date '2019-02-03', time '01:15:00', time '01:30:00');
insert into mvt_datas values (5, 2, date '2019-02-03', time '01:15:00', time '01:30:00');
insert into mvt_datas values (6, 2, date '2019-02-03', time '01:30:00', time '01:45:00');

collect stats column (customer_id, session_dt) on mvt_datas;

Query

with cte_seq_cnt (customer_id, session_dt, ph_session, duration_prr) as
(
select sc.customer_id
     , sc.session_dt
     , sc.ph_session
     , extract(minute from ((end(sc.ph_session) - begin(sc.ph_session)) minute(4))) / cast(sc.cnt as decimal(10,3))
  from table (td_sequenced_count(new variant_type(cte_period.customer_id, cte_period.session_dt),
                                  cte_period.ph_session)
              returns ( customer_id byteint, session_dt  date
                      , cnt         integer
                      , ph_session  period(timestamp(0))
                      )
              hash by customer_id, session_dt
              local order by customer_id, session_dt, ph_session) as sc (customer_id, session_dt, cnt, ph_session)
)
  ,  cte_period (session_id, customer_id, session_dt, start_time, end_time, ph_session) as
(
  select session_id
       , customer_id
       , session_dt
       , start_time
       , end_time
       , period(cast(session_dt as timestamp(0)) + (start_time - time '00:00:00' hour to second),
                cast(session_dt as timestamp(0)) + (end_time   - time '00:00:00' hour to second)) as ph_session
    from mvt_datas
)
  select cpr.session_id
       , cpr.customer_id
       , cpr.session_dt
       , cpr.start_time
       , cpr.end_time
       , extract(minute from ((end(cpr.ph_session) - begin(cpr.ph_session)) minute(4))) as duration
       , sum(csc.duration_prr) as pro_rated
    from cte_seq_cnt as csc
    join cte_period  as cpr  on cpr.customer_id       = csc.customer_id
                            and cpr.session_dt        = csc.session_dt
                            and cpr.ph_session contains csc.ph_session
group by 1, 2, 3, 4, 5, 6
order by 1 asc;
    

session_id  customer_id  session_dt  start_time  end_time  duration  pro_rated
----------  -----------  ----------  ----------  --------  --------  ---------
         1            1  2019-02-02    10:15:00  10:30:00        15      9.167
         2            1  2019-02-02    10:20:00  10:40:00        20     14.167
         3            1  2019-02-02    10:25:00  10:30:00         5      1.667
         4            1  2019-02-03    01:15:00  01:30:00        15     15.000
         5            2  2019-02-03    01:15:00  01:30:00        15     15.000
         6            2  2019-02-03    01:30:00  01:45:00        15     15.000

 

1 ACCEPTED SOLUTION
4 REPLIES 4
Teradata Employee

Re: overlapping minutes between different rows

Hi phoenix373,

 

I've used the powerfull TD_SEQUENCED_COUNT function here to achieve your need.

 

Datas

create multiset volatile table mvt_datas, no log
( session_id    byteint
, customer_id   byteint
, session_dt    date    format 'yyyy-mm-dd'
, start_time    time(0)
, end_time      time(0)
)
primary index (customer_id, session_dt)
on commit preserve rows;

insert into mvt_datas values (1, 1, date '2019-02-02', time '10:15:00', time '10:30:00');
insert into mvt_datas values (2, 1, date '2019-02-02', time '10:20:00', time '10:40:00');
insert into mvt_datas values (3, 1, date '2019-02-02', time '10:25:00', time '10:30:00');
insert into mvt_datas values (4, 1, date '2019-02-03', time '01:15:00', time '01:30:00');
insert into mvt_datas values (5, 2, date '2019-02-03', time '01:15:00', time '01:30:00');
insert into mvt_datas values (6, 2, date '2019-02-03', time '01:30:00', time '01:45:00');

collect stats column (customer_id, session_dt) on mvt_datas;

Query

with cte_seq_cnt (customer_id, session_dt, ph_session, duration_prr) as
(
select sc.customer_id
     , sc.session_dt
     , sc.ph_session
     , extract(minute from ((end(sc.ph_session) - begin(sc.ph_session)) minute(4))) / cast(sc.cnt as decimal(10,3))
  from table (td_sequenced_count(new variant_type(cte_period.customer_id, cte_period.session_dt),
                                  cte_period.ph_session)
              returns ( customer_id byteint, session_dt  date
                      , cnt         integer
                      , ph_session  period(timestamp(0))
                      )
              hash by customer_id, session_dt
              local order by customer_id, session_dt, ph_session) as sc (customer_id, session_dt, cnt, ph_session)
)
  ,  cte_period (session_id, customer_id, session_dt, start_time, end_time, ph_session) as
(
  select session_id
       , customer_id
       , session_dt
       , start_time
       , end_time
       , period(cast(session_dt as timestamp(0)) + (start_time - time '00:00:00' hour to second),
                cast(session_dt as timestamp(0)) + (end_time   - time '00:00:00' hour to second)) as ph_session
    from mvt_datas
)
  select cpr.session_id
       , cpr.customer_id
       , cpr.session_dt
       , cpr.start_time
       , cpr.end_time
       , extract(minute from ((end(cpr.ph_session) - begin(cpr.ph_session)) minute(4))) as duration
       , sum(csc.duration_prr) as pro_rated
    from cte_seq_cnt as csc
    join cte_period  as cpr  on cpr.customer_id       = csc.customer_id
                            and cpr.session_dt        = csc.session_dt
                            and cpr.ph_session contains csc.ph_session
group by 1, 2, 3, 4, 5, 6
order by 1 asc;
    

session_id  customer_id  session_dt  start_time  end_time  duration  pro_rated
----------  -----------  ----------  ----------  --------  --------  ---------
         1            1  2019-02-02    10:15:00  10:30:00        15      9.167
         2            1  2019-02-02    10:20:00  10:40:00        20     14.167
         3            1  2019-02-02    10:25:00  10:30:00         5      1.667
         4            1  2019-02-03    01:15:00  01:30:00        15     15.000
         5            2  2019-02-03    01:15:00  01:30:00        15     15.000
         6            2  2019-02-03    01:30:00  01:45:00        15     15.000

 

Enthusiast

Re: overlapping minutes between different rows

Hey thanks for this, one error that I keep getting is:

SQL Error [3807] [42S02]: [Teradata Database] [TeraJDBC 16.20.00.02] [Error 3807] [SQLState 42S02] Object 'cte_period' does not exist.

 

Is there any particular syntax documentation you can point me to, that helps contruct these combined Common Table Expressions? Thanks.

Enthusiast

Re: overlapping minutes between different rows

For some reason switching the order works, i.e. running cte_period first then cte_seq_cnt:

Any ideas why?

 

with cte_period (session_id, customer_id, session_dt, start_time, end_time, ph_session) as
(
  select session_id
       , customer_id
       , session_dt
       , start_time
       , end_time
       , period(cast(session_dt as timestamp(0)) + (start_time - time '00:00:00' hour to second),
                cast(session_dt as timestamp(0)) + (end_time   - time '00:00:00' hour to second)) as ph_session
    from mvt_datas
)
,
cte_seq_cnt (customer_id, session_dt, ph_session, duration_prr) as
(
select sc.customer_id
     , sc.session_dt
     , sc.ph_session
     , extract(minute from ((end(sc.ph_session) - begin(sc.ph_session)) minute(4))) / cast(sc.cnt as decimal(10,3))
  from table (td_sequenced_count(new variant_type(cte_period.customer_id, cte_period.session_dt),
                                  cte_period.ph_session)
              returns ( customer_id byteint, session_dt  date
                      , cnt         integer
                      , ph_session  period(timestamp(0))
                      )
              hash by customer_id, session_dt
              local order by customer_id, session_dt, ph_session) as sc (customer_id, session_dt, cnt, ph_session)
)  
  select cpr.session_id
       , cpr.customer_id
       , cpr.session_dt
       , cpr.start_time
       , cpr.end_time
       , extract(minute from ((end(cpr.ph_session) - begin(cpr.ph_session)) minute(4))) as duration
       , sum(csc.duration_prr) as pro_rated
    from cte_seq_cnt as csc
    join cte_period  as cpr  on cpr.customer_id       = csc.customer_id
                            and cpr.session_dt        = csc.session_dt
                            and cpr.ph_session contains csc.ph_session
group by 1, 2, 3, 4, 5, 6
order by 1 asc;
Teradata Employee

Re: overlapping minutes between different rows

Prior to 16.xx CTE-s are only evaluated from last to first - meaning the last one can't reference the first one, but the first one can reference the last one.

Post 16.xx CTE-s can be evaluated both ways, but maybe for the table operator where the cte isn't really directly specified in the from clause there are some specific behaviors.

 

I tested the provided code on a 15.10 version of the database.

 

Edit : tested on a 16.20 and indeed I had to reverse the order of the CTE-s.