Need assistance with creating a moving sum for past 120 days

Teradata Applications

Need assistance with creating a moving sum for past 120 days

Hello,

 

I have a table which has the following columns metric_id,sale_dt,rvnu_amt.

A metric can/cannot have data for the past 120 days.

Am trying to create a rolling metric rlp_rvnu_amt which will have the total sum of revenue for the past 120 days.

I tried using the sum(rvnu_amt)  over(order by sale_dt rows 120 preceding) but it just sums the last 120 rows and not the 120 days as data is not available for some days in the past 120 days.

 

Thanks

Nick

 

10 REPLIES 10
Teradata Employee

Re: Need assistance with creating a moving sum for past 120 days

One approach is to convert your data to "dense" representation with rows for each day (perhaps "dummy" rows) and then summarize. Teradata offers the EXPAND ON clause (documented in the DML manual), or you can use something like "calendar_table LEFT JOIN metric_table" to generate rows for the "missing" dates.

Re: Need assistance with creating a moving sum for past 120 days

Thank you Fred.I will check out the EXPAND ON clause.Will keep you posted.

 

Thanks

Nick

Ambassador

Re: Need assistance with creating a moving sum for past 120 days

How many many metrics & rows per metric_id exist?

Do you need a single Select or can you utilize a Volatile Table?

Re: Need assistance with creating a moving sum for past 120 days

Hi,

I have a similar situation where I have to create a moving sum for the past 6 months. My data looks like

1. A B 20-Jan-18 20
2. A B 20-Mar-18 45
3. A B 10-Apr-18 15
4. A B 21-May-18 30
5. A B 30-Jul-18 10
6. A B 15-Aug-18 25

And the expected result is

1. A B 20-Jan-18 20 20 Sum of row1
2. A B 20-Mar-18 45 65 Sum of row1+2
3. A B 10-Apr-18 15 80 Sum of row1+2+3
4. A B 21-May-18 30 110 Sum of row1+2+3+4
5. A B 30-Jul-18 10 100 Sum of row2+3+4+5 (as row1 is > 6 months in the past)
6. A B 15-Aug-18 25 125 Sum of row2+3+4+5+6

 

I tried to use the solution proposed in the same thread by inserting dummy records for dates where there is no record and then using ROWS BETWEEN 181 PRECEDING AND CURRENT ROW

But there may be situations where there are multiple records on the same day which means that choosing the last 181 rows will lead to the earliest record getting dropped. 

I have checked a lot of cases on this forum and others but can't find a solution for this moving average where the window size is not constant. Please help.  

Teradata Employee

Re: Need assistance with creating a moving sum for past 120 days

Hi amolvsh,

 

 

But there may be situations where there are multiple records on the same day which means that choosing the last 181 rows will lead to the earliest record getting dropped. 


Aggregate those data for you to have one record per day, then expand + 181 rows should work.

Or you can go for the joining solution, but care if using large tables :

create multiset volatile table mvt_data, no log
( c1    char(1)
, c2    date
, c3    integer
)
primary index (c1)
on commit preserve rows
;

insert into mvt_data values ('A', date '2018-01-20', 20);
insert into mvt_data values ('A', date '2018-03-20', 45);
insert into mvt_data values ('A', date '2018-04-10', 15);
insert into mvt_data values ('A', date '2018-05-21', 30);
insert into mvt_data values ('A', date '2018-07-30', 10);
insert into mvt_data values ('A', date '2018-08-15', 25);

  select t1.c1, t1.c2, t1.c3
       , sum(t2.c3)
    from mvt_data as t1
    join mvt_data as t2  on t2.c1  = t1.c1
                        and t2.c2 >= t1.c2 - interval '6' month
                        and t2.c2 <= t1.c2
group by t1.c1, t1.c2, t1.c3;

 c1 c2          c3 c3 
 -- ----------  -- --- 
 A  2018-01-20  20  20
 A  2018-03-20  45  65
 A  2018-04-10  15  80
 A  2018-05-21  30 110
 A  2018-07-30  10 100
 A  2018-08-15  25 125

Re: Need assistance with creating a moving sum for past 120 days

Hi Waldar,

Thanks for the reply. I think I wasn't clear enough in the first instance

Defct# Dim1 Dim2 Defectdate Downtime
1. 1001 A B 01-Feb-18 20
2. 1002 A B 20-Mar-18 45
3. 1003 A B 10-Apr-18 15
4. 1004 A B 21-May-18 30
5. 1005 A B 21-May-18 15
6. 1006 A B 30-Jul-18 10
7. 1007 A B 15-Aug-18 25

Expected
Defct# Dim1 Dim2 Defectdate Downtime Cumulative
1. 1001 A B 01-Feb-18 20 20 Sum of 1
2. 1002 A B 20-Mar-18 45 65 Sum of 1,2
3. 1003 A B 10-Apr-18 15 80 Sum of 1,2,3
4. 1004 A B 21-May-18 30 110 Sum of 1,2,3,4
5. 1005 A B 21-May-18 15 125 Sum of 1,2,3,4,5
6. 1006 A B 30-Jul-18 10 135 Sum of 1,2,3,4,5,6
7. 1007 A B 15-Aug-18 25 140 Sum of 2,3,4,5,6,7

 

As mentioned by you and other threads, I have filled in the gaps in the dates with dummy data. But then when I used ROWS BETWEEN 181 PRECEDING AND CURRENT ROW, I get 115 as cumulative for row6 as the data for row 1 doesn't get included. This is because of multiple records on 21st May. I need to retain the defect ID in the final table so can't aggregate at the date level. If I can get 115 for row6 as cumulative, my problem is solved. Of course, there could be 3-4 instances of the situation on 21st May which means that even genuine records on 2nd or 3rd Feb will get excluded from the cumulative on 30th Jul.

Regards,

Amol

Highlighted
Teradata Employee

Re: Need assistance with creating a moving sum for past 120 days

Hi Amol,

 

Here are the new datas :

 

create multiset volatile table mvt_data, no log
( Defect#       integer
, Dim1          char(1)
, Dim2          char(1)
, DefectDate    date format 'yyyy-mm-dd'
, DownTime      smallint
)
primary index (Dim1, Dim2)
on commit preserve rows;

insert into mvt_data values (1001, 'A', 'B', date '2018-02-01', 20);
insert into mvt_data values (1002, 'A', 'B', date '2018-03-20', 45);
insert into mvt_data values (1003, 'A', 'B', date '2018-04-10', 15);
insert into mvt_data values (1004, 'A', 'B', date '2018-05-21', 30);
insert into mvt_data values (1005, 'A', 'B', date '2018-05-21', 15);
insert into mvt_data values (1006, 'A', 'B', date '2018-07-30', 10);
insert into mvt_data values (1007, 'A', 'B', date '2018-08-15', 25);

Query version using join logic :

 

 

  select t1.Defect#, t1.Dim1, t1.Dim2, t1.DefectDate
       , sum(t2.DownTime)
    from mvt_data as t1
    join mvt_data as t2  on t2.Defect#    <= t1.Defect#
                        and t2.Dim1        = t1.Dim1
                        and t2.Dim2        = t1.Dim2
                        and t2.DefectDate >= t1.DefectDate - interval '6' month
                        and t2.DefectDate <= t1.DefectDate
where t1.Dim1 = 'A'
and t1.Dim2 = 'B' group by t1.Defect#, t1.Dim1, t1.Dim2, t1.DefectDate; Defect# Dim1 Dim2 DefectDate DownTime ------- ---- ---- ---------- -------- 1001 A B 2018-02-01 20 1002 A B 2018-03-20 65 1003 A B 2018-04-10 80 1004 A B 2018-05-21 110 1005 A B 2018-05-21 125 1006 A B 2018-07-30 135 1007 A B 2018-08-15 140

 

I agree the version with filling the gaps is harder to apprehend considering all your conditions.

 

Re: Need assistance with creating a moving sum for past 120 days

Thanks Waldar. I will try this also.

Regards,

Amol.

Teradata Employee

Re: Need assistance with creating a moving sum for past 120 days

Also I made a mistake here, don't use +/- interval 'x' month unless you're sure to be in the first day of the month, it can produce errors with end of months like this :

 

select date '2019-05-31' - interval '1' month;

Use add_months function instead (or 181 days but I don't like that much) :

 

select add_months(date '2019-05-31', -1);