reset avg calculation based on column value

Database
Enthusiast

reset avg calculation based on column value

I have a table like this:

 

CUST_ID                  purchase_dt    prev_purch_dt         gap      avg_gap  reset_flag
1    543,278,201           2/25/2016                                      0               0                0
2    543,278,201           3/24/2016    2/25/2016                27          13.5              0
3    543,278,201           4/28/2016    3/24/2016                34           20.3             0

4    543,278,201           7/27/2016    4/28/2016                87           37                1

5    543,278,201           9/15/2016    7/27/2016                0                0                0

6    543,278,201           9/21/2016    9/15/2016                6                3                0

 

The column 'gap' is the time in days between the customer's purchases. Avg_gap is the cumulative or running average across all rentals made  by the customer until they encounter the reset flag = 1. Once the reset flag = 1, I want to set the 'gap' and 'avg_gap' to 0 so basically restart the counter.  Is there a way to do that?

 

Thanks

 
6 REPLIES
Junior Contributor

Re: reset avg calculation based on column value

Simply add RESET WHEN reset_flag = 1 to the current ORDER BY in the cumulative average calculation.

Enthusiast

Re: reset avg calculation based on column value

So this is the SQL I have,

 

select

cust_id,

AVG(gap) OVER(PARTITION BY r.cust_id ORDER BY r.purchase_dt ASC ROWS UNBOUNDED PRECEDING ) AS avg_gap,

case when gap > avg_gap then reset_flag = 1 else 0

from table a.

 

So two questions -

1. where would the RESET WHEN clause go?

2. is it an issue that the reset_flag is getting defined AFTER it has been referenced in Avg window function?

Thanks

 

 

 

Highlighted
Junior Contributor

Re: reset avg calculation based on column value

This is not going to work with RESET because there's a recursive definition, reset_flag is based on avg_gap and vice versa.

 

OLAP functions are probably not going to work, you'lly need a recursive query to get the expected result.

Enthusiast

Re: reset avg calculation based on column value

So in this query, how do I incorporate recursive-ness.

 

Junior Contributor

Re: reset avg calculation based on column value

You must code a recursive Common Table Expression (CTE) using WITH RECURSIVE which iterates over the rows per customer and applies your logic. The first problem is finding the next row, it must be easily calculated based on the current row, if both purchase_dt and prev_purch_dt are columns in a table it's simple, but I assume it's calculated. In that case you better materialize the data first and add a ROW_NUMBER.

 

Which columns in your example are from a base table and which are calculated (and how)?

Enthusiast

Re: reset avg calculation based on column value

My table has multiple customers.

Taking a snapshot for a single customer, this is what it looks like:

 

      CUST_ID                  purchase_dt    prev_purch_dt         gap      avg_gap      Std_dev      reset_flag
1    543,278,201           10/22/2009                                      0             0                0                    0
2    543,278,201           02/11/2010    10/22/2009               112         56               79                   0
3    543,278,201           03/11/2010    02/11/2010               28            46              58                    1

4    543,278,201           04/08/2010   03/11/2010               28             42              48                    0

5    543,278,201           05/15/2010   04/08/2010               7              35               44                    0

6    543,278,201           9/21/2016    9/15/2016                 77             42                43                   0

 

Only cust_id and purchase_dt are from a base table. Rest all are calculated. Here's the definition

I have a row_number in the beginning that lists their purchases.

Prev_purchase date - lists the previous purchase date.

gap is the time between two purchases.

reset flag = 1 if gap > avg_gap + std_dev.

When reset flag = 1 the gap column in next row should be reset and the calculations should be restarted for avg_gap and std_dev until they reach a reset_flag = 1 again.

Does that make sense?