I am trying to solve a business case which requires us to find out what is the maximum continuous number of months a specific account fee has been paid on an account.
The following tables describe the problem, first three column is present in the data and I want to create the fourth column :
Txn_Month | Acc_No | Fee_Flag | Desired_Flag
Jan-17 123 1 1
Feb-17 123 1 2
Mar-17 123 0 0
Apr-17 123 0 0
May-17 123 0 0
Jun-17 123 0 0
Jul-17 123 1 1
Aug-17 123 1 2
Sep-17 123 1 3
Oct-17 123 1 4
Nov-17 123 0 0
Dec-17 123 1 1
Jan-18 123 0 0
Feb-18 123 0 0
As my intuition/work experience says that it could be solved through window functions, where I partition by Acc_No and Order by Txn_Month, and then try to calculate the flags using LEAD function [min(flag) over (partition by acc_no order by Txn_Month rows following 1 and following 1)] to check whether the next row value is 1 or 0.
I am getting stuck afterwards that how could I get the cumulative sum of flags and how get the individual sums of these islands?
Does this fall under gaps and islands problem?
Hope the query is clear. Please let me know if I could expand on any of the bits.
Solved! Go to Solution.
Thanks ToddAWalter. This solves my problem using this code :
SyntaxEditor Code Snippet
SELECT EXTRACT_DATE,BRANCH_NO,ACCOUNT_NO,LEDGER_BALANCE,FLAG, SUM(FLAG) OVER(PARTITION BY BRANCH_NO,ACCOUNT_NO ORDER BY EXTRACT_DATE RESET WHEN FLAG=0 ROWS UNBOUNDED PRECEDING) "Range" FROM BSP_RCA.NK_Test A ORDER BY EXTRACT_DATE
RESET WHEN is a nice extension, as long as if it's used only once and there's no other OLAP-function (the optimizer is not very smart combining them into common steps).
Your original approach was quite close.
It needs two nested OLAP-steps, #1 to define those dynamic partitions and #2 to do the actual calculation:
SELECT EXTRACT_DATE,BRANCH_NO,ACCOUNT_NO,LEDGER_BALANCE,FLAG, SUM(FLAG) OVER(PARTITION BY BRANCH_NO,ACCOUNT_NO, newGroup ORDER BY EXTRACT_DATE ROWS UNBOUNDED PRECEDING) "Range" FROM ( SELECT EXTRACT_DATE,BRANCH_NO,ACCOUNT_NO,LEDGER_BALANCE,FLAG, SUM(case when flag = 0 then 1 else 0 end) -- assign the same number to all rows from a group
OVER(PARTITION BY BRANCH_NO,ACCOUNT_NO ORDER BY EXTRACT_DATE ROWS UNBOUNDED PRECEDING) newGroup FROM BSP_RCA.NK_Test
) A ORDER BY EXTRACT_DATE
This is what RESET WHEN is doing in the background and will result in exactly the same plan.
Now if you got additional OLAP calculations you're usually able to add them to one of those Selects.