I have been playing a bit with the running sum over partition and I found out some behaviour I sincerely was not expecting.
Let's make it with a simple example:
sum(coalesce(sales,0)) over (partition by Account_manager order by month ROWS UNBOUNDED PRECEDING) as cummulative
There are 4 account managers that make some sales on month 1.
If the AM1 makes no sales on Month 2 then we have on that month
The cummulative sums happen to be
AM1 250 250
AM2 250 500
AM3 250 500
AM4 250 500
total 1000 1750
What happens is that the partition defined has a combination occuring on month 1 but not on month 2. There is no record for that AM1 and therefore is not included in the sum.
I would like to know if there is any way to keep all the combinations as therefore the overall sums are not correct due to the lack of inclusion of the AM1. Something like scanning all the partitions and have the combinations stable across all the partitions.
I know I could use a partition on year level to get all the combinations and then make it as an LOV doing an outer join to the fact table but I don't think is the optimal soultion nor the cleanest one.
I am sure I am not the first person who realises this behaviour on the 'over partition' when the partition combinations are not available in all the partitions.
It's hard to follow your explanation, but you need to include a missing month in your result set?
Are you on TD13.10?
There's the new EXPAND ON clause to accomplish that.
The current issue is that there is no record with the combination of AM1 and Month2. Therefore, when executing the sum over partition, the Month2 cummulative sum is just summing AM2, AM3 and AM4.
Take into account that for the Month2 the values should be the sum for Month1 and Month2. What I would like is to have the value for the AM1 for the previous month to be incorporated as eventhough he hasn't done any sale on Month2, he did some for Month1 and therefore to be accounted on the overall running sum on Month2.
I don't think the 'EXPAND ON' is what would solve my current issue. I heard something like 'filling the gaps' which potentially would be an option but I haven't read anything about that.
Indeed we are on TD 13.10.
Thanks for your support.
I read a bit about 'EXPAND ON' and I think it does not solve my current issue because:
Where can I find more information about the proMISS application?
Could not get you exactly
SUM(sales) OVER (PARTITION BY AM ORDER BY DT1 ROWS UNBOUNDED PRECEDING) AS CUMMULATIVE
But Above Query gives desired result.
Thanks for your comment.
The table has the following values
sales_rep month value
AM1 1 250
AM2 1 250
AM3 1 250
AM4 1 250
AM2 2 250
AM3 2 250
AM4 2 250
So when doing a running sum, there will be no value for the AM1 for the month 2. Therefore the overall running sum for month 2 will NOT include the value for AM1 for the month 1.
Therefore, the running sum is not 'filling the gap' of unexisting combinations.
Hope this explanation helped a bit your understanding of my current problem.
What's the datatype of "month"?
If you can cast it to a date (preferably 1st of month), you can create a period on-the-fly using
over (partition by account_manager
order by month
rows between 1 following and 1 following)
) as p
This should not need an extra STATS step in explain, because it's the same PARTITION/ORDER as the cummulative sum.
And then you use an
EXPAND ON p AS p2 BY ANCHOR MONTH_BEGIN
FOR PERIOD (DATE '2011-01-01', DATE '2011-05-01').
This will create any missing rows after the first month with data (showing the previous cummulative sum), but none before, e.g. if you got data for february only this will result in:
feb/mar/april, but no jan row.
Thanks indeed for your answer.
It's quite interesting. I was thinking of something similar which would be doing a partition on year level then to know which AM are involved for that year and then expand all the months with zero values. That approach, though, would mean to run twice the scan of the table: one for the year partition with months expanded with zero values and the second one to read the month level records to, later on, sum both of them (zero + value=value).
I would need to check if the situation of not being able to report values for the same year before the 'first' record month is a problem or not from the report perspective.
Again, thanks a lot for sharing your expertise on this problem.
Which would be the approach for another database which is not on 13.10 but 13.01 as the 'EXPAND ON' functionality is not there?
Thanks again and regards,