How to get historical distinct customer counts partition by department_id and order by order_date?

Analytics
Enthusiast

How to get historical distinct customer counts partition by department_id and order by order_date?

Hi,

I have a transaction table like below. I'd like to generate a new column which count the historical distinct customer IDs for each department and order date (including the customer_id on the current order date) . At the bottom, I manually created the results.  Could you advise how I can achieve that?

 

Thanks in advance!

 

Wayne

 

Original table:

Department_IDOrder_DateCustomer_ID
11/15/201112
17/15/201123
14/15/201212
19/15/201212
28/15/201234
210/15/201345

 

SQL code to generate the original table:

 

CREATE VOLATILE TABLE TableA
(
department_id integer NOT NULL,
order_date date NOT NULL,
customer_id integer NOT NULL
)
PRIMARY INDEX
(
department_id
)
ON COMMIT PRESERVE ROWS;

INSERT INTO TableA VALUES(1, '2011-01-15', 12);
INSERT INTO TableA VALUES(1, '2011-07-15', 23);
INSERT INTO TableA VALUES(1, '2012-04-15', 12);
INSERT INTO TableA VALUES(1, '2012-09-15', 12);
INSERT INTO TableA VALUES(2, '2012-08-15', 34);
INSERT INTO TableA VALUES(2, '2013-10-15', 45);

SELECT
*
FROM TableA
ORDER BY
department_id,
order_date
;

 

 

Here is the final table I'd like to obtain:

Department_IDOrder_DateCustomer_IDHistorical_Uniq_Customer
11/15/2011121
17/15/2011232
14/15/2012122
19/15/2012122
28/15/2012341
210/15/2013452

 

 

 

 

 

 

 

 

 

Tags (3)

Accepted Solutions
Junior Contributor

Re: How to get historical distinct customer counts partition by department_id and order by order_dat

Strange, I added an answer and did edit it and now it seem to be gone :)

 

In addition to Dave's answers, using two OLAP-steps:

 

 

-- emulating a COUNT(DISTINCT customer_id) in two steps: 
SELECT department_id, order_date, customer_id, first_order_flag,
Count(first_order_flag) Over (PARTITION BY department_id ORDER BY order_date ROWS Unbounded Preceding) FROM ( SELECT department_id, order_date, customer_id, -- determine the first order of a customer CASE WHEN Min(order_date) Over (PARTITION BY department_id, customer_id) = order_date THEN 1 END AS first_order_flag -- or -- CASE WHEN Row_Number() Over (PARTITION BY department_id, customer_id ORDER BY order_date)= 1 -- THEN 1 -- END AS first_order_flag FROM TableA ) AS dt

 

1 ACCEPTED SOLUTION
6 REPLIES
Apprentice

Re: How to get historical distinct customer counts partition by department_id and order by order_dat

Hi Wayne,

 

Do your sample data and expected result match up?

 

In your expected results you show (for instance):

17/15/2011232

But in your data I can only see one customer id value (#23) in department #1 on date 7/15/2011.

Where does the value of '2' come from?

 

What you want to do can almost certainly be done but I'm not sure of your logic.

 

Cheers,

Dave

Ward Analytics Ltd - information in motion
www: http://www.ward-analytics.com
Enthusiast

Re: How to get historical distinct customer counts partition by department_id and order by order_dat

Hi Dave,

 

On 7/15/2011, for department #1, there is an order from customer #23 and historically before 7/15/2011 (on 1/15/2011) there was another order from customer #12. Since customer #23 is different from customer #12, so by 7/15/2011, for department #1, there are two distinct customers ordered. This is how the "2" come from on the row where the departement ID is "1" and order date is "7/15/2011". 

 

Now looking at the third row, for department #1, on 4/15/2012 (notice here that the order date is acsending) , there is another order from customer #12. Since customer #12 already ordered before (on 1/15/2011),  this order does not count from a new customer by 4/15/2012. So on the "Historical_Uniq_Customer" filed, the value is still "2". In another word, by 4/15/2012, for department #1, there are still two distinct customers ordered.

 

Hope this clarifes the problem. Look forward to any solutions!

 

Thanks!

 

Wayne

Junior Contributor

Re: How to get historical distinct customer counts partition by department_id and order by order_dat

As OLAP functions don't allow DISTINCT you need a two-step approach lke this:

 

SELECT
   department_id,
   order_date,
   customer_id,
-- emulating Count(DISTINCT customer_id) Count(first_order_flag) Over (PARTITION BY department_id ORDER BY order_date ROWS Unbounded Preceding) FROM ( SELECT department_id, order_date, customer_id,
-- determine the first order of a customer CASE WHEN Min(order_date) Over (PARTITION BY department_id, customer_id) = order_date
THEN 1
END AS first_order_flag -- or -- CASE WHEN Row_Number() Over (PARTITION BY department_id, customer_id ORDER BY order_date)= 1
-- THEN 1
-- END AS first_order_flag FROM TableA ) AS dt

 

 

Apprentice

Re: How to get historical distinct customer counts partition by department_id and order by order_dat

Hi Wayne,

 

Thanks for that clarification - now (I think!) I understand.

 

Here are two alternatives which both seem to work for the given sample data. They are really just different ways of coding it, take your pick. There are possibly other ways as well.

 

1) Use a derived table

SELECT a.*
   ,SUM(CASE
        WHEN a.order_dt = dt1.first_order THEN 1
	ELSE 0
	END) OVER (PARTITION BY a.department_id ORDER BY a.order_dt, a.customer_id
		   ROWS UNBOUNDED PRECEDING) AS Historical_Uniq_Customer_calc
FROM vt1 AS a
INNER JOIN (SELECT department_id,customer_id,MIN(order_dt) AS first_order
            FROM vt1
	    GROUP BY 1,2) AS dt1
  ON a.department_id = dt1.department_id
    AND a.customer_id = dt1.customer_id;

2) Use a correlated, scalar sub-query

SELECT a.*
   ,SUM(CASE
        WHEN a.order_dt = (SELECT MIN(order_dt) AS first_order FROM vt1 WHERE department_id = a.department_id AND customer_id = a.customer_id) THEN 1
	ELSE 0
	END) OVER (PARTITION BY a.department_id ORDER BY a.order_dt, a.customer_id
		  ROWS UNBOUNDED PRECEDING) AS Historical_Uniq_Customer_calc
FROM vt1 AS a;

HTH

Dave

 

Ward Analytics Ltd - information in motion
www: http://www.ward-analytics.com
Junior Contributor

Re: How to get historical distinct customer counts partition by department_id and order by order_dat

Strange, I added an answer and did edit it and now it seem to be gone :)

 

In addition to Dave's answers, using two OLAP-steps:

 

 

-- emulating a COUNT(DISTINCT customer_id) in two steps: 
SELECT department_id, order_date, customer_id, first_order_flag,
Count(first_order_flag) Over (PARTITION BY department_id ORDER BY order_date ROWS Unbounded Preceding) FROM ( SELECT department_id, order_date, customer_id, -- determine the first order of a customer CASE WHEN Min(order_date) Over (PARTITION BY department_id, customer_id) = order_date THEN 1 END AS first_order_flag -- or -- CASE WHEN Row_Number() Over (PARTITION BY department_id, customer_id ORDER BY order_date)= 1 -- THEN 1 -- END AS first_order_flag FROM TableA ) AS dt

 

Highlighted
Enthusiast

Re: How to get historical distinct customer counts partition by department_id and order by order_dat

Thanks Dave and dnoeth! 

I got it, the key is to identify whether the current order is the first order of a customer!