Database

turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Highlighted
##
##### Creating Group ID from SUM Limit

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

09-04-2013
01:40 PM

09-04-2013
01:40 PM

I have a list of products and a count corresponding to the quantity sold in a single table. The data is laid out as such:

Product Name QTY_SOLD

Mouse 23

Keyboard 25

Monitor 56

TV 10

Laptop 45

...

I want to create a group ID where groups are broken up if the sum of the quantity sold is greater than 50. We can order by Product Name to get an output similar to the following.

Product Name QTY_SOLD GROUP_NBR

Keyboard 25 1

Laptop 45 1

Monitor 56 2

Mouse 23 3

TV 10 3

I created a case statement to create the output I need but if I want to change the group id cutoff from 50 to say 100 or if i get more products and quantities I have to keep changing the case statement. Is there an easy way to use either recursion or some other method to accomodate this?

UPDATE main

FROM prod_list AS main,

(

SEL PROD_NAME

, QTY_SOLD

, SUM(QTY_SOLD) OVER (ORDER BY PROD_NAME ROWS UNBOUNDED PRECEDING) RUNNING

FROM prod_list

) inr

SET GROUP_NBR =

CASE

WHEN RUNNING < 50 THEN 1

WHEN RUNNING > 50 AND RUNNING < 100 THEN 2

WHEN RUNNING > 100 AND RUNNING < 150 THEN 3

WHEN RUNNING > 150 AND RUNNING < 200 THEN 4

WHEN RUNNING > 200 AND RUNNING < 250 THEN 5

ELSE 6

END

WHERE main.PROD_NAME = inr.PROD_NAME

;

7 REPLIES

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

09-04-2013
01:46 PM

09-04-2013
01:46 PM

I meant to say, I want to create a group ID where groups are created if the ROLLING sum of the quantity sold is greater than 50.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

09-05-2013
08:40 AM

09-05-2013
08:40 AM

I was thinking I may be able to get away with simply distributing the records evenly based on an average or other distribution key. If I had data such as below I would want to distribute them into buckets somewhat evenly.

PROD A - 50

PROD B - 55

PROD C - 1000

PROD D - 25

PROD E - 32

PROD F - 800

One possible result

GRP 1 - (PROD A & D)

GRP 2 - (PROD B & E)

GRP 3 - (PROD C)

GRP 4 - (PROD F)

Is that possible?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

09-05-2013
03:06 PM

09-05-2013
03:06 PM

Change your running sum to order on QtySold.

Join that result to SELECT SUM(QtySold) AS GrandTotal FROM Prod_List.

Replace your CASE expression with (running/GrandTotal) DIV <#of buckets you want>.

This distributes the products into a best approximation equal height histogram by their percentage contribution to the total quantity. If you have very large outliers then you might have missing buckets because the big one will cover multiple percentage buckets.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

09-05-2013
09:29 PM

09-05-2013
09:29 PM

I tried to construct the query you recommended but when i ran it with over a hundred products it created almost as many buckets and didnt stick to the number of buckets i had wanted. I created another query using MOD and it gave me almost what i was looking for but didnt distribute as evenly as I had hoped for.

UPDATE main FROM prod_list AS main,

(

SEL

PROD_NAME

, QTY_SOLD

, SUM(QTY_SOLD) OVER (ORDER BY NUMOF ROWS UNBOUNDED PRECEDING) RUNNING

FROM prod_list

) inr

SET GROUP_NBR = RUNNING MOD <# of buckets>

WHERE main.PROD_NAME = inr.PROD_NAME ;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

09-06-2013
05:26 AM

09-06-2013
05:26 AM

This is missing the join to and division by the grand total. That is necessary in order to make the running sum a percentage contribution. And I left off the multiplication of that result by 100 before the division by the number of buckets.

In your sample you order by NUMOF? rather than QTY_SOLD.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

09-09-2013
02:02 PM

09-09-2013
02:02 PM

Is this the query you mean? I did the whole thing over so that I wouldnt have to do an insert and then an update. This still gives me some weird results what am I missing?

WITH PROD_CTE (PROD_NAME,QTY_SOLD) AS

(

SELECT

PROD_NAME

, COUNT(DISTINCT PROD_ID) AS QTY_SOLD

FROM all_prods

GROUP BY 1

)

SELECT

PROD_NAME

, QTY_SOLD

, CUMLTV_SUM

, TTL

, (((RUNNING_SUM / TTL ) * 100 ) / 10) AS BUCKET

FROM (

SELECT

PROD_NAME

, QTY_SOLD

, SUM(QTY_SOLD) OVER (ORDER BY QTY_SOLD ROWS UNBOUNDED PRECEDING) AS RUNNING_SUM

, TTL

FROM PROD_CTE

JOIN (

SELECT SUM(QTY_SOLD) TTL

FROM PROD_CTE

)sm ON 1=1

)v

;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

09-24-2013
11:14 AM

09-24-2013
11:14 AM

I ended up using a suggestion from the guys over at stack overflow.

First create a table that has the group id and the upper and lower limits of each bucket.

-- create the first entry for the recursive query

INSERT TMP_WORK_DB.GRP_NBRS VALUES (0,1,0,2000000);

INSERT TMP_WORK_DB.GRP_NBRS (GRP_NBR,LOWER_LIMIT, UPPER_LIMIT)

WITH RECURSIVE GRP_RECRSV (GRP_NBR, LOWER_LIMIT, UPPER_LIMIT)

AS (

SELECT

1 AS GRP_NBR

, LOWER_LIMIT

, UPPER_LIMIT

FROM TMP_WORK_DB.GRP_NBRS

UNION ALL

SELECT

GRP_NBR + 1

, LOWER_LIMIT + 2000000 -- set the interval to 2 million

, UPPER_LIMIT + 2000000 -- can be adjusted as needed

FROM GRP_RECRSV

WHERE GRP_NBR < 120 -- needed a limit so that it would not be endless

)

SELECT * FROM GRP_RECRSV

;

Then use a simply WHERE BETWEEN on the data to determine which GROUP ID to assign.

-- delete the first entry because it was duplicated

DELETE FROM TMP_WORK_DB.GRP_NBRS WHERE GRP_NBR = 0;

-- set grp nbr using the limits table

INSERT TMP_WORK_DB.PROD_LIST_GRP

WITH NUMOFPRODS (PROD_NAME,QTY,RUNNING) AS

(

SELECT

PROD_NAME

, COUNT(DISTINCT PROD_ID) AS QTY

, SUM(QTY) OVER (ORDER BY QTY ROWS UNBOUNDED PRECEDING) RUNNING

FROM TMP_WORK_DB.PROD_LIST

GROUP BY 1

)

SELECT

PROD_NAME

, QTY

, RUNNING

, GRP_NBR

FROM NUMOFPRODS a

JOIN TMP_WORK_DB.GRP_NBRS b ON RUNNING BETWEEN LOWER_LIMIT AND UPPER_LIMIT

;