I am trying to get the top 20% of a larger grouping of items;
I have 100 different individuals (IND_VAR), each with 1000 distinct items (ITEM_VAR) with amounts (ITEM_AMT), and I have summarized the amounts by individual (IND_SUM) and created a 20% of the IND_SUM variable (SUM_20). I would like to pull out only the specific items (ITEM_VAR) that make up the top 20% of the total by individual.
I have tried the top 20 PERCENT from ... order by.. and I either cannot get the syntax right, or it is not the best choice.
I don't get what you actually want, can you show some example data?
TOP PERCENT is usually replaced by PERCENT_RANK...
I need to see just the items that make up the top 20% for each customer.
For example, this customer has 280 items, which total $239,324.33, but there are only 6 items that make up 20% of the total ($47,864.87). I want those items.
IND_VAR ITEM_VAR ITEM_AMT IND_SUM SUM_20
12345 2515044 $9,048.00 $239,324.33 $47,864.87
12345 2508246 $8,847.35 $239,324.33 $47,864.87
12345 2496178 $8,780.27 $239,324.33 $47,864.87
12345 2502322 $8,105.68 $239,324.33 $47,864.87
12345 2495645 $7,002.18 $239,324.33 $47,864.87
12345 2491301 $6,228.45 $239,324.33 $47,864.87
Thank you for the direction on Percent_rank, that worked. One more question though, does Percent_rank have a natural decimal point? .02 worked where .20 brings in too much for the top 20%
,PERCENT_RANK( ) OVER (PARTITION BY IND_VAR ORDER BY ITEM_AMT DESC)
Qualify Per_Val <= 0.02
PERCENT_RANK will not work in your case, it returns a ranking based on the amount, but you need it based on the sum of the amounts, a Cumulative Sum:
OVER (PARTITION BY IND_VAR
ORDER BY ITEM_AMT DESC
rows unbounded preceding) <= SUM_20
When I double checked, you are right, it was only pulling in the top 20% of the number of items, not amounts. What you provided pulled in by the total amount. Thank you very much for your help. This was my first time posting but it was extremely helpful.