I have a table with store #, item # and the sales with billions of records. I need to rank by the sales to find out the top 100 selling items. I can use the RANK function and find it out. However, the issue is, user wants top 100 items without skipping a rank.
For example, if item A and B both sale $100 then teradata would give same rank to them, say rank 5. Say item C does sale of $99.99, it would be rank 7. There won't be rank 6 as there are 2 entries with rank 5. They don't want that. They want item C to be rank 6.
How do I achieve that ? Is there any function in teradata that does it ?
I don't think that will work because users want top 100 rank values. But for them ranking means not like teradata ranking. They don't want to skip a rank. With row_number, I will get unique number count, but how to identify what are top 100.
For example, as per RANK and ROW_NUMBER we will get output like below -
But they want like -
So not sure, how do I go till item = zzz (and not stop at xxx) to get the required results.
If I understand correctly, you just want top 100 values according to the sales so you can put a limit using QUALIFY clause...
SELECT STORE, ITEM, SALES, ROW_NUMBER() OVER(PARTITION BY SALES ORDER BY SALES DESC) RN
QUALIFY RN <= 100
You could try something like:
FROM YOUR_TABLE a,
( SELECT c.Sales ,
row_number() OVER (ORDER BY c.Sales DESC) The_Rank
FROM ( SELECT DISTINCT Sales FROM YOUR_TABLE ) c
QUALIFY The_Rank <= 100
WHERE a.Sales = b.Sales
ORDER BY b.The_Rank;
First, thank you both for your help. The solution provided by Qaisar would fell short a bit, but the solution provided by Carlos would work perfectly. Thanks again guys.
In TD13.10 there's an example "windowed aggregate UDF" in C and it's a DENSE_RANK, which is what you're looking for.
You might also read http://developer.teradata.com/node/8048 for different solutions.
Of course it's a huge overhead to run OLAP functions on billions of rows, you should add a quite selective WHERE condition to filter low values.