I am working with a VERY large table (appx 15 Billion rows) and I want to pull back only those rows when conditions based on the previous row are met.
For example, if the table contained the following rows:
Customer Event_Time Event_Type 1 1 A 1 6 B 1 7 A 1 9 C 2 2 A 1 4 B 1 8 D 2 3 B
I would only want to return these two rows:
Customer Event_Time Event_Type 1 4 B 1 6 B
-- in essence, I want to return rows that after being sorted by Customer and Event_Time have the same value in Event_Type. Also, there can be more than two rows consecutive rows with the same Event_Type; in those cases, I need ALL the rows (e.g. if there were another row with Customer=1 and Event_Time=5 and Event_Type=B I would want that row as well).
My current approach is to create a table which contains a portion of the data (roughly 1B rows) in sort order and then use SAS to determine which consecutive rows to keep. I know it would be much faster if I could do the row selection in TD rather than bringing it down and processing in SAS.
RANK is the most straightforward of the OLAP extensions. It not only sorts a result set but also identifies the numeric rank of each row in the result. RANK has only the sort column(s) as its argument and returns an integer that represents the rank of each row in the result. For example, here’s how the RANK function lets you sort employees alphabetically and identify their level of seniority in the company:
SELECT EmployeeName, (HireDate - DATE) AS ServiceDays, RANK( ServiceDays ) as Seniority FROM Employee ORDER BY EmployeeName;
EmployeeName Service Days Seniority Robyn Baker 9931 2 Nick Garrison 9931 2 Kyle McVicker 9408 5 Eva O’Malley 10248 1 Norma Powers 9409 4
And the following example operates on the result of a derived table and join to sort items by category and descending overall rank of revenue:
SELECT Category, Item, Revenue, RANK( Revenue ) AS ItemRank FROM ItemCategory, (SELECT Item, Sum( Sales ) as Revenue FROM DailySales GROUP BY Item) AS ItemSales WHERE ItemCategory.Item = ItemSales.Item ORDER BY Category, ItemRank DESC;
Category Item Revenue ItemRank Hot Cereal Regular Oatmeal 39112.00 4 Hot Cereal Instant Oatmeal 44918.00 3 Hot Cereal Regular COW 59813.00 2 Hot Cereal Instant COW 75411.00 1