I have a requirement where I have to generate a surrogate ID for every row in the FACT table. Previously we used to use ROW_NUMBER() for the same. But the data volume has gone up in the recent requirements upto 100 mill +. The Queries are taking very long time but compartively very less without the ROW_NUMBER. What would be the best way to do so ? We dont want to use Identity columns ! Please help.
A similar thread statred a few days ago:
Which part of the process is slow?
Checking for the MAX value or the calculating using ROW_NUMBER? Did you check DBQL?
Can you share how you create the SK?
Estimated Row count = 200 million
I guess its the calculation of the ROW_NUMBER() part because I tried without the max part but of no help.
The query is something like this: No group by, No order by just a simple move with the ROW_NUMBER calc
ROW_NUMBER() over (ORDER BY AccounT_ID),
I had a look into the DBQL, but could you please help me with what exactly to look for ?
Try the solution given in the last post of following thread:
Also, how is the fact table populated? Like is it populated from single source? Or does it have multiple sources populating data into it?
Its loaded from a single source(TD source table).
I tried with "SUM(1) OVER(ROWS UNBOUNDED PRECEDING).." method as mentioned above, but did not find much difference.
OLAP functions redistribute the rows from spool 1 to spool 2, so all rows are kept in spool twice.
If you have large rows you might try to calculate the sequence using the PK columns only and then join back. This greatly reduces spool sizes and the join back might be quite fast if it's joining on the PI.