Best approach to compute row-level checksums and aggregate them
We are offloading data from Teradata to SQL Server and need to reconcile the two, i.e. verify that the data in the source and target systems is in sync. To do this, we plan to generate a checksum for every row and aggregate those checksums on the source system (Teradata), do the same on the target system (SQL Server), and check that the two aggregate values match. We want to aggregate the row checksums so that we don't have to sort the records and compare them row by row, which would be a significant performance overhead.
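To make the idea concrete, here is a minimal sketch of what we have in mind, written in Python purely for illustration (the real computation would run inside Teradata and SQL Server). Everything in it is an assumption rather than a settled design: the names `row_checksum` and `table_checksum` are hypothetical, MD5 is just one of the candidates, the field separator and NULL marker are arbitrary, and it assumes both systems render every value as exactly the same string. It combines the row checksums by modular addition, which does not depend on row order.

```python
import hashlib

MOD = 2 ** 128  # MD5 digests are 128 bits wide


def row_checksum(row):
    """Hash one row after serializing it to a canonical byte string.

    Assumption: both systems can produce the same string form for every
    column value. NULL gets an explicit marker and fields are joined with
    a separator that is assumed not to occur in the data.
    """
    parts = ["\\N" if value is None else str(value) for value in row]
    data = "\x1f".join(parts).encode("utf-8")
    return int.from_bytes(hashlib.md5(data).digest(), "big")


def table_checksum(rows):
    """Return (row_count, sum of row checksums modulo 2**128).

    Addition is commutative, so the result is independent of row order
    and no sort is needed. Unlike XOR, a sum does not cancel out when
    the same row appears an even number of times.
    """
    total = 0
    count = 0
    for row in rows:
        total = (total + row_checksum(row)) % MOD
        count += 1
    return count, total


if __name__ == "__main__":
    source = [(1, "alice", None), (2, "bob", 3.5), (2, "bob", 3.5)]
    target = list(reversed(source))  # same rows, different order
    print(table_checksum(source) == table_checksum(target))
```

Comparing the row count alongside the aggregate is a cheap extra check; whether this kind of sum is a sound way to combine the checksums at this scale is exactly what we are unsure about.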
I would like to know which hash algorithm (MD5, SHA-1) would be best for the row-level checksum, given that the number of records will be huge (billions). Also, what is the best function or method to combine the checksums of the individual rows?
Please let me know if you need any other information.