Does Teradata 12 support Conditional Merge into statement?
Example: I have tables t1(col1 int, col 2 char(20)) and t2(col1 int, col 2 char(20)). These 2 tables are joined on Col1 and I want to upsert/merge t2 from t1 - that is t1 is the source and t2 is the target.
But instead of blindly updating the row in the WHEN MATCHED clause I'd like to compare if the value in col2 is different and update t2 ONLY IF the value in COL2 differs. As you can imagine if T2 is very large you would not want to update every row that is matched even if nothing has changed - if only a small percentage of rows have actually changed it makes sense to update just those rows. This is a typical incremental update scenario and I am surprised Teradata does not support it.
The below is valid syntax in Oracle and I get an error in Teradata. I read through the SQL documentation and this sort of conditional update is not mentioned anywhere. It also appears that Teradata does not support DELETE in the Merge statement.
merge into T2 using T1 on T1.col1 = T2.col1 when matched then update set col2 = T1.col2 where col2 <> T1.col2 when not matched then insert(col1, col2) values(T2.col1, T2.col2);
Instead of trying to rule out the unneeded updates in the Update part of the statement, you could use a derived table based on T1 witout the rows you want to ignore. So:
merge into T2 using (Select T1.Col1, T1.Col2 From T1 Left Join T2 On T1.Col1 = T2.Col1 Where T1.Col2 <> T2.Col2 Or T2.Col1 Is Null) D1 on D1.col1 = T2.col1 when matched then update set col2 = D1.col2 when not matched then insert(col1, col2) values(D1.col1, D1.col2);
If there are very few rows which you do not want to update, it is probably more efficient to update anyway. You may need to look at data demographics.
Can you please clarify your statement? Let's assume my source has a million records and the target has a billion rows. If 500K of those are inserts and 500k are existing in the target and assuming only 10,000 of those rows that exist in the target need to be updated (because only those rows have something changed on them that I care about) I don't necessarily want to update all 500K rows. It would be better to update the 10,000 that really changed from my perspective. Are you saying that Teradata automatically does that? How does it know what a change is? Does it compare the entire row and update only if something on the row has actually changed?
I would think there is some performance benefit because Teradata does not have to keep track (before and after row images) of the unnecessary updates (the 490,000 of them).
Teradata does not need to compare all the columns, only the "candidate" columns, i.e., the columns that are the targets of SET expressions in the update clause. The same technique is used for simple updates as well as merges.
I'll grant that there could be some savings of CPU path if the Teradata merge statement syntax allowed a supplementary condition on the update clause, because then the building of the candidate row could be skipped, but the disk I/O cost would be the same.