I have two tables. I need to join them on a field which is not same but similar. e.g Table1.ColA = '12345', '67890',... and Table2.ColA = '12345/1', '12345/2', '12345/3', '67890/xxx', '67890/yyyy', ...and so on.
So what I need to do is to join those two tables on Table1.ColA like Table2.ColA and update another column in Table2. So after join two tables, in Table2 ColC should be updated from values from Table1 ColC according to ColA values. (and also which join type is better to use here?, inner join or left join?)
ColA ColB ColC
12345 10 12
67890 10 24
ColA ColB ColC
12345/1 10 0 --> 12
12345/2 10 0 --> 12
12345/123 10 0 --> 12
67890/12 10 0 --> 24
67890/5478 10 0 --> 24
The code that I tried but did not work: (I just tried to see the data before update it but I think my join condition, 'like' statement is not correct.) So can anyone assist me here to join these two tables and write correct syntax? Thanks!!
select * from Table1 T1
left join Table2 T2
on ','||T1.ColA||',' like '%,'||TRIM(T2.ColA)||',%'
and T1.ColB = T2.ColB
where T2.ColC = 0
While looking at your data, 12345 in T1 has three 12345/1,12345 /2,12345/123, So in either left or inner you will get three values against a single value in table1, which one to use? I assume this is a sample data. Also can you please provide DDL of this table.
The use of string functions in join condition will lead the join to Product join. We will have to optimize it in some other way.
From the sample data you provide and join , I can see that you need to extract the string to match, like substr. This is going to be expensive. Just as a starting point you can think this just as a sample script. You can modify as per your reqt.
In order to join,you can use POSITION function to match the partial Strings in the two column
select * from Table1 T1 left join Table2 T2
on (T1.ColB = T2.ColB AND POSITION(TRIM(T1.ColA) IN TRIM(T2.ColA))>0)
where T2.ColC = 0;
There are other ways to accomplish the same as well For example:use of substring functions
most of these solutions will "work", but they all have one fatal flaw, they force a full table scan of T1, the column that you are using a function to evaluate. depending upon the size of the table, you would probably be better off if you created another column and updated it to "...position(trim(t1.ColA)", collected stats on it and used the new column in the join.
using any kind of function in the join clause will dramatically increase the spool used and performance will suffer as well.
Agree with you, using a function, especially a string function will not only increase spool and full table scan, but also will cause the optimizer to use a product join strategy to resolve this join.
I would suggest to create a column with this function in a derived or volatile table and then use that table in join.
The optimizer will not be forced to use a Product Join because there's another join condition T1.ColB = T2.ColB, thus the LIKE or SUBSTRING might be evaluated during the join.
I would change the LIKE to a SUBSTRING/POSITION, which might enable a direct join on both colA and colB:
T1.ColA=substring(T2.ColA from 1 for position('/' in T2.ColA)-1)