Hello All,
I am trying to achieve the following scenario in Teradata.
Kindly advise.
I have two tables
first table (it will have millons of rows in it):
======================
col1 col2 col3
---- ---- ----
1 A ABC
1 A GHI
1 A DEF
second table (it is reference table):
======================
col1 col2 winner
----- ---- ------
ABC DEF DEF
GHI ABC ABC
DEF XYZ XYZ
By using these two tables
i want to get output like below
col1 col2 col3
---- ---- ----
1 A DEF
if we see 1 st table we have 3 rows (No.of rows may vary). for same col1,col2 values we have different col3 value. out of 3 rows, i've to get 1 row
based on col3 values.
In second table we have different combinations of col3 values of first table and WINNER of that combination. if we don't find any combination then
select winner randomly.
in ouput based on winners in second table, i have to get 1 row from 1st table for col1,col2 values.
for the above example, i want to get output as below
col1 col2 col3
---- ---- ----
1 A DEF
Could anyone please help me what could be the best approach to get the desired output?
Thanks in Advance,
Kumar.
Hi Kumar,
What is your business rule to determine what gets classed as the 'winner'?
In your first example you say the output is "1 A DEF". Why is is 'DEF'? Why not 'ABC' or 'XYZ' - which are also values in column 'winner' in your second table.
Cheers,
Dave
Hi Dave,
Thanks for your Reply.
Business people will define which one is the winner based on their requirements. At the end they will provide the combinations and winners for us. it will be act as reference table for me. Based on reference table i've to get the output.
Hopefully, this explains.
Thanks,
Kumar.
Also,
to answer your below question,
"In your first example you say the output is "1 A DEF". Why is is 'DEF'? Why not 'ABC' or 'XYZ' - which are also values in column 'winner' in your second table."
Why the output say 1 A DEF is,
If you see my example, in first table i have 3 values for col3 which are ABC , GHI and DEF.
if we see the reference table,
for combination ABC and GHI winner is ' ABC'.
Then i've to compare the previous winner 'ABC' with the next value 'DEF'. so for combination ABC and DEF the winner is 'DEF'.
so output i will get record in 1st table with DEF as col3 value.
i.e. 1 A DEF.
Hopefully, this explains.
Thanks,
Kumar.
Hi Kumar,
I understand the use of a reference table, that makes perfect sense to me. It looks like a good design.
I'm still confused by the rest of your logic though.
Why do you compare "combination ABC and GHI winner is ' ABC'." first? Is this because they are the first two rows (for col1=1 and col2=A) in your 'first table'?
- how are they determined to be the first two rows? Is/Are there another column(s) that are not being shown?
- they're not the first two based on data value of col3 (because DEF comes before GHI).
Cheers,
Dave
Thanks for your reply Dave.
To answer,
"Why do you compare "combination ABC and GHI winner is ' ABC'." first? Is this because they are the first two rows (for col1=1 and col2=A) in your 'first table'?' "
-- I would like to compare row by row. like 1st row with 2nd row. then take output and compare the with 3rd row till N rows.
"how are they determined to be the first two rows? Is/Are there another column(s) that are not being shown?"
-- Just i will use row_number with column 1,col2. it will random priority. So the order may be change by time. So whenever i execute i will just compare one row by another row.
"they're not the first two based on data value of col3 (because DEF comes before GHI)"
-- Yes Correct.
Apologies, still its not explainable.
Thanks,
Anil.
Hi Anil,
I think the following will work for you. Note that I've changed your sample data slightly, but i think it is valid. Give it a go and see what happens.
First, set up the data:
CREATE SET VOLATILE TABLE first_table (key1 INTEGER ,key2 CHAR(1) ,col3 CHAR(3) ) PRIMARY INDEX(key1,key2) ON COMMIT PRESERVE ROWS; INSERT INTO first_table VALUES(1,'A','ABC'); INSERT INTO first_table VALUES(1,'A','GHI'); INSERT INTO first_table VALUES(1,'A','DEF'); INSERT INTO first_table VALUES(1,'A','MMM'); DROP TABLE ref_table; CREATE SET VOLATILE TABLE ref_table (col1 CHAR(3) ,col2 CHAR(3) ,winner CHAR(3) ) PRIMARY INDEX(col1,col2) ON COMMIT PRESERVE ROWS; DELETE FROM ref_table; INSERT INTO ref_table VALUES('ABC','DEF','DEF'); INSERT INTO ref_table VALUES('DEF','GHI','XYZ'); INSERT INTO ref_table VALUES('GHI','ABC','ABC');
For processing, I've then created a temp table (I use volatile, but that is your choice):
DROP TABLE vt_seq; CREATE SET VOLATILE TABLE vt_seq AS (SELECT key1,key2 ,col3 ,ROW_NUMBER() OVER(PARTITION BY key1,key2 ORDER BY col3) AS seqnbr FROM first_table) WITH DATA PRIMARY INDEX(key1,key2) ON COMMIT PRESERVE ROWS;
I've used row-number to sequence the data.
In your last post you say 'random priority'. Do you truly mean 'random' ? And if so how do you know that you get the correct answer?
Then run a recursive query using this temp table:
WITH RECURSIVE temp(key1,key2,r1_col3,r2_col3,seqnbr,winner,depth) AS (SELECT r1.key1 ,r1.key2 ,r1.col3 ,r2.col3 ,r2.seqnbr ,ref.winner ,0 AS depth FROM (vt_seq AS r1 INNER JOIN vt_seq AS r2 ON r1.key1 = r2.key1 AND r1.key2 = r2.key2 AND r1.seqnbr = r2.seqnbr-1) LEFT JOIN ref_table AS ref ON r1.col3 = ref.col1 AND r2.col3 = ref.col2 UNION ALL SELECT temp.key1 ,temp.key2 ,temp.r1_col3 ,r2.col3 ,r2.seqnbr ,ref.winner ,temp.depth+1 AS depth FROM (temp INNER JOIN vt_seq AS r2 ON temp.key1 = r2.key1 AND temp.key2 = r2.key2 AND temp.seqnbr = r2.seqnbr-1) LEFT JOIN ref_table AS ref ON temp.r1_col3 = ref.col1 AND r2.col3 = ref.col2 ) SEL * FROM temp WHERE winner IS NOT NULL QUALIFY seqnbr = MAX(seqnbr) OVER(PARTITION BY key1,key2);
Yes, the recursive temp table has too may columns in it but that was mainly so I could sanity check the results and data.
HTH
Dave
Hi,
Just realised that you'll probably need to change the seed query in the recursive processing to add in the following:
AND r1.seqnbr = 1
So the final query is:
WITH RECURSIVE temp(key1,key2,r1_col3,r2_col3,seqnbr,winner,depth) AS (SELECT r1.key1 ,r1.key2 ,r1.col3 ,r2.col3 ,r2.seqnbr ,ref.winner ,0 AS depth FROM (vt_seq AS r1 INNER JOIN vt_seq AS r2 ON r1.key1 = r2.key1 AND r1.key2 = r2.key2 AND r1.seqnbr = r2.seqnbr-1 AND r1.seqnbr = 1) LEFT JOIN ref_table AS ref ON r1.col3 = ref.col1 AND r2.col3 = ref.col2 UNION ALL SELECT temp.key1 ,temp.key2 ,temp.r1_col3 ,r2.col3 ,r2.seqnbr ,ref.winner ,temp.depth+1 AS depth FROM (temp INNER JOIN vt_seq AS r2 ON temp.key1 = r2.key1 AND temp.key2 = r2.key2 AND temp.seqnbr = r2.seqnbr-1) LEFT JOIN ref_table AS ref ON temp.r1_col3 = ref.col1 AND r2.col3 = ref.col2 ) SEL * FROM temp WHERE winner IS NOT NULL QUALIFY seqnbr = MAX(seqnbr) OVER(PARTITION BY key1,key2)
Cheers,
Dave