Hi all,
I am working with a table that contains versions of loan contracts. Any change in an attribute leads to a new row/version. However, I am only interested in a part of the attributes. Therefore I would like to compare those interesting attributes with the previous row to see if there is a change. If not, I would like to ignore that version.
In the example below, I am only interested in changes to the limit so I want to select rows 2,3 and 4, but not 5 and 6.
I tried to achieve this with QUALIFY ROW_NUMBER() OVER (PARTITION BY Loan_ID, Limit ORDER BY Start_date) = 1,
but because row 2 and 4 are the same, it did not select row 4.
Any ideas?
Solved! Go to Solution.
If you also want to see the first row from your data set (where there is no previous row), try:
SELECT a.* FROM vt1 AS a QUALIFY contract_limit <> FIRST_VALUE(contract_limit) OVER(ORDER BY start_date ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) OR FIRST_VALUE(contract_limit) OVER(ORDER BY start_date ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) IS NULL ORDER BY 1,2;
Dave
Hi,
Try this:
SELECT a.* FROM vt1 AS a QUALIFY contract_limit <> FIRST_VALUE(contract_limit) OVER(ORDER BY start_date ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) ORDER BY 1,2;
Cheers,
Dave
The answer above is only looking at the 'limit' column, extend it to look at other columns.
HTH
Dave
If you also want to see the first row from your data set (where there is no previous row), try:
SELECT a.* FROM vt1 AS a QUALIFY contract_limit <> FIRST_VALUE(contract_limit) OVER(ORDER BY start_date ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) OR FIRST_VALUE(contract_limit) OVER(ORDER BY start_date ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) IS NULL ORDER BY 1,2;
Dave
In your SQL, you need to use your version begin and end dates to insure you are only comparing a new version to the previous version, and not all other versions.
I don't know about 'previous version' the original question was about the previous row (when sorted by start_date) and that is what the "order by start_date" is intended to do.
Cheers,
Dave
Well, how do you apply versioningso it will join to other tables? If you use a version begin and end date range, the previous version should end just prior to the new version beginning (see below). Good luck
acct_id dw_beg_dt dw_end_dt
TM6525395 1111-01-01 2007-03-31
TM6525395 2007-04-01 2007-06-30
TM6525395 2007-07-01 2008-07-31
TM6525395 2008-08-01 5555-12-31
Hi,
I quite agree with you, applying a 'version' to each row is common and probably necessary for joins - as you mention.
I was simply trying to respond to the original question without making assumptions about what other columns there might be in the table.
Let's see whether the original problem is solved, and if not we can take it from there.
Cheers,
Dave
Thank you! This answer was the clue I needed.
I had never heard about FIRST_VALUE before. It took me all day and the discovery of LAST_VALUE and IGNORE NULLS to piece together the complete solution but I got it.
Albeit an inefficient one I suspect, using two intermediate tables. Any suggestion how to improve the select below?
CREATE TABLE vcontract_status,FALLBACK,NO BEFORE JOURNAL,NO AFTER JOURNAL( Loan_ID INTEGER NOT NULL ,Contract_status INTEGER ,Irrelevant_attribute CHAR(1) ,Contract_status_start_date DATE ,Contract_status_end_date DATE)PRIMARY INDEX( Loan_ID ); INSERT INTO vcontract_status VALUES (1,10,'A',1170101,1170131); INSERT INTO vcontract_status VALUES (1,20,'B',1170201,1170228); INSERT INTO vcontract_status VALUES (1,10,'C',1170301,1170331); INSERT INTO vcontract_status VALUES (1,10,'D',1170401,NULL); INSERT INTO vcontract_status VALUES (22,10,'A',1170101,1170331); INSERT INTO vcontract_status VALUES (22,20,'B',1170401,NULL); CREATE TABLE vcontract_limit,FALLBACK,NO BEFORE JOURNAL,NO AFTER JOURNAL( Loan_ID INTEGER NOT NULL ,Contract_limit INTEGER ,Irrelevant_attribute CHAR(1) ,Contract_limit_start_date DATE ,Contract_limit_end_date DATE)PRIMARY INDEX( Loan_ID ); INSERT INTO vcontract_limit VALUES (1,1000,'X',1170101,1170214); INSERT INTO vcontract_limit VALUES (1,2000,'X',1170215,1170218); INSERT INTO vcontract_limit VALUES (1,1500,'X',1170219,1170408); INSERT INTO vcontract_limit VALUES (1,1500,'Y',1170409,NULL); INSERT INTO vcontract_limit VALUES (22,500,'Z',1170409,NULL); WITH cte2 AS( SELECT * FROM cte1 QUALIFY contract_status <> FIRST_VALUE(contract_status) OVER(PARTITION BY loan_id ORDER BY start_date ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) OR contract_limit <> FIRST_VALUE(contract_limit) OVER(PARTITION BY loan_id ORDER BY start_date ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) OR ROW_NUMBER() OVER(PARTITION BY loan_id ORDER BY start_date) = 1 OR (contract_status IS NOT NULL AND FIRST_VALUE(contract_status) OVER(PARTITION BY loan_id ORDER BY start_date ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) IS NULL) OR (contract_limit IS NOT NULL AND FIRST_VALUE(contract_limit) OVER(PARTITION BY loan_id ORDER BY start_date ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) IS NULL)),cte1 AS( SELECT COALESCE(t1.loan_id,t2.loan_id) AS Loan_ID ,COALESCE(t1.contract_status_start_date,t2.contract_limit_start_date) AS Start_date ,COALESCE(contract_status,LAST_VALUE(contract_status ignore NULLS) OVER(PARTITION BY COALESCE(t1.loan_id,t2.loan_id) ORDER BY COALESCE(t1.contract_status_start_date,t2.contract_limit_start_date) ROWS BETWEEN 99 PRECEDING AND 1 PRECEDING)) AS contract_status ,COALESCE(contract_limit,LAST_VALUE(contract_limit ignore NULLS) OVER(PARTITION BY COALESCE(t1.loan_id,t2.loan_id) ORDER BY COALESCE(t1.contract_status_start_date,t2.contract_limit_start_date) ROWS BETWEEN 99 PRECEDING AND 1 PRECEDING)) AS contract_limit FROM vcontract_status t1 FULL OUTER JOIN vcontract_limit t2 ON t1.loan_id= t2.loan_id AND contract_status_start_date = contract_limit_start_date)SELECT loan_id,contract_status,contract_limit,start_date ,MAX(start_date) OVER (PARTITION BY Loan_ID ORDER BY start_date ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING)-1 AS end_date FROM cte2 ORDER BY 1,4