Need help to write an SQL to validate mutiple date ranges combined from more than one SCD type2 tables.

Database
N/A

Need help to write an SQL to validate mutiple date ranges combined from more than one SCD type2 tables.

Hi All,

I'm an ETL QA working on Teradata for last one year. I wanted to validate a view that is being derived from more than one SCD type2 tables. So, I have to create a date ranges based on the combination of dates available in the two(or more) tables. see the example below.

Table 1:

hid  h1name         h1_strt_dttm                  h1_end_dttm

5     value5          2/17/2008  18:19:50    2/16/2009 23:59:59:59

5     value5-upd1 2/17/2009  00:00:00    4/11/2011 23:59:59

5     value5-upd2 4/12/2011  00:00:00    3/5/2012 23:59:59

5     value5-upd3 3/6/2012   00:00:00    11/11/2012 23:59:59

5     value5-upd4 11/12/2012 00:00:00 12/31/9999 23:59:59

Table 2:

hid  h2name        h2_strt_dttm           h2_end_dttm

5   name5          3/7/2008 8:14:44     7/17/2010 23:59:59

5  name5-upd1 7/18/2010 0:00:00   1/1/2011 23:59:59

5  name5-upd2 1/2/2011 0:00:00     3/5/2013 23:59:59

5  name5-upd3 3/6/2013 0:00:00     12/31/9999 23:59:59

Result Set : 

id   strt_dttm                      end_dttm                 h1name               h2name

5   2/17/2008 18:19:50    3/7/2008 8:14:43       value5                  null

5   3/7/2008 8:14:44        2/16/2009 23:59:59   value5                name5

5   2/17/2009 00:00:00    7/17/2010 23:59:59   value5-upd1       name5

5   7/18/2010 00:00:00    1/1/2011 23:59:59     value5-upd1  name5-upd1

5   1/2/2011 00:00:00      4/11/2011 23:59:59   value5-upd1  name5-upd2

5   4/12/2011 00:00:00    3/5/2012 23:59:59     value5-upd2  name5-upd2

5   3/6/2012 00:00:00      11/11/2012 23:59:59 value5-upd3  name5-upd2

5   11/12/2012 00:00:00  3/5/2013 23:59:59     value5-upd4  name5-upd2

5   3/6/2013 00:00:00      1/1/9999 00:00:00     value5-upd4  name5-upd3

Steps to achieve this : My approach:

1) Take a union of all the available STRT_DTTM field values in the two(or more) tables.

2) Distribute/recreate the date ranges from the above dates available in step 1. (as start date and and end date) 

3) Then try to identify the CDC fields(h1name & h2name ) with in the newly created date ranges and put them in the result set.

Queries :

Step 1: 

sel h1.hid hid , h1_strt_dttm dtm 

from  HIST1_DAR h1

union

sel h2.hid hid ,h2_strt_dttm dtm

from  HIST2_DAR h2

Step 2:

/***************  Step 2 ********************/

WITH ALL_DATE_CTE (hid,dttm,strt_dttm,end_dttm) AS (

sel hid, 

dtm+INTERVAL '0.000001' SECOND  as dttm ,

dttm-INTERVAL '0.000001' SECOND as strt_dttm,

coalesce(min(strt_dttm-INTERVAL '1' SECOND) over (partition by hid 

order by dttm  ROWS BETWEEN 1 following 

AND 1 following ), cast('9999-01-01 00:00:00.000000' as timestamp(6)) ) as end_dttm

from 

(

/***************  Step 1 ********************/ 

sel h1.hid hid , h1_strt_dttm dtm 

from  DBNAME.HIST1_DAR h1

union 

sel h2.hid hid ,h2_strt_dttm dtm

from  DBNAME.HIST2_DAR h2

/***************  Step 1 ********************/  ) a

order by 1,2  )

/**************** Step 2********************/


Step 3:

/**************** Step 3********************/

sel

h1.hid id,

cte.strt_dttm,

cte.end_dttm,

h1.h1name, 

h1_strt_dttm , h1_end_dttm,

h2.h2name,

h2_strt_dttm,h2_end_dttm

from DBNAME.HIST1_DAR h1

inner join ALL_DATE_CTE cte

on h1.hid = cte.hid

and cte.dttm between h1_strt_dttm and h1_end_dttm

inner join DBNAME.HIST2_DAR h2

on cte.hid = h2.hid

and cte.dttm between h2_strt_dttm and h2_end_dttm

where h1.hid = 5;

/**************** Step 3********************/

Now the issue is with Step 2: I'm unable to use order by in sub query as well as in CTE. with ordering the start date I'm unable to get the date ranges listed properly. 

Can anyone please guide me for a solution or an other simple approach. Please note as I'm a QA I don't have create table access on most of the databases. So, any suggestion using select would be more appreciable. 

 I have included the DDL & DML that I have used for the table creation. 

create table DBNAME. hist1_dar ( hid integer , h1name varchar(20),  h1_strt_dttm timestamp(6),  h1_end_dttm timestamp(6));

create table DBNAME. hist2_dar ( hid integer , h2name varchar(20), h2_strt_dttm timestamp(6), h2_end_dttm timestamp(6));

ins DBNAME. hist1_dar (1, 'value1' , '1987-07-02 08:30:00.000000', '1989-12-02 23:59:59.000000');

ins DBNAME. hist1_dar (2, 'value2' , '1997-09-12 05:20:10.000000' , '1999-12-12 23:59:59.000000');

ins DBNAME. hist1_dar (3, 'value3' , '2000-06-30 07:10:50.000000', '2001-02-10 23:59:59.000000');

ins DBNAME. hist1_dar (4, 'value4' , '2006-05-23 23:32:13.000000', '2007-07-03 23:59:59.000000');

ins DBNAME. hist1_dar (5, 'value5' , '2009-02-17 18:19:50.000000', '2011-04-11 11:09:59.000000' );

ins DBNAME. hist1_dar (1, 'value1-upd1' , '1990-01-01 01:01:01.000000', '9999-12-31 23:59:59.000000');

ins DBNAME. hist1_dar (2, 'value2-upd1' , '2000-01-01 01:01:01.000000' , '9999-12-31 23:59:59.000000');

ins DBNAME. hist1_dar (3, 'value3-upd1' , '2001-02-11 01:01:01.000000', '2002-07-12 23:59:59.000000');

ins DBNAME. hist1_dar (3, 'value3-upd2' , '2002-07-13 01:01:01.000000', '9999-12-31 23:59:59.000000');

ins DBNAME. hist1_dar (4, 'value4-upd1' , '2007-07-03 01:01:01.000000' , '2007-10-17 23:59:59.000000' );

ins DBNAME. hist1_dar (4, 'value4-upd2' , '2007-10-18 01:01:01.000000' , '2007-12-23 23:59:59.000000' );

ins DBNAME. hist1_dar (4, 'value4-upd3' , '2007-12-24 01:01:01.000000' , '9999-12-31 23:59:59.000000');

ins DBNAME. hist1_dar (5, 'value5-upd1' , '2009-02-17 01:01:01.000000', '2011-04-11 23:59:59.000000' );

ins DBNAME. hist1_dar (5, 'value5-upd2' , '2011-04-11 01:01:01.000000', '2012-03-05 23:59:59.000000' );

ins DBNAME. hist1_dar (5, 'value5-upd3' , '2012-03-05 01:01:01.000000', '2012-11-11 23:59:59.000000' );

ins DBNAME. hist1_dar (5, 'value5-upd4' , '2012-11-12 01:01:01.000000' , '9999-12-31 23:59:59.000000');

ins DBNAME. hist2_dar (1, 'name1' , '1988-08-12 18:20:10.000000', '1990-03-02 23:59:59.000000');

ins DBNAME. hist2_dar (2, 'name2' , '1993-05-19 15:12:30.000000', '1999-11-01 23:59:59.000000');

ins DBNAME. hist2_dar (3, 'name3' , '2002-09-20 17:19:57.000000', '2003-04-10 23:59:59.000000');

ins DBNAME. hist2_dar (4, 'name4' , '2004-04-18 13:38:23.000000' , '2005-07-28 23:59:59.000000');

ins DBNAME. hist2_dar (5, 'name5' , '2008-03-07 08:14:44.000000', '2010-07-17 23:59:59.000000');

ins DBNAME. hist2_dar (1, 'name1-upd1' , '1990-03-03 01:01:01.000000', '9999-12-31 23:59:59.000000');

ins DBNAME. hist2_dar (2, 'name2-upd1' , '1999-11-02 01:01:01.000000', '9999-12-31 23:59:59.000000');

ins DBNAME. hist2_dar (3, 'name3-upd1' , '2003-04-11 01:01:01.000000' , '9999-12-31 23:59:59.000000');

ins DBNAME. hist2_dar (4, 'name4-upd1' , '2005-07-28 21:28:56.000000' , '2008-08-19 23:59:59.000000' );

ins DBNAME. hist2_dar (4, 'name4-upd2' , '2008-08-20 01:01:01.000000' , '2007-12-23 23:59:59.000000' );

ins DBNAME. hist2_dar (5, 'name5-upd1' , '2010-07-18 01:01:01.000000', '2011-01-01 23:59:59.000000' );

ins DBNAME. hist2_dar (5, 'name5-upd2' , '2011-01-02 01:01:01.000000', '2013-03-05 23:59:59.000000' );

ins DBNAME. hist2_dar (5, 'name5-upd3' , '2013-03-06 01:01:01.000000', '9999-12-31 23:59:59.000000' );


6 REPLIES
Teradata Employee

Re: Need help to write an SQL to validate mutiple date ranges combined from more than one SCD type2 tables.

Would it be feasible to write an SP for you/your team?

N/A

Re: Need help to write an SQL to validate mutiple date ranges combined from more than one SCD type2 tables.

Not sure about it but I will try to get access for the same if I get a soultion by writing a Stored Procedure. If you have any inputs please let me know.

N/A

Re: Need help to write an SQL to validate mutiple date ranges combined from more than one SCD type2 tables.

What's your Teradata release?

N/A

Re: Need help to write an SQL to validate mutiple date ranges combined from more than one SCD type2 tables.

VERSION 14.10.03.04

N/A

Re: Need help to write an SQL to validate mutiple date ranges combined from more than one SCD type2 tables.

Your INSERTs don't match your example.

If your step #2 returns the correct data you can simply add a LAST_VALUE:

WITH cte (
hid
,strt_dttm
,end_dttm
,h1name
,h2name
)
AS (
SELECT hid
,dttm
,coalesce(min(dttm - INTERVAL '1' SECOND) OVER (
PARTITION BY hid ORDER BY dttm ROWS BETWEEN 1 following
AND 1 following
), TIMESTAMP '9999-01-01 00:00:00') AS end_dttm
,h1name
,h2name
FROM (
/*************** Step 1 ********************/
SELECT h1.hid hid
,h1_strt_dttm dttm
,h1name
,cast(NULL AS VARCHAR(20)) AS h2name
FROM HIST1_DAR h1

UNION ALL

SELECT h2.hid hid
,h2_strt_dttm dttm
,cast(NULL AS VARCHAR(20))
,h2name
FROM HIST2_DAR h2
/*************** Step 1 ********************/
) a
--order by 1,2
)
SELECT hid
,strt_dttm
,end_dttm
,last_value(h1name ignore nulls) OVER (
PARTITION BY hid ORDER BY strt_dttm
)
,last_value(h2name ignore nulls) OVER (
PARTITION BY hid ORDER BY strt_dttm
)
FROM cte
N/A

Re: Need help to write an SQL to validate mutiple date ranges combined from more than one SCD type2 tables.

Thanks a lot for your reply. It's working. Apologies for the incorrect insert statements.