NEED HELP !! - Merging Consecutive and OVERLAPPING date spans

General

NEED HELP !! - Merging Consecutive and OVERLAPPING date spans

I have been struggling for about 6 months trying to find a teradata query that will combine consecutive date spans as well as identify overlapping date spans and then combine them into one span..and when there is a gap in spans it will create a new row and show that span and continue etc..so that in the end it will show consecutive and overlapping date spans as one row per span....I am new to Teradata so I will need some specific help in writing this query..Hopefully someone here has already done this and will be able to share the query with me....I do appreciate your help....

John

Tags (1)
25 REPLIES
Supporter

Re: NEED HELP !! - Merging Consecutive and OVERLAPPING date spans

can you share a ddl on which your problem is based?

Supporter

Re: NEED HELP !! - Merging Consecutive and OVERLAPPING date spans

to give you an idea - there might be more efficient way do do it...

generate some test data

create table date_range _test as (select calendar_Date from_dt , day_of _week + calendar_date to_dt from sys_calendar.calendar where calendar_date between date - 90 and date and random(1,10) < 3) with data primary index (from_dt) 

this would be

from_dt to_dt
2011-12-18 2011-12-19
2011-12-21 2011-12-25
2011-12-31 2012-01-07
2012-01-06 2012-01-12
2012-01-11 2012-01-15
2012-01-16 2012-01-18
2012-01-17 2012-01-20
2012-01-21 2012-01-28
2012-01-23 2012-01-25
2012-01-29 2012-01-30
2012-02-01 2012-02-05
2012-02-06 2012-02-08
2012-02-09 2012-02-14
2012-02-13 2012-02-15
2012-02-17 2012-02-23
2012-02-20 2012-02-22
2012-03-06 2012-03-09
2012-03-07 2012-03-11
2012-03-08 2012-03-13

will be interpreted as periods where to_dt is not included in the period (as in the period data type)

this SQL should do the trick on 13.10

select range, period(min(day_date), max(day_date)+1)
from
(
select day_date, sum(range_start_flag) over (order by day_date rows between unbounded preceding and current row) as range
from
(
select day_date, case when day_date-1 > min(day_date) over (order by day_date rows between 1 preceding and 1 preceding) then 1 when min(day_date) over (order by day_date rows between 1 preceding and 1 preceding) is null then 1 else 0 end as range_start_flag
from
(
select period(from_dt,to_dt) as date_period,begin(tmp) as day_date
from date_range_test
expand on date_period as tmp
) as t1
) as t2
) as t3
group by 1
order by 1

result is - sorry for the different date format, but didn't figured out how to change this right now. Again, period end is excluded...

range    combined_period

1    ('11/12/18', '11/12/19')                                    

2    ('11/12/21', '11/12/25')                                    

3    ('11/12/31', '12/01/15')                                    

4    ('12/01/16', '12/01/20')                                    

5    ('12/01/21', '12/01/28')                                    

6    ('12/01/29', '12/01/30')                                    

7    ('12/02/01', '12/02/05')                                    

8    ('12/02/06', '12/02/08')                                    

9    ('12/02/09', '12/02/15')                                    

10    ('12/02/17', '12/02/23')                                    

11    ('12/03/06', '12/03/13')                                    

Re: NEED HELP !! - Merging Consecutive and OVERLAPPING date spans

This is awesome..This is the ouput I got

1 ('2002-12-01', '2003-08-31')

2 ('2003-09-01', '2003-12-31')

3 ('2004-02-01', '2004-02-29')

4 ('2005-07-01', '2005-09-30')

5 ('2005-10-01', '2005-10-31')

6 ('2005-11-01', '2006-02-28')

7 ('2006-03-01', '2006-07-31')

8 ('2006-08-01', '2006-09-30')

9 ('2006-11-01', '2007-09-30')

10 ('2007-10-01', '2008-06-30')

11 ('2008-07-01', '2008-09-30')

12 ('2008-10-01', '2009-02-28')

13 ('2009-03-01', '2009-05-31')

14 ('2009-06-01', '2010-07-31')

15 ('2010-08-01', '2010-12-31')

16 ('2011-03-01', '9999-12-31')

However I need it to consolidate the consecutive spans and break and start a new row when the spans break

and I need the output in 2 seperate columns .....for instance....

EFFECTIVE      ENDDATE

12/1/2002      12/31/2003

2/1/2004        2/29/2004

7/1/2005        9/30/2006

11/1/2006      12/31/2010

3/1/2011        12/31/9999

If you can revise so that my output is consolidated and in two columns...we have a WINNER !!!

Thank you so much

Junior Contributor

Re: NEED HELP !! - Merging Consecutive and OVERLAPPING date spans

Hi Ulrich/John,

if you're on TD13.10 instead of EXPAND better use one of those hardly known, but quite powerful new functions. Then it's really easy :-)

WITH cte(grpcol,pd) AS
(
SELECT grpcol, PERIOD(from_dt,to_dt)
FROM date_range_test
)
SELECT *
FROM TABLE
( TD_SYSFNLIB.TD_NORMALIZE_OVERLAP_MEET(NEW VARIANT_TYPE(cte.grpcol), cte.pd)
RETURNS (grpcol INT, pd PERIOD(DATE), cnt INT)
HASH BY grpcol
LOCAL ORDER BY grpcol, pd
) AS dt
ORDER BY 1,2;

If you're on an older release you could do it with some nested OLAP functions, too, e.g.:

http://www.orafaq.com/usenet/comp.databases.theory/2004/12/04/0111.htm

There's a more current/efficient solution, too, but i couldn't find it right now...

Dieter

Junior Contributor

Re: NEED HELP !! - Merging Consecutive and OVERLAPPING date spans

Hi John,

based on your previous post you're on TD13.10, i just noticed that :-)

Periods include the starting date, but exclude the end date, you probably have to use PERIOD(from_dt, to_dt + 1).

And BEGIN(pd), LAST(pd) to get seperate columns again.

Dieter

Supporter

Re: NEED HELP !! - Merging Consecutive and OVERLAPPING date spans

Thanks Dieter,

these had been on my list to explore for some time. I am impressed.

Run this on a 1.3 mio row table.

The normed cpu is 1/3 of the solution I outlined. And the ellapsed time was even faster - on a single user machine.

The syntax need some time to get used to it but seems to be definitly worth.

Thanks again!

Junior Contributor

Re: NEED HELP !! - Merging Consecutive and OVERLAPPING date spans

Hi Ulrich,

yep, these functions are great, yet they're hardly known :-)

And the manuals don't provide any details.

When you get rid of the EXPAND (which might result in a potentially huge intermediate spool) your solution will be more efficient, too.

This is the solution i was talking about, which also runs pre-TD13, and performance is quite good: 

SELECT grpcol, from_dt, 
COALESCE(MAX(x)
OVER (PARTITION BY grpcol
ORDER BY from_dt
ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING), x2)
FROM
(
SELECT grpcol, from_dt,
MAX(to_dt)
OVER (PARTITION BY grpcol
ORDER BY from_dt,to_dt
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS x,
MAX(to_dt)
OVER (PARTITION BY grpcol) AS x2
FROM date_range_test
QUALIFY from_dt > x OR x IS NULL
) AS dt

Btw, using those period functions: when to_dt must be included it's not only PERIOD(from_dt, to_dt + 1) and LAST(pd) , the function must change to TD_NORMALIZE_OVERLAP, too.

Dieter

Enthusiast

Re: NEED HELP !! - Merging Consecutive and OVERLAPPING date spans

I'm having problems when trying to use the normalize funcitons with varchar grouping columns.

Given this table:

CREATE SET TABLE JLEE80.cte_base

(

Pstncd VARCHAR(8) CHARACTER SET LATIN NOT CASESPECIFIC,

EffPer PERIOD(DATE))

PRIMARY INDEX ( Pstncd )

;

This code generates an error ("The output grouping column list does not match with input grouping list):

WITH cte(pstncd, effper) AS

(

SELECT pstncd , effper

FROM cte_base

)

SELECT *

FROM TABLE

( TD_SYSFNLIB.TD_NORMALIZE_OVERLAP_meet(new Variant_type(cte.pstncd), cte.effper)

RETURNS (pstncd varchar(8), effper period (date))

HASH BY pstncd

LOCAL ORDER BY pstncd, effper

) AS DT

;

However, this code runs fine:

WITH cte(pstncd, effper) AS

(

SELECT cast(pstncd as int) , effper

FROM cte_base

)

SELECT *

FROM TABLE

( TD_SYSFNLIB.TD_NORMALIZE_OVERLAP_meet(new Variant_type(cte.pstncd), cte.effper)

RETURNS (pstncd int, effper period (date))

HASH BY pstncd

LOCAL ORDER BY pstncd, effper

) AS DT

;

Any clues as to what is happening?  Do the period normalize functions not support varchar grouping columns?

Thanks.

Enthusiast

Re: NEED HELP !! - Merging Consecutive and OVERLAPPING date spans

Dieter, I have a related problem where I'm attempting to aggregate the counts within the following rows where their dates overlap:

ID Count StartDate EndDate

1 1 31/10/2006 00:00:00 31/12/9999 23:59:59

1 1 17/08/2009 00:00:00 31/12/9999 23:59:59

Into the following output:

ID Count StartDate EndDate

1 1 31/10/2006 00:00:00 16/08/2009 23:59:59

1 2 17/08/2009 00:00:00 31/12/9999 23:59:59

Can you assist?