Database
Highlighted
Enthusiast

## Finding Continuos Span

Hi,

I have a table with the following data.I want to find the emp nbr's start date and end date.If there is a gap in span which exceeds more than 3 years than i want to find the most continuos span

In the below example the emp was active continuosly from 1/1/2005 to 06/30/2008 with no breaks so the results should be Start Date = 01/01/2005 , End Date = 06/30/2008
Emp Nbr Start Date End Date
1 01/01/2005 03/31/2005
1 04/01/2005 12/31/2005
1 01/01/2006 06/30/2006
1 07/01/2006 12/31/2006
1 01/01/2007 12/31/2007
1 01/01/2008 06/30/2008

In the below example the emp was active from 1/1/1998 to 06/30/2008.There was a break in span
from 12/31/2001 to 01/01/2005 which is more than 3 years so the results should be Start Date = 01/01/2005 , End Date = 06/30/2008
Emp Nbr Start Date End Date
1 01/01/1998 12/31/2001
1 01/01/2005 03/31/2005
1 04/01/2005 12/31/2005
1 01/01/2006 06/30/2006
1 07/01/2006 12/31/2006
1 01/01/2007 12/31/2007
1 01/01/2008 06/30/2008

In the below example the emp was active from 1/1/2002 to 06/30/2008.There was a break in span
from 12/31/2003 to 01/01/2005 which is less than 3 years so the results should be Start Date = 01/01/2003 , End Date = 06/30/2008
Emp Nbr Start Date End Date
1 01/01/2002 12/31/2003
1 01/01/2005 03/31/2005
1 04/01/2005 12/31/2005
1 01/01/2006 06/30/2006
1 07/01/2006 12/31/2006
1 01/01/2007 12/31/2007
1 01/01/2008 06/30/2008

Please let me know how to acheive the above results in a single query in teradata

Thanks
3 REPLIES 3
Enthusiast

## Re: Finding Continuos Span

To be honest I am little confused with requirements (may be because these are the moth hours for me...)

In the second sample shouldn't you be picking the range

01/01/1998 12/31/2001 which is 4 yrs compared to

Start Date = 01/01/2005 , End Date = 06/30/2008
which is only 3.5 yrs ?? so the longest duration is the first one ?

And in the third example your probably meant Start Date = 01/01/2002 , End Date = 06/30/2008 ?? not Start Date = 01/01/2003 , End Date = 06/30/2008

Assuming these are typos, following is a crap query which I hadn't had the time to cleanup much :)

SELECT Y.EMP_NBR, C4.CALENDAR_DATE START_DATE, Y.END_DATE
FROM
(

SELECT X.EMP_NBR, X.END_DATE ,MUL,C3.DAY_OF_CALENDAR
,SUM(MUL*(C1.DAY_OF_CALENDAR - C2.DAY_OF_CALENDAR)) OVER(PARTITION BY EMP_NBR ORDER BY START_DATE ROWS UNBOUNDED PRECEDING)
+(C3.DAY_OF_CALENDAR - C1.DAY_OF_CALENDAR) TOTDAYS
FROM
(
SELECT EMP_NBR
,START_DATE
,END_DATE
,COALESCE(MAX(D.START_DATE) OVER (PARTITION BY D.EMP_NBR ORDER BY D.START_DATE ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING), START_DATE) PREV_START_DATE
,COALESCE(MAX(D.END_DATE) OVER (PARTITION BY D.EMP_NBR ORDER BY D.START_DATE ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING), END_DATE ) PREV_END_DATE
,CASE WHEN ADD_MONTHS(PREV_END_DATE, 36) > D.START_DATE THEN 1 ELSE 0 END MUL
FROM DATA003 D
) X
INNER JOIN
SYS_CALENDAR.CALENDAR C1
ON X.START_DATE = C1.CALENDAR_DATE
INNER JOIN
SYS_CALENDAR.CALENDAR C2
ON X.PREV_START_DATE = C2.CALENDAR_DATE
INNER JOIN
SYS_CALENDAR.CALENDAR C3
ON X.END_DATE = C3.CALENDAR_DATE
) Y

INNER JOIN
SYS_CALENDAR.CALENDAR C4
ON Y.DAY_OF_CALENDAR-Y.TOTDAYS = C4.DAY_OF_CALENDAR
QUALIFY RANK() OVER(PARTITION BY EMP_NBR ORDER BY TOTDAYS DESC) = 1
ORDER BY EMP_NBR, START_DATE
;

Emp_Nbr START_DATE END_DATE
1 01/01/2005 06/30/2008
2 01/01/1998 12/31/2001
3 01/01/2002 06/30/2008
Junior Contributor

## Re: Finding Continuos Span

Hi Samit,
this solution is quite similar to Joe's:
SELECT
emp,
MIN(start_date) AS start_d,
MAX(end_date) AS end_d
FROM
(
SELECT
emp,
start_date,
end_date,
-- check if the previous date passes your condition of "no gap > 3 years" and do a running total calculating a "grp" number of similar rows
SUM(CASE WHEN ADD_MONTHS(prev_end_date, 36) < start_date THEN 1 ELSE 0 END) OVER
(PARTITION BY emp
ORDER BY start_date
ROWS UNBOUNDED PRECEDING) AS grp
FROM
(
SELECT
emp,
start_date,
end_date,
MIN(end_date) OVER
(PARTITION BY emp
ORDER BY end_date
ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS prev_end_date
FROM dropme
) dt
) dt
GROUP BY emp, grp
QUALIFY
RANK() OVER (
PARTITION BY emp
ORDER BY end_d - start_d DESC) = 1

Using that approach you can solve almost any problem with OLAP functions :-)

Dieter
Enthusiast

Thank You guys