Grouping Contiguous Data

General
Enthusiast

Grouping Contiguous Data

I have the following table:

CREATE SET TABLE hcclnc_ushare.O884450_table1 ,NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT
(
per_name VARCHAR(11) CHARACTER SET UNICODE NOT CASESPECIFIC,
start_time TIMESTAMP(0),
end_time TIMESTAMP(0))
PRIMARY INDEX ( per_name );


With the following data:

INSERT INTO hcclnc_ushare.O884450_table1 VALUES ('Jacob Jones', '2012-01-01 15:28:00', '2012-01-01 21:04:00');
INSERT INTO hcclnc_ushare.O884450_table1 VALUES ('Jacob Jones', '2012-01-02 19:34:00', '2012-01-02 22:50:00');
INSERT INTO hcclnc_ushare.O884450_table1 VALUES ('Jacob Jones', '2012-01-03 14:34:00', '2012-01-10 18:04:00');
INSERT INTO hcclnc_ushare.O884450_table1 VALUES ('Jacob Jones', '2012-01-10 14:12:00', '2012-01-11 18:55:00');
INSERT INTO hcclnc_ushare.O884450_table1 VALUES ('Jacob Jones', '2012-01-11 18:55:00', '2012-01-13 11:35:00');
INSERT INTO hcclnc_ushare.O884450_table1 VALUES ('Jacob Jones', '2012-01-14 10:25:00', '2012-01-15 22:25:00');
INSERT INTO hcclnc_ushare.O884450_table1 VALUES ('Jacob Jones', '2012-01-14 12:25:00', NULL);


per_name     start_time                     end time

Jacob Jones  2012-01-01 15:28:00   2012-01-01 21:04:00

Jacob Jones  2012-01-02 19:34:00   2012-01-02 22:50:00

Jacob Jones  2012-01-03 14:34:00   2012-01-10 18:04:00

Jacob Jones  2012-01-10 14:12:00   2012-01-11 18:55:00

Jacob Jones  2012-01-11 18:55:00   2012-01-13 11:35:00

Jacob Jones  2012-01-14 10:25:00   2012-01-15 22:25:00

Jacob Jones  2012-01-14 12:25:00   NULL

I need to identify the contiguous time periods out of this set. So for this example, I want to end up with:

Event     per_name      start_time                     end_time

Event 1 Jacob Jones 2012-01-01 15:28:00  2012-01-01 21:04:00

Event 2 Jacob Jones 2012-01-02 19:34:00  2012-01-02 22:50:00

Event 3 Jacob Jones 2012-01-03 14:34:00  2012-01-13 11:35:00

Event 4 Jacob Jones 2012-01-14 10:25:00  NULL

Thoughts on how to approach this?

Thanks,

Kevin

6 REPLIES
Enthusiast

Re: Grouping Contiguous Data

Enthusiast

Re: Grouping Contiguous Data

I have an initial iteration with an assumption that the data is unique on name and starttime, the below sql should give a starting poing wherever the succesive lines' time overlaps they are flagged as 'Y' . For these rows replace the endtime with the nextline_endtime and start over until there are no "y"s. 

 select evt, nm, sttime, endtime, max(sttime) over (partition by nm order by nm, sttime rows between 1 following and 1 following) nextline_sttime,

max(endtime) over (partition by nm order by nm, sttime rows between 1 following and 1 following) nextline_endtime,

case when (sttime, endtime) overlaps (nextline_sttime, nextline_endtime) then 'Y' else 'N' end next_time_overlaps

from dev.t1








evt nm sttime endtime nextline_sttime nextline_endtime next_time_overlaps
evt1 Jacob Jones 1/1/2012 15:28 1/1/2012 21:04 1/2/2012 19:34 1/2/2012 22:50 N
evt2 Jacob Jones 1/2/2012 19:34 1/2/2012 22:50 1/3/2012 14:34 1/10/2012 18:04 N
evt3 Jacob Jones 1/3/2012 14:34 1/10/2012 18:04 1/10/2012 14:12 1/11/2012 18:55 Y
evt4 Jacob Jones 1/10/2012 14:12 1/11/2012 18:55 1/11/2012 18:55 1/13/2012 11:35 N
evt5 Jacob Jones 1/11/2012 18:55 1/13/2012 11:35 1/14/2012 10:25 1/15/2012 22:25 N
evt6 Jacob Jones 1/14/2012 10:25 1/15/2012 22:25 1/14/2012 12:25 ? Y
evt7 Jacob Jones 1/14/2012 12:25 ? ? ? N
Enthusiast

Re: Grouping Contiguous Data

Thanks, d3V1L, for the link to the related topic. It got me almost there. The only difference is that the solution there didn't give me a null end_time for the last event but instead displayed the maximum for the previous record. However, using a coalesce with current the current time gave me what I needed. My final code looked like this:

SELECT per_name, start_time, 
COALESCE(MAX(x)
OVER (PARTITION BY per_name
ORDER BY start_time
ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING), x2)
FROM
(
SELECT per_name, start_time, end_time,
CURRENT_DATE (FORMAT 'YYYY-MM-DD') (CHAR(10), UC) AS c_date,
CAST(c_date || ' ' || '23:59:59' AS TIMESTAMP(0)) AS c_time,
MAX(end_time)
OVER (PARTITION BY per_name
ORDER BY start_time,end_time
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS x,
MAX(COALESCE(end_time,c_time))
OVER (PARTITION BY per_name) AS x2
FROM hcclnc_ushare.O884450_table1
QUALIFY start_time > x OR x IS NULL
) AS dt

which gave me the following result:

per_name start_time end_time

Jacob Jones 2012-01-01 15:28:00 2012-01-01 21:04:00

Jacob Jones 2012-01-02 19:34:00 2012-01-02 22:50:00

Jacob Jones 2012-01-03 14:34:00 2012-01-13 11:35:00

Jacob Jones 2012-01-14 10:25:00 2012-06-28 23:59:59

Even though it doesn't give me null in the last end_time it gives me what I need. For some reason I can't simply use current_timestamp but the construct I am using for c_time seems to work fine.

-Kevin

Senior Apprentice

Re: Grouping Contiguous Data

Hi Kevin,

why do you think you can't use current_timestamp?

If it's because of the time zone or the fractional digits?

This will get the same datatype as your column:

CAST(CURRENT_TIMESTAMP(0) AS TIMESTAMP(0))

And getting a NULL is just a NULLIF:

NULLIF(COALESCE(MAX(x)

           OVER (PARTITION BY per_name

                 ORDER BY start_time

                 ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING), x2), CURRENT_TIMESTAMP(0))

If your data is guaranteed to always have a NULL in the last row it would be much easier.

No need to calculate x2 and no NULLIF/COALESCE, it's just a plain:

MAX(x)

           OVER (PARTITION BY per_name

                 ORDER BY start_time

                 ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING)

Dieter

Enthusiast

Re: Grouping Contiguous Data

Hi,

Could anyone let me know what is the meaning of this error I am getting while trying to run the TD_NORMALIZE_OVERLAP_MEET

Cannot determine the best-fit function for ''TD_NORMALIZE_OVERLAP_MEET''

I have used the same syntax given by Dieter here

http://forums.teradata.com/forum/general/need-help- merging-consecutive-a...

but still its not working.

Thanks

Manik

Enthusiast

Re: Grouping Contiguous Data

Why I can't use TD_SYSFNLIB.TD_NORMALIZE_OVERLAP_MEET?And I also not fund TD_SYSFNLIB(user or database?).My database version is 13.Thks for anyone who answer me.