Find average time of day

Analytics
Enthusiast

Find average time of day

I have code with hours of day. I want to find the average of day but if one time is hour 22 another is 2 then the average with be 12 noon but I want to get 0 midnight as it is the closest time between the two hours. How do I find the average and take midnight wrap around into account???

select purchase,

CAST(AVG(EXTRACT(HOUR FROM opened_at)) AS INT) as average_open
from db.purchasetable
group by 1

Tags (1)
5 REPLIES
Junior Contributor

Re: Find average time of day

You can't do an average on time, but it's allowed for interval :-)

Subtracting two times return an interval:

AVG(CAST(opened_at AS TIME(0))            -- extract only the time portion
- TIME '12:00:00' HOUR TO SECOND(0)) -- subtract 12 hours to center at midnight (returns an interval)
+ TIME '00:00:00' -- change it back to a time

If you still want an hour as result you can simply extract it.

Enthusiast

Re: Find average time of day

Hi Dieter,

I've been looking into this more and I have a workaround which involves a time reference table.

There are points in a plot which map out 1-24 hours in a circle.

The x and y values give the mapping for the plot.

This is the insert for that mapping:

insert into sandbox.jh_time_ref values (18,0,0.5);
insert into sandbox.jh_time_ref values (17,0.0238095238095238,0.357142857142857);
insert into sandbox.jh_time_ref values (16,0.0714285714285714,0.238095238095238);
insert into sandbox.jh_time_ref values (15,0.142857142857143,0.142857142857143);
insert into sandbox.jh_time_ref values (14,0.238095238095238,0.0714285714285714);
insert into sandbox.jh_time_ref values (13,0.357142857142857,0.0238095238095238);
insert into sandbox.jh_time_ref values (12,0.5,0);
insert into sandbox.jh_time_ref values (11,0.642857142857143,0.0238095238095238);
insert into sandbox.jh_time_ref values (10,0.761904761904762,0.0714285714285714);
insert into sandbox.jh_time_ref values (9,0.857142857142857,0.142857142857143);
insert into sandbox.jh_time_ref values (8,0.928571428571428,0.238095238095238);
insert into sandbox.jh_time_ref values (7,0.976190476190476,0.357142857142857);
insert into sandbox.jh_time_ref values (6,1,0.5);
insert into sandbox.jh_time_ref values (5,0.976190476190476,0.642857142857143);
insert into sandbox.jh_time_ref values (4,0.928571428571428,0.761904761904762);
insert into sandbox.jh_time_ref values (3,0.857142857142857,0.857142857142857);
insert into sandbox.jh_time_ref values (2,0.761904761904762,0.928571428571428);
insert into sandbox.jh_time_ref values (1,0.642857142857143,0.976190476190476);
insert into sandbox.jh_time_ref values (0,0.5,1);
insert into sandbox.jh_time_ref values (23,0.357142857142857,0.976190476190476);
insert into sandbox.jh_time_ref values (22,0.238095238095238,0.928571428571428);
insert into sandbox.jh_time_ref values (21,0.142857142857143,0.857142857142857);
insert into sandbox.jh_time_ref values (20,0.0714285714285714,0.761904761904762);
insert into sandbox.jh_time_ref values (19,0.0238095238095238,0.642857142857143);

Then I reference this table like so:

select avg(x),avg(y)
from sandbox.jh_time_tests2 a
left join sandbox.jh_time_ref b on b.hours=CAST(EXTRACT(HOUR FROM open_ts) AS INT)

The test table is simply two inserts in a table:

insert into sandbox.jh_time_tests2 values ('2015-04-26 22:00:00','2015-04-26');
insert into sandbox.jh_time_tests2 values ('2015-04-26 02:00:00','2015-04-26');

The result of the averaging gives 

Average(x) Average(y)

0.5            0.93

When plotting this additional point on the time reference plot it is exactly where I want it to be.

Although, from the back of this I have another question.

Is there a way of finding the closest point from this new point to the points in the reference table?


Junior Contributor

Re: Find average time of day

Hi John,

Getting the closest point is a bit tricky, how do you define "closest"?

What result should be returned for hours 10 and 14 (or 6 and 18), Midnight or noon?

What's your TD release, are the geospatial extensions available?

SELECT TOP 1 tr.hours, NEW ST_GEOMETRY('ST_Point',tr.x,tr.y).ST_Distance(p) AS distance
FROM jh_time_ref AS tr
CROSS JOIN
(
SELECT NEW ST_GEOMETRY('ST_Point',AVG(x),AVG(y)) AS p
FROM jh_time_tests2 a
LEFT JOIN jh_time_ref b ON b.hours=CAST(EXTRACT(HOUR FROM open_ts) AS INT)
) AS dt
ORDER BY distance

Btw, plotting your 24 hours results in sinus/cosinus curves, so you could replace the join to jh_time_ref with some calculations :-)

SELECT TOP 1 tr.hours, NEW ST_GEOMETRY('ST_Point',tr.x,tr.y).ST_Distance(p) AS distance
FROM jh_time_ref AS tr
CROSS JOIN
(
SELECT NEW ST_GEOMETRY('ST_Point',AVG(SIN(RADIANS(EXTRACT(HOUR FROM open_ts) *15)) * 0.5 + 0.5 )
,AVG(COS(RADIANS(EXTRACT(HOUR FROM open_ts) *15)) * 0.5 + 0.5)) AS p
FROM jh_time_tests2 a
) AS dt
ORDER BY distance

With SIN/COS you could include minutes and seconds in your calculation, too...


Enthusiast

Re: Find average time of day

Hi Dieter,

Your second code snippet is very helpful.

I've added minutes and it's coming back with the expected results.

In regards to the distance, I didn't explain that well.

I'm not looking for the distance between two given points. I'm looking for the point at which a line intersects with the circle given the start point as being the center of the circle and the second point being the average or all ther other points.

So, if I have a timestamp with 22:00:00 (-.5,.87), a second with 02:00:00(.5,.87). Then the midpoint will be (0,.87). However, this does not plot with a point on the circle. In this example I know the point should be (0,1), but is there a formula which gives this intersection?

Anyway, I'm not expecting you to answer this, as your points above have been very helpful!

Thanks a lot Dieter, you're a legend! 

Junior Contributor

Re: Find average time of day

Hi John,

the distance between two points can easily be calculated using Pythagoras' theorem, no need for geospatial :-)

SELECT
tr.hours
,SQRT(((tr.x-dt.x)**2) + ((tr.y-dt.y)**2)) AS distance
FROM jh_time_ref AS tr
CROSS JOIN
(
SELECT
AVG(SIN(RADIANS(EXTRACT(HOUR FROM open_ts) *15)) * 0.5 + 0.5) AS x
,AVG(COS(RADIANS(EXTRACT(HOUR FROM open_ts) *15)) * 0.5 + 0.5) AS y
FROM jh_time_tests2 a
) AS dt
QUALIFY ROW_NUMBER() OVER (ORDER BY distance) = 1