Find average time of day

Analytics
Enthusiast

Find average time of day

I have code with hours of day. I want to find the average of day but if one time is hour 22 another is 2 then the average with be 12 noon but I want to get 0 midnight as it is the closest time between the two hours. How do I find the average and take midnight wrap around into account???

select purchase,

CAST(AVG(EXTRACT(HOUR FROM opened_at)) AS INT) as average_open
from db.purchasetable
group by 1

Tags (1)
11 REPLIES
Junior Contributor

Re: Find average time of day

You can't do an average on time, but it's allowed for interval :-)

Subtracting two times return an interval:

AVG(CAST(opened_at AS TIME(0))            -- extract only the time portion
- TIME '12:00:00' HOUR TO SECOND(0)) -- subtract 12 hours to center at midnight (returns an interval)
+ TIME '00:00:00' -- change it back to a time

If you still want an hour as result you can simply extract it.

Enthusiast

Re: Find average time of day

Hi Dieter,

I've been looking into this more and I have a workaround which involves a time reference table.

There are points in a plot which map out 1-24 hours in a circle.

The x and y values give the mapping for the plot.

This is the insert for that mapping:

insert into sandbox.jh_time_ref values (18,0,0.5);
insert into sandbox.jh_time_ref values (17,0.0238095238095238,0.357142857142857);
insert into sandbox.jh_time_ref values (16,0.0714285714285714,0.238095238095238);
insert into sandbox.jh_time_ref values (15,0.142857142857143,0.142857142857143);
insert into sandbox.jh_time_ref values (14,0.238095238095238,0.0714285714285714);
insert into sandbox.jh_time_ref values (13,0.357142857142857,0.0238095238095238);
insert into sandbox.jh_time_ref values (12,0.5,0);
insert into sandbox.jh_time_ref values (11,0.642857142857143,0.0238095238095238);
insert into sandbox.jh_time_ref values (10,0.761904761904762,0.0714285714285714);
insert into sandbox.jh_time_ref values (9,0.857142857142857,0.142857142857143);
insert into sandbox.jh_time_ref values (8,0.928571428571428,0.238095238095238);
insert into sandbox.jh_time_ref values (7,0.976190476190476,0.357142857142857);
insert into sandbox.jh_time_ref values (6,1,0.5);
insert into sandbox.jh_time_ref values (5,0.976190476190476,0.642857142857143);
insert into sandbox.jh_time_ref values (4,0.928571428571428,0.761904761904762);
insert into sandbox.jh_time_ref values (3,0.857142857142857,0.857142857142857);
insert into sandbox.jh_time_ref values (2,0.761904761904762,0.928571428571428);
insert into sandbox.jh_time_ref values (1,0.642857142857143,0.976190476190476);
insert into sandbox.jh_time_ref values (0,0.5,1);
insert into sandbox.jh_time_ref values (23,0.357142857142857,0.976190476190476);
insert into sandbox.jh_time_ref values (22,0.238095238095238,0.928571428571428);
insert into sandbox.jh_time_ref values (21,0.142857142857143,0.857142857142857);
insert into sandbox.jh_time_ref values (20,0.0714285714285714,0.761904761904762);
insert into sandbox.jh_time_ref values (19,0.0238095238095238,0.642857142857143);

Then I reference this table like so:

select avg(x),avg(y)
from sandbox.jh_time_tests2 a
left join sandbox.jh_time_ref b on b.hours=CAST(EXTRACT(HOUR FROM open_ts) AS INT)

The test table is simply two inserts in a table:

insert into sandbox.jh_time_tests2 values ('2015-04-26 22:00:00','2015-04-26');
insert into sandbox.jh_time_tests2 values ('2015-04-26 02:00:00','2015-04-26');

The result of the averaging gives 

Average(x) Average(y)

0.5            0.93

When plotting this additional point on the time reference plot it is exactly where I want it to be.

Although, from the back of this I have another question.

Is there a way of finding the closest point from this new point to the points in the reference table?


Junior Contributor

Re: Find average time of day

Hi John,

Getting the closest point is a bit tricky, how do you define "closest"?

What result should be returned for hours 10 and 14 (or 6 and 18), Midnight or noon?

What's your TD release, are the geospatial extensions available?

SELECT TOP 1 tr.hours, NEW ST_GEOMETRY('ST_Point',tr.x,tr.y).ST_Distance(p) AS distance
FROM jh_time_ref AS tr
CROSS JOIN
(
SELECT NEW ST_GEOMETRY('ST_Point',AVG(x),AVG(y)) AS p
FROM jh_time_tests2 a
LEFT JOIN jh_time_ref b ON b.hours=CAST(EXTRACT(HOUR FROM open_ts) AS INT)
) AS dt
ORDER BY distance

Btw, plotting your 24 hours results in sinus/cosinus curves, so you could replace the join to jh_time_ref with some calculations :-)

SELECT TOP 1 tr.hours, NEW ST_GEOMETRY('ST_Point',tr.x,tr.y).ST_Distance(p) AS distance
FROM jh_time_ref AS tr
CROSS JOIN
(
SELECT NEW ST_GEOMETRY('ST_Point',AVG(SIN(RADIANS(EXTRACT(HOUR FROM open_ts) *15)) * 0.5 + 0.5 )
,AVG(COS(RADIANS(EXTRACT(HOUR FROM open_ts) *15)) * 0.5 + 0.5)) AS p
FROM jh_time_tests2 a
) AS dt
ORDER BY distance

With SIN/COS you could include minutes and seconds in your calculation, too...


Enthusiast

Re: Find average time of day

Hi Dieter,

Your second code snippet is very helpful.

I've added minutes and it's coming back with the expected results.

In regards to the distance, I didn't explain that well.

I'm not looking for the distance between two given points. I'm looking for the point at which a line intersects with the circle given the start point as being the center of the circle and the second point being the average or all ther other points.

So, if I have a timestamp with 22:00:00 (-.5,.87), a second with 02:00:00(.5,.87). Then the midpoint will be (0,.87). However, this does not plot with a point on the circle. In this example I know the point should be (0,1), but is there a formula which gives this intersection?

Anyway, I'm not expecting you to answer this, as your points above have been very helpful!

Thanks a lot Dieter, you're a legend! 

Junior Contributor

Re: Find average time of day

Hi John,

the distance between two points can easily be calculated using Pythagoras' theorem, no need for geospatial :-)

SELECT
tr.hours
,SQRT(((tr.x-dt.x)**2) + ((tr.y-dt.y)**2)) AS distance
FROM jh_time_ref AS tr
CROSS JOIN
(
SELECT
AVG(SIN(RADIANS(EXTRACT(HOUR FROM open_ts) *15)) * 0.5 + 0.5) AS x
,AVG(COS(RADIANS(EXTRACT(HOUR FROM open_ts) *15)) * 0.5 + 0.5) AS y
FROM jh_time_tests2 a
) AS dt
QUALIFY ROW_NUMBER() OVER (ORDER BY distance) = 1

Re: Find average time of day

Dieter,

I have the same original question and I tried your code but it's not giving me correct answer. When I have two hour values as 19 and 5, it is giving me varied values,19 and 20 (for two different users) as average time of day. Whereas I am expecting to get 0 for midnight.

 

 

Highlighted
Junior Contributor

Re: Find average time of day

Times without time zone follow strange rules :-)

 

The sessions of those users use different time zones, e.g. -5 to get 19:00.

 

It seems to work when you apply UTC:

Avg((opened_at  AT 0) - TIME '12:00:00+00:00' HOUR TO SECOND(0)
   ) + TIME '00:00:00'

 

Re: Find average time of day

Hi Dieter,

Thanks for your reply. Applying the formula on UTC timestamp column. Is there a formula that will calculate the avg of PT timestamp values? Here is my premises- want to have open times in PT and avg open hour of day in PT. What I did so far- converted UT open times to PT (accounted for daylight savings), extracted hour value from PT timestamp, avg of hour values and convert to integer. But, the problem is average of 19 hour and 5 hour is 12 instead of 0 (24 hr format). 

 

Limitations of calculating avg in UTC- gives the average in a 12 hr format (I want 24 hr) and to convert the UT avg value in PT, due to daylight savings, we won't know whether to do -7 or -8 from avg value. So, its more complicated to solve it this way.

Thanks!

Junior Contributor

Re: Find average time of day

UTC doesn't use 12h format, it's always 24h (only when casting to a string there's AM/PM)

 

What's the exact data type of that column, TIME or TIMESTAMP? Including Time zone?

 

Can you show some actual data and expected result?