Analytics

turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

04-25-2015
06:34 AM

04-25-2015
06:34 AM

I have code with hours of day. I want to find the average of day but if one time is hour 22 another is 2 then the average with be 12 noon but I want to get 0 midnight as it is the closest time between the two hours. How do I find the average and take midnight wrap around into account???

select purchase,

`CAST(AVG(EXTRACT(HOUR FROM opened_at)) AS INT) as average_open`

from db.purchasetable

group by 1

11 REPLIES

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

04-25-2015
10:26 AM

04-25-2015
10:26 AM

You can't do an average on *time*, but it's allowed for *interval* :-)

Subtracting two times return an interval:

AVG(CAST(opened_at AS TIME(0)) -- extract only the time portion

- TIME '12:00:00' HOUR TO SECOND(0)) -- subtract 12 hours to center at midnight (returns an interval)

+ TIME '00:00:00' -- change it back to a time

If you still want an hour as result you can simply extract it.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

04-25-2015
05:34 PM

04-25-2015
05:34 PM

Hi Dieter,

I've been looking into this more and I have a workaround which involves a time reference table.

There are points in a plot which map out 1-24 hours in a circle.

The x and y values give the mapping for the plot.

This is the insert for that mapping:

insert into sandbox.jh_time_ref values (18,0,0.5);

insert into sandbox.jh_time_ref values (17,0.0238095238095238,0.357142857142857);

insert into sandbox.jh_time_ref values (16,0.0714285714285714,0.238095238095238);

insert into sandbox.jh_time_ref values (15,0.142857142857143,0.142857142857143);

insert into sandbox.jh_time_ref values (14,0.238095238095238,0.0714285714285714);

insert into sandbox.jh_time_ref values (13,0.357142857142857,0.0238095238095238);

insert into sandbox.jh_time_ref values (12,0.5,0);

insert into sandbox.jh_time_ref values (11,0.642857142857143,0.0238095238095238);

insert into sandbox.jh_time_ref values (10,0.761904761904762,0.0714285714285714);

insert into sandbox.jh_time_ref values (9,0.857142857142857,0.142857142857143);

insert into sandbox.jh_time_ref values (8,0.928571428571428,0.238095238095238);

insert into sandbox.jh_time_ref values (7,0.976190476190476,0.357142857142857);

insert into sandbox.jh_time_ref values (6,1,0.5);

insert into sandbox.jh_time_ref values (5,0.976190476190476,0.642857142857143);

insert into sandbox.jh_time_ref values (4,0.928571428571428,0.761904761904762);

insert into sandbox.jh_time_ref values (3,0.857142857142857,0.857142857142857);

insert into sandbox.jh_time_ref values (2,0.761904761904762,0.928571428571428);

insert into sandbox.jh_time_ref values (1,0.642857142857143,0.976190476190476);

insert into sandbox.jh_time_ref values (0,0.5,1);

insert into sandbox.jh_time_ref values (23,0.357142857142857,0.976190476190476);

insert into sandbox.jh_time_ref values (22,0.238095238095238,0.928571428571428);

insert into sandbox.jh_time_ref values (21,0.142857142857143,0.857142857142857);

insert into sandbox.jh_time_ref values (20,0.0714285714285714,0.761904761904762);

insert into sandbox.jh_time_ref values (19,0.0238095238095238,0.642857142857143);

Then I reference this table like so:

select avg(x),avg(y)

from sandbox.jh_time_tests2 a

left join sandbox.jh_time_ref b on b.hours=CAST(EXTRACT(HOUR FROM open_ts) AS INT)

The test table is simply two inserts in a table:

insert into sandbox.jh_time_tests2 values ('2015-04-26 22:00:00','2015-04-26');

insert into sandbox.jh_time_tests2 values ('2015-04-26 02:00:00','2015-04-26');

The result of the averaging gives

Average(x) Average(y)

0.5 0.93

When plotting this additional point on the time reference plot it is exactly where I want it to be.

Although, from the back of this I have another question.

Is there a way of finding the closest point from this new point to the points in the reference table?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

04-26-2015
04:12 AM

04-26-2015
04:12 AM

Hi John,

Getting the closest point is a bit tricky, how do you define "closest"?

What result should be returned for hours 10 and 14 (or 6 and 18), Midnight or noon?

What's your TD release, are the geospatial extensions available?

SELECT TOP 1 tr.hours, NEW ST_GEOMETRY('ST_Point',tr.x,tr.y).ST_Distance(p) AS distance

FROM jh_time_ref AS tr

CROSS JOIN

(

SELECT NEW ST_GEOMETRY('ST_Point',AVG(x),AVG(y)) AS p

FROM jh_time_tests2 a

LEFT JOIN jh_time_ref b ON b.hours=CAST(EXTRACT(HOUR FROM open_ts) AS INT)

) AS dt

ORDER BY distance

Btw, plotting your 24 hours results in sinus/cosinus curves, so you could replace the join to jh_time_ref with some calculations :-)

SELECT TOP 1 tr.hours, NEW ST_GEOMETRY('ST_Point',tr.x,tr.y).ST_Distance(p) AS distance

FROM jh_time_ref AS tr

CROSS JOIN

(

SELECT NEW ST_GEOMETRY('ST_Point',AVG(SIN(RADIANS(EXTRACT(HOUR FROM open_ts) *15)) * 0.5 + 0.5 )

,AVG(COS(RADIANS(EXTRACT(HOUR FROM open_ts) *15)) * 0.5 + 0.5)) AS p

FROM jh_time_tests2 a

) AS dt

ORDER BY distance

With SIN/COS you could include minutes and seconds in your calculation, too...

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

04-26-2015
04:18 PM

04-26-2015
04:18 PM

Hi Dieter,

Your second code snippet is very helpful.

I've added minutes and it's coming back with the expected results.

In regards to the distance, I didn't explain that well.

I'm not looking for the distance between two given points. I'm looking for the point at which a line intersects with the circle given the start point as being the center of the circle and the second point being the average or all ther other points.

So, if I have a timestamp with 22:00:00 (-.5,.87), a second with 02:00:00(.5,.87). Then the midpoint will be (0,.87). However, this does not plot with a point on the circle. In this example I know the point should be (0,1), but is there a formula which gives this intersection?

Anyway, I'm not expecting you to answer this, as your points above have been very helpful!

Thanks a lot Dieter, you're a legend!

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

04-28-2015
02:39 PM

04-28-2015
02:39 PM

Hi John,

the distance between two points can easily be calculated using Pythagoras' theorem, no need for geospatial :-)

SELECT

tr.hours

,SQRT(((tr.x-dt.x)**2) + ((tr.y-dt.y)**2)) AS distance

FROM jh_time_ref AS tr

CROSS JOIN

(

SELECT

AVG(SIN(RADIANS(EXTRACT(HOUR FROM open_ts) *15)) * 0.5 + 0.5) AS x

,AVG(COS(RADIANS(EXTRACT(HOUR FROM open_ts) *15)) * 0.5 + 0.5) AS y

FROM jh_time_tests2 a

) AS dt

QUALIFY ROW_NUMBER() OVER (ORDER BY distance) = 1

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

3 weeks ago

3 weeks ago

Dieter,

I have the same original question and I tried your code but it's not giving me correct answer. When I have two hour values as 19 and 5, it is giving me varied values,19 and 20 (for two different users) as average time of day. Whereas I am expecting to get 0 for midnight.

Highlighted
##
##### Re: Find average time of day

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

3 weeks ago

3 weeks ago

Times without time zone follow strange rules :-)

The sessions of those users use different time zones, e.g. -5 to get 19:00.

It seems to work when you apply UTC:

Avg((opened_at AT 0) - TIME '12:00:00+00:00' HOUR TO SECOND(0) ) + TIME '00:00:00'

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

3 weeks ago

3 weeks ago

Hi Dieter,

Thanks for your reply. Applying the formula on UTC timestamp column. Is there a formula that will calculate the avg of PT timestamp values? Here is my premises- want to have open times in PT and avg open hour of day in PT. What I did so far- converted UT open times to PT (accounted for daylight savings), extracted hour value from PT timestamp, avg of hour values and convert to integer. But, the problem is average of 19 hour and 5 hour is 12 instead of 0 (24 hr format).

Limitations of calculating avg in UTC- gives the average in a 12 hr format (I want 24 hr) and to convert the UT avg value in PT, due to daylight savings, we won't know whether to do -7 or -8 from avg value. So, its more complicated to solve it this way.

Thanks!

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

3 weeks ago

3 weeks ago

UTC doesn't use 12h format, it's always 24h (only when casting to a string there's AM/PM)

What's the exact data type of that column, TIME or TIMESTAMP? Including Time zone?

Can you show some actual data and expected result?