First time posting on the forum but a long time searcher!! I'm having some problems writing efficient code to complete what I think should be a relatively simple scenario. I have a table with a Name and then a Timestamp column as a Contact_DateTime. For every occurence of a Name and a Contact Date I want to a third column showing the number of Contact_DateTime appears within 7 days of the original Contact_DateTime excluding the original Contact_DateTime
Sample Contact table
Desired Query Output
My first attempt of the code is below which does works but then I have the problem of aggregating the data to get the desired output. If I waited the estimated time to run I thinK I would have aged 2 years!
Select Contact_ALL.Name, Contact_ALL.DateTime, Contact_7.DateTime as OtherDateTime, (Contact_ALL.DateTime + Interval '7' Day) as DateTime7,
When Contact_7.Contact_DateTime= Contact_ALL.Contact_DateTime then 0
When Contact_7.Contact_DateTime Between Contact_ALL.Contact_DateTime and DateTime7 Then 1
0 End as Contact_7Days
From Database.Contact Contact_ALL
Inner Join (Select Contact.Name, Contact.Contact_DateTime from Database.Contact) Contact_7
Order by 1,2,3
Apologies if I'm not following any structure but I'm happy to listen to suggestions and improvements. I'll try respond in the correct manner too.
Thanks in advance
some modification of your original query should give the correct output, but the efficiency depends on the number of rows per name, it's fast if the number is low:
SELECT Contact_ALL.Name, Contact_ALL.DateTime,
FROM Contact Contact_ALL
LEFT JOIN Contact AS Contact_7
ON Contact_ALL.Name= Contact_7.Name
AND Contact_7.DateTime < Contact_ALL.DateTime
AND Contact_7.DateTime >= Contact_ALL.DateTime - INTERVAL '7'DAY
GROUP BY 1,2
ORDER BY 1,2
Could you provide more details?
Average/maximum number of rows per user?
Do you need exact 7 days based on the timestamp or would it be enough to do it based on on a cast to date?
What is the range of dates within that table?
Thank you for your reply.
The table I'm using contains over 7 million names and contact dates and times for all contact between April 2012 and May 2012. Average rows per user is 3 and the maximum number of rows within the April 2012 and May 2012 data is in to the thousands. I may have to link the Name to the Customer table in order because I'm only using a partial customer base but that still equates to nearly 4 million Names which should reduce this the maximum number of rows.
I would use 7 days based on the timestamp as it's used as a measure to drive efficiency so the accuracy is important.
My intention is to update each contact record with this count as a new column but it needs to be a repeatable exercise so I may look to create a new table.
I'll run the code you suggested and see how I get on, I appreciate your time on my query.