SQL problem - selecting an earlier date in a join

UDA
Enthusiast

SQL problem - selecting an earlier date in a join

Hi,

Anyone care to help with this please.

TABLE1

ClientId = 1
Date= 20080103

ClientId = 2
Date = 20080103

TABLE2
ClientId = 1
Date = 20080103
Value = $1.20

ClientId = 2
Date = 20080101
Value = $1.30

ClientId = 2
Date = 20080102
Value = $1.40

Now I wish to join TABLE1 to TABLE2 so that I can retrieve the value field in TABLE2 (plus the clientid and date fields from TABLE1)

Thats easy TABLE1.clientid = TABLE2.clientid and TABLE1.date = TABLE2.date. Based on this I will lose clientid 2 as the date fields cannot join but it will return clientid 1 which is correct

What I want to do is join to the TABLE2 based on the clientid and, if the date fields dont join, then return the maximum date possible where the TABLE2.date < TABLE1.date and give me the value for this date instead.

So the expected return is

clientid = 1
date = 20080103
value = $1.20

clientid = 2
date = 20080102
value = $1.40

Can this be done?

Cheers

Chris
15 REPLIES
Enthusiast

Re: SQL problem - selecting an earlier date in a join

try this:

sel T1.clientId, T1.Date as Date, T2.value

from Table1 T1,

Table2 T2

where T1.clientId = T2.clientId

and T1.Date = T2.Date

union

sel T1.clientId,

(case when

T1.Date > T2.date

Then T1.Date

Else T2.date

End) As Date,

T2.value

from Table1 T1,

Table2 T2

where T1.clientId = T2.clientId

let me know if this works for you.....

Regards
Enthusiast

Re: SQL problem - selecting an earlier date in a join

Hi, thanks for the reply

No this won't work.

Based on what you've suggested i would get.

table1.clientid = 1
table1.date = 20080103
table2.value = $1.20

table1.clientid = 2
table1.date = 20080103
value = $1.30

table1.clientid = 2
table1.date = 20080103
table2.value = $1.40

The union and case statement (for client = 2) would return the date from table1 each time.

I want the closest date from table2 and the value.

I expect to get one result back for each clientid/date combination

table1.clientid = 1
table1.date = 20080103
table2.date = 20080103
table2.value = $1.20

table1.clientid = 1
table1.date = 20080103
table2.date = 20080102
table2.value = $1.40

Thanks again for your assistance.

Teradata Employee

Re: SQL problem - selecting an earlier date in a join

hi

This query gives you the result as desired.

Select T1.ClientId,
T2.date,
Value
From Table1 T1
Inner join
Table2 T2
on T1.ClientId=T2.ClientId
where T2.date <= T1.date
qualify(row_number() over(partition by T1.ClientId order by T2.date desc))=1;

here i assumed that you need only 1 record from joining table for each ClientId.
This query yields matching record if availaible else the closest one (only if date is lesser than in Table1).

Let me know for further clarification.

Thanks
Ankit
Enthusiast

Re: SQL problem - selecting an earlier date in a join

Hi

Thanks for that,

I had something similiar already in that

sel
table3.clientid
,table3.max_date
, t2.value

from

(
Select
T1.ClientId
, max(T2.date) max_date
From Table1 T1
Inner join Table2 T2
on T1.ClientId=T2.ClientId
where T2.date <= T1.date
group by 1
)
table3

inner join Table2 T2
on table3.client = T2.clientid
and table3.max_date = T2.date
------------

the main problem with my query was that the number of dates for each client in both tables was in the hundreds of thousands and the number of client was millions. As such the query was spooling out.

In terms of efficiency - whats the difference between yours and mine? Is using a "rank/partition by" more efficient than "group by" when you are talking millions of records?
Teradata Employee

Re: SQL problem - selecting an earlier date in a join

Hi,

You can trace out reason for issue of spool space in your query by doing explain.
It shows that your query uses lot many spool files as it has to do two joins and sorts the data several time.

We can eliminate the same using qualify/row_number.

The major difference among two is that in case of partition by Aggregation is not done while it is done when using group by. So thereby it saves lot of processing and is faster.

Thanks
Ankit
Enthusiast

Re: SQL problem - selecting an earlier date in a join

Hi again

Thanks for the prompt replies on this.

I tried both methods and it spooled out due to the 20 million client/date records joining to 5 million date/value records (give or take).

I will work on breaking it down.

I made this thread to see if there was another way besides using a <= method of joining.

Thanks.

Re: SQL problem - selecting an earlier date in a join

>> CREATE TABLE Table1 (ClientID INT, [DATE] DATETime)

>> CREATE TABLE Table2 (ClientID INT, [DATE] DATETime, Value VarCHAR(10))

If table1 has the following data in it:

ClientID Date

1 2008-11-17 02:12:30.527

2 2008-11-18 02:12:30.527

If table2 has following data in it:

ClientID Date Value

1 2008-11-17 02:12:30.527 $1.20

2 2008-11-18 02:12:30.527 $1.30

3 2008-11-19 02:12:30.527 $1.40

SQL Query to your question:

SELECT T2.ClientID, T1.[Date], T2.Value
FROM Table1 T1
JOIN Table2 T2 ON T1.ClientID=t2.ClientID

********OR*******

SELECT T2.ClientID, T1.[Date], T2.Value
FROM Table1 T1
RIGHT JOIN Table2 T2 ON T1.ClientID=t2.ClientID AND T1.[Date]=T2.[Date]
Junior Contributor

Re: SQL problem - selecting an earlier date in a join

Hi Raghu,
in a Teradata forum you're supposed to post SQL valid/working on Teradata, but this is MS SQL Server syntax ;-)

Dieter
Enthusiast

Re: SQL problem - selecting an earlier date in a join

Hi,

The SQL you've provided will only return records where the clientID and date join. I need it where the date in table2 is equal to or less than the table1 date.

YOU PROVIDED THE FOLLOWING.

If table1 has the following data in it:

ClientID Date
1 2008-11-17 02:12:30.527
2 2008-11-18 02:12:30.527

If table2 has following data in it:
ClientID Date Value
1 2008-11-17 02:12:30.527 $1.20
2 2008-11-18 02:12:30.527 $1.30
3 2008-11-19 02:12:30.527 $1.40

--------------

SQL Query to your question:
SELECT T2.ClientID, T1.[Date], T2.Value
FROM Table1 T1
JOIN Table2 T2 ON T1.ClientID=t2.ClientID

(THIS ONLY RETURNS WHEN THE DATE FIELDS ARE EQUAL. SO YOUR QUERY WON'T WORK FOR ME)

********OR*******

SELECT T2.ClientID, T1.[Date], T2.Value
FROM Table1 T1
RIGHT JOIN Table2 T2 ON T1.ClientID=t2.ClientID AND T1.[Date]=T2.[Date]

(THIS RETURNS WHEN THE DATE FIELDS ARE EQUAL. THE RIGHT JOIN WILL MEAN LOTS OF RECORDS WITH NO DATE VALUES. SO YOUR QUERY WON'T WORK FOR ME.)

Thanks for your help but I don't believe you read my initial question correctly. Correct me if I am wrong though :)

USING YOUR EXAMPLES ABOVE - LET'S LEAVE TABLE1 THE SAME BUT MAKE TABLE2 LIKE BELOW.

TABLE2
ClientID Date Value
1 2008-11-15 12:12:12.527 $1.20
2 2008-11-18 02:12:30.527 $1.30
3 2008-11-19 02:12:30.527 $1.40

Cheers

Chris.