joining table A to table B only once where condition satisfied

UDA
Enthusiast

joining table A to table B only once where condition satisfied

Hello,

Please help where possible. I have been asked to determine if this work can be done in teradata, otherwise the work goes to another team here at work. None of us here at work know of a way but we are not experts. :)

Please read my example below along with my desciption. I have tried to be as clear as possible but will clarify if needed.

I am trying to track the sale history for people. In the example below someone buys a quantity of 100, then buys another 100, then sells 100, then sells 100.

What I want to do is track the 'SELL' history and find out which 'BUY' it relates to. The concept here is first 'SELL' out was the first 'BUY' in.

In this case the first 'SELL' of 100 relates to the first 'BUY' of 100. The next 'SELL' of 100 relates to the second 'BUY' of 100.

Please see the example of the BUY and SELL tables I have.

Keeping it simple for now - If I join TABLE B to TABLE A on CLIENT ID and AMOUNT, I will get 4 records.

What I am attempting to do is produce only two rows - The idea being that once the first SELL record joins to the first BUY, that first BUY record can no longer be found when the second SELL attempts to join on CLIENT ID and AMOUNT. It should find the second BUY record.

Example

I have TABLE A for a single CLIENT ID. The table has two records, four fields in each record.

CLIENT ID = 1, AMOUNT = 100, MOVEMENT = 'BUY', DATE = 01/01/2000
CLIENT ID = 1, AMOUNT = 100, MOVEMENT = 'BUY', DATE = 01/02/2000

I have TABLE B for a single CLIENT ID. The table has two records, four fields in each record.
CLIENT ID = 1, AMOUNT = 100, MOVEMENT = 'SELL', DATE = 01/03/2000
CLIENT ID = 1, AMOUNT = 100, MOVEMENT = 'SELL', DATE = 01/04/2000

Expected output is two rows returning everything from TABLE B and two fields, MOVEMENT and DATE from TABLE A
CLIENT ID = 1, AMOUNT = 100, MOVEMENT = 'SELL', DATE = 01/03/2000, MOVEMENT = 'BUY', DATE = 01/01/2000
CLIENT ID = 1, AMOUNT = 100,MOVEMENT = 'SELL', DATE = 01/04/2000, MOVEMENT = 'BUY', DATE = 01/02/2000

This is a simple example. Often the AMOUNTS are not the same and I have to look at remainders etc but this is a starting point to figure out simply if this is possible.

Thank you in advance for any assistance and direction.

Chris.

16 REPLIES
Enthusiast

Re: joining table A to table B only once where condition satisfied

its simple....rank the sales and buys by date and capture the value of the rank and join on the rank along with item and qty......
Enthusiast

Re: joining table A to table B only once where condition satisfied

Ok that's a way I hadn't considered. I will think about solving it this way. My first thoughts however is that this will not work if I complicate it by saying that the quantity doesn't always line up.

A person can BUY an amount of 100, BUY another 50, then SELL 75, then SELL 30. The SELL of 75 needs to come from the first BUY of 100 leaving a remainder of 25. The next sell needs to use the remainder of 25 along with 25 from the second sell.

If that makes sense :)

Thanks very much for your help though. I will investigate.
Enthusiast

Re: joining table A to table B only once where condition satisfied

Update

I have been told by other people who know more about SQL than me that what I am suggesting is not really possible without using a great deal of time and resources on our system. Teradata doesn't handle my situation as well as other programs.

I was investigating using stored procedures and using a loop to update the table to calculate a remainder but I don't think it will work.

I suppose my final question is this - can you join two tables (table A and table B) together and at the same time update table B so that each time table A joins to table B, the updated table B is used in the join rather than the original table B?

Thanks in advance

Junior Contributor

Re: joining table A to table B only once where condition satisfied

Hi Chris,
this is a kind of FIFO queue, which can be written using some OLAP plus OVERLAPS:

SELECT *
FROM
(
SELECT client_id, amount, date_col, movement,
SUM(amount) OVER (ORDER BY date_col ROWS UNBOUNDED PRECEDING) AS e1, -- end amount
e1 - amount AS s1 -- start amount
FROM tab
WHERE movement = 'BUY'
) AS buy
JOIN -- maybe outer join?
(
SELECT client_id, amount, date_col, movement,
SUM(amount) OVER (ORDER BY date_col ROWS UNBOUNDED PRECEDING) AS e2, -- end amount
e2 - amount AS s2 -- start amount
FROM tab
WHERE movement = 'SELL'
) AS sell
ON buy.client_id = sell.client_id
AND -- check for overlapping ranges
(
(S1 >= S2 AND NOT (S1 >= E2 AND E1 >= E2))
OR
(S2 > S1 AND NOT (S2 >= E1 AND E2 >= E1))
)

So easy to write but might use a lot of CPU depending on the number of rows per client_id.

Dieter

Dieter
Enthusiast

Re: joining table A to table B only once where condition satisfied

Hmm Thanks for the ongoing support on this issue Dieter

I looked at implementing your suggestion (perhaps I am missing something) but I am not sure it holds for remainders. Your idea of FIFO is clever (and something I now may use for future work) but using my remainder example above, if I buy 100 and sell 75 then the next sell of 30 needs to refer to the 25 remainder from the first buy and 5 from the second buy.

As I understand things when I run your query (or any suggestions I have come up with myself) the tables are locked, queried, then released. So the UNBOUNDED window you are creating would write out all results that satisfy the conditions on the locked table. This means only the first row in my answerset would be correct. After that I would have to run the query again and ensure that the results from the first row are taken into consideration when creating the second row.

Am I correct? If so, can I use a stored procedure to create a loop and then somehow insert into a table using loops and possibly even your OLAP suggestion?

Thanks so much for the suggestion. I look forward to your response.

Chris

Junior Contributor

Re: joining table A to table B only once where condition satisfied

Hi Chris,
hopefully this query returns exactly what you want, including reminders.
It creates all requested combinations of sells/buys.
Just try it.

But please add "PARTITON BY client_id" to each of the OLAP-SUMs (i forgot that).

Dieter
Enthusiast

Re: joining table A to table B only once where condition satisfied

Thanks again Dieter.

I have tested the code above and I can see the output and am still trying to work out its application to my problem. Can you explain the following step please as I do not understand what you are looking for here?

===============
AND -- check for overlapping ranges
(
(S1 >= S2 AND NOT (S1 >= E2 AND E1 >= E2))
OR
(S2 > S1 AND NOT (S2 >= E1 AND E2 >= E1))
)
===============

Also in my earlier examples I should have stated my complete problem but I was trying to keep it simple. I shouldn't have though because I don't think the solutions given will work. Sorry for that and sorry for wasting your time. Next time I will post using the complete complex problem rather than assuming you don't need it all to solve the problem hahaha.

I have explained my complete problem below. Again I think this challenge is a little difficult but could be a fun test for the better SQL programmers in here. I am looking at stored procedures using a loop of some sort (but am not experienced at the application of them). I tried using recursive statements but since discovered that you cannot use derived tables etc in these. I couldn't get it working using a recursive statement without them so that idea didn't work.
-------------------

I am trying to track the SALE history for people and calculate the value of each SALE . In the example below someone buys a quantity of 100, then buys another 40, then sells 75, then sells 30. The BUYs and SELLs are in one table but I break them into two tables as I think this is clearer but it doesn't have to be.

I have TABLE A for a single CLIENT ID. The table has two records, 5 fields in each record.
1. CLIENT ID = 1, AMOUNT = 100, MOVEMENT = 'BUY', DATE = 01/01/2000, PRICE = $15
2. CLIENT ID = 1, AMOUNT = 40, MOVEMENT = 'BUY', DATE = 01/02/2000, PRICE = $30

I have TABLE B for a single CLIENT ID. The table has two records, 5 fields in each record.
1. CLIENT ID = 1, AMOUNT = 75, MOVEMENT = 'SELL', DATE = 01/03/2000, PRICE = $20
2. CLIENT ID = 1, AMOUNT = 30, MOVEMENT = 'SELL', DATE = 01/04/2000, PRICE = $55

What I want to do is track the 'SELL' history and find out which 'BUY' it relates to and the value of each SALE. The idea is that my output is DRIVEN by the existance of a SELL record.

Below are the conditions to the creation of an output record

1. To find out which BUY I join to I need to sort the BUYS by PRICE descending. So in my example the first SELL relates to the SECOND BUY as the second BUY has the highest PRICE of all BUYS for this CLIENT ID.

2. If there is a remainder when substracting a SELL amount from a BUY amount then I need the value of the remainder calculated too.

3. Also the SALE date has to occur after a BUY date. (i.e. SALE date >= BUY date)
There are some cases where the transaction history in date order could be BUY, SELL, SELL, BUY, SELL, SELL.
When querying the first SELL it cannot look at the second BUY when producing a result as the BUY date was after the SELL date. Similarly the third SELL cannot look at the second BUY when producing a result as the BUY date was after the SELL date.

EXAMPLE
1. I look at the first SELL (chosen in ascending DATE order) which is an amount = 75

2. I want to find the BUY that the SELL relates to. I do this by selecting the BUY with the highest PRICE which is BUY record number 2 as it has a PRICE = $30 and an amount = 40

3. So I join first SELL (75) to second BUY (40) but the amount of the BUY is only (40) so I still need to work out what to do with the remaining SELL (35) that the second BUY does not cover. I believe that the easiest way to resolve this is to make a new record with the remaining SELL (35) as a new record (with the same CLIENT ID, MOVEMENT, DATE and PRICE information as the SELL record it came from). Unless you can suggest something else!

4. Since there is a remainder I need to resolve this SELL BEFORE I move onto SELL number 2.

5. Thefore for remainding SELL (35) I want to find the BUY that the SELL relates to. I do this by selecting the BUY with the highest PRICE that HAS NOT already been "USED UP" by PREVIOUS SELLS. Given that BUY record number 2 has already been used up, I need to join to the next BUY record (sorted by PRICE descending). This should give me BUY record 1 as it has a PRICE = $15 and an amount = 100.

6. That resolves the remainder for the SELL. Now I have a remainder for the BUY of 65. If there are no other SELLS then I can stop as this work is all driven by the existance of SELLS. However in my example there is still SELL number 2 from my original table above which has not yet been queried.

7. I look at this next SELL (chosen in ascending DATE order) which is an amount = 30.

8. For this SELL (30) I want to find the BUY that the SELL relates to. I do this by selecting the BUY with the highest PRICE that HAS NOT already been "USED UP" by PREVIOUS SELLS. BUY record 1 and 2 have both been used but there is a BUY record with a REMAINDER to use up with an amount = 65. I believe that the easiest way to resolve this is to make a new record with the remaining BUY (65) as a new record (with the same CLIENT ID, MOVEMENT, DATE and PRICE information as the BUY record it came from). Unless you can suggest something else!

9. The SELL(30) therefore joins to BUY REMAINDER (65). This still leaves a BUY REMAINDER (65) but there are no more SELL records for this CLIENT therefore the query can end.

--------

Using my idea of creating a new record whenever there is a remainder, a would expect the following output. This may of course change if you can think of a better method.

Expected output is three rows returning everything from both the BUY and SELL tables.

1. CLIENT ID = 1, SELL_AMOUNT = 75, SELL_MOVEMENT = 'SELL', SELL_DATE = 01/03/2000, SELL_PRICE = $20
BUY_MOVEMENT = 'BUY', BUY_DATE = 01/02/2000, BUY_AMOUNT = 40, BUY_PRICE = $30

2.CLIENT ID = 1, SELL_AMOUNT = 35, SELL_MOVEMENT = 'SELL', SELL_DATE = 01/03/2000, SELL_PRICE = $20
BUY_MOVEMENT = 'BUY', BUY_DATE = 01/01/2000, BUY_AMOUNT = 100, BUY_PRICE = $15

3.CLIENT ID = 1, SELL_AMOUNT = 30, SELL_MOVEMENT = 'SELL', SELL_DATE = 01/04/2000, SELL_PRICE = $55
BUY_MOVEMENT = 'BUY', BUY_DATE = 01/01/2000, BUY_AMOUNT = 65, BUY_PRICE = $15
----------

So that's it. I think I have included all steps now. I really wish to solve this but am at my skill limit. Thanks in advance to anyone who can help.

Thanks

Chris.

Enthusiast

Re: joining table A to table B only once where condition satisfied

For those that are going to attempt it I thought I would give some tables to help test their theories! :)

CREATE VOLATILE TABLE BUY
,NO FALLBACK
,CHECKSUM = DEFAULT
,NO LOG
(
BUY_CLIENTID INTEGER
,BUY_AMOUNT INTEGER
,BUY_MOVEMENT CHAR(4)
,BUY_DATE DATE FORMAT 'YY/MM/DD'
,BUY_PRICE DECIMAL(10,2)
)
PRIMARY INDEX (CLIENTID)
ON COMMIT PRESERVE ROWS
;

INSERT INTO BUY
VALUES
(
1
,100
,'BUY'
,'2000-01-01'
,15.00
)
;

INSERT INTO BUY
VALUES
(
1
,40
,'BUY'
,'2000-02-01'
,30.00
)
;

CREATE VOLATILE TABLE SELL
,NO FALLBACK
,CHECKSUM = DEFAULT
,NO LOG
(
SELL_CLIENTID INTEGER
,SELL_AMOUNT INTEGER
,SELL_MOVEMENT CHAR(4)
,SELL_DATE DATE FORMAT 'YY/MM/DD'
,SELL_PRICE DECIMAL(10,2)
)
PRIMARY INDEX (CLIENTID)
ON COMMIT PRESERVE ROWS
;

INSERT INTO SELL
VALUES
(
1
,75
,'SELL'
,'2000-03-01'
,20.00
)
;

INSERT INTO SELL
VALUES
(
1
,30
,'SELL'
,'2000-04-01'
,55.00
)
;
Junior Contributor

Re: joining table A to table B only once where condition satisfied

Hi Chris,

"Can you explain the following step please as I do not understand what you are looking for here?

===============
AND -- check for overlapping ranges
(
(S1 >= S2 AND NOT (S1 >= E2 AND E1 >= E2))
OR
(S2 > S1 AND NOT (S2 >= E1 AND E2 >= E1))
)
==============="

As i wrote: i check if the previoulsy calculated cumulated start and end amounts of buy and sell overlap.
Run each Derived Table separately and check the output.

"3. Also the SALE date has to occur after a BUY date. (i.e. SALE date >= BUY date)
There are some cases where the transaction history in date order could be BUY, SELL, SELL, BUY, SELL, SELL.
When querying the first SELL it cannot look at the second BUY when producing a result as the BUY date was after the SELL date. Similarly the third SELL cannot look at the second BUY when producing a result as the BUY date was after the SELL date."

My query solves your original post describing a FIFO if you can't "sell" more than you "bought".

"1. To find out which BUY I join to I need to sort the BUYS by PRICE descending."

This could be solved by changing "ORDER BY date_col" in the BUY to "ORDER BY price DESC", but the combination of 1 and 3 can't be done in an easy way as it's no FIFO any more, it's a "HPIBADFO" "highest price in before a date, first out" :-)

So no matter if it's solved using a complex query or a cursor in a Stored Procedure, there's one main question now (mainly regarding performance):
How many rows will be used as input and how many rows per cust_id exist as maximum/average?

Dieter