Determine status on a given date

Database
Enthusiast

Determine status on a given date

I have a table (status_moves) which logs each time a record changes status. It looks like this:

[db.stage_moves]







_date account_id old_value new_value
5/21/2014 400 B C
5/31/2014 400 C A
6/1/2014 300 C B
6/4/2014 400 A C
7/1/2014 300 B A

I need to write a query to determine what stage a a particular account_id was in on a particular date. For example,

For the question :"In what status was Account_id 400 on June 21, 2014?"

The Answer should be "C", as the most recent "new_value" prior to June 21 was "C".

How can I write a query to get the end result to look like this:






_date account_id account_status_on_date
6/21/2014 400 C

I believe I need to use max or min function but am unsure of where to start.

8 REPLIES
Enthusiast

Re: Determine status on a given date

This is the format of the [db.stage_moves] table:

_date             account_id        old_value        new_value

5/21/2014          400               B                    C

5/31/2014          400               C                    A

6/01/2014          300               C                    B

6/04/2014          400               A                    C

7/01/2014          300               B                    A

 

Enthusiast

Re: Determine status on a given date

And this is how I would want to see the result to determine status on June 21, 2014:

_date            account_id          account_status_on_date

6/21/2014          400                        C 

Enthusiast

Re: Determine status on a given date

I used the following to recreate your example:

CREATE VOLATILE TABLE vt_stage_moves
, NO FALLBACK
, NO JOURNAL
(
_date DATE,
account_id INTEGER,
oldValue CHAR(1),
newValue CHAR(1)
) PRIMARY INDEX (_date,account_id)
ON COMMIT PRESERVE ROWS;

insert into vt_stage_moves values('2014-05-21',400,'B','C');
insert into vt_stage_moves values('2014-05-31',400,'C','A');
insert into vt_stage_moves values('2014-06-01',300,'C','B');
insert into vt_stage_moves values('2014-06-04',400,'A','C');
insert into vt_stage_moves values('2014-07-01',300,'B','A');

I would personally feel most comfortable doing this with Ordered Analytic Functions.

SELECT '2014-06-21' AS _date,account_id,newValue AS account_status_on_date
FROM vt_stage_moves st
WHERE st._date <= '2014-06-21'
AND st.account_id = 400
QUALIFY ROW_NUMBER() OVER (ORDER BY st._date DESC) = 1;

Now... you'll note that I had to manually hard code the date... this query would work best by creating a macro for it, where you can just use variables.

The basic idea of the query is:

1. Pick all of the dates less than the specified date.

2. Pick the highest date out of the dates from (1).

ROW_NUMBER() simply assigns a number to the row based on the ORDER BY in the OVER clause. QUALIFY is the clause for filtering based on ordered analytic functions (much like you use the HAVING clause for aggregation functions). There are MANY other ways to use ordered analytic functions... this is just one use case.

Enthusiast

Re: Determine status on a given date

Thank you cmedved! This makes a ton of sense. I also agree with your logic regarding the macro.

Moreover, my bigger goal is to create a table which has all dates up to today, the account_id, and the stage on that day.

I'm going to combine the technique you've suggested in conjunction with a join I am doing to a dates table and will post my findings/results. 

Appreciate your help!

Junior Contributor

Re: Determine status on a given date

Instead of a join you might use EXPAND on to create the missing dates, see

http://forums.teradata.com/forum/general/creating-missing-observations

Enthusiast

Re: Determine status on a given date

Thanks Dieter - 

I am interested in using the EXPAND method but am not sure exactly how to apply this as the prior example seems to be focused on dates, and I am trying to populate the most recent value of  a VARCHAR(255) field for 'account_status_on_date' .

Tried doing this but got nowhere so far as it gave me a  "Expected something between the Select keyword and the 'Key' Keyword.

SELECT Key

   _date

   ,account_id

   BEGIN(sod2) as status_on_date

FROM

 (

   SELECT

      tab.*,

      PERIOD(status_on_date, COALESCE(MIN(_date) 

                                   OVER (PARTITION BY Key

                                         ORDER BY _date

                                         ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING), _date + 1)) AS sod

   FROM table as tab

 ) AS dt

EXPAND ON sod AS sod2

Any guidance on how I might change the query?

Junior Contributor

Re: Determine status on a given date

The error message tells exactly what's wrong:

KEY is a keyword, if there's actually a column named key in your table you must use double quotes: "key"

I don't get what you want to achieve, could you add some data and expected result?

Enthusiast

Re: Determine status on a given date

Sorry I was extremely unclear. Please ignore my terrible attempt and hacking that query together.

I'll try to explain better:

I have a table (db.stage_moves) which logs the dates and changes of "account_status" field whenver there is a change. 

Accounts can move from any status to any other status, and they can stay in a status for any amount of time. the table db.stage_moves is set up as such:

                    _date               account_id          old_stage          new_stage

                    5/21/2014          452                    ?                    C

                    5/23/2014          300                    C                    A

                    5/27/2014          300                    A                    B

                    5/29/2014          452                    C                    A

FYI about this table(db.stage_moves):

-the above example just shows two values for "account_id", but there are MANY account_ids in this table.

-There is never more than one stage move per account on a given date.

-every account at least has one record from when it was created and the inital stage was set.

-"new_stage" is effectively the "stage" for the given "account_id" on that date (we're considering the stage on a given date as the end of day stage)

-There is a record every date when there is a move on any account, but if there is no change for an on a specific date, then there will not be a record for that account_id and that date.

&nbsp"

The result I want is a table (db.account_stage_by_date )set up as such:

-Includes a record for all dates from January 1, 2013 to current_date for each account ID in the table. (so long as the date is not prior to the initial stage move for that account, for example, if the intial stage move for account_id 452 was on 5/21/2014, then there should not be a record for that account on dates prior to 5/21/2014, but there should be a record showing stage_on_date for this account_id every date>= 5/21/2014

so i need to fill in missing "_date" fields as well as "stage_on_date" for every account_id present in the table

"stage_on_date" should pull from the most recent "new_stage" value where the stage was last moved.

Here's a snapshot of what the result table(db.account_stages_by_date) would look like:

                    _date               account_id          old_stage          new_stage     stage_on_date     stage_moved_on_date

                    5/21/2014          452                    ?                    C               C                    1

                    5/21/2014          300                    ?                    ?               C                    0

                    5/22/2014          452                    ?                    ?               C                    0

                    5/22/2014          300                    ?                    ?               C                    0

                    5/23/2014          452                    ?                    ?               C                    0

                    5/23/2014          300                    C                    A               A                    1

                    5/24/2014          452                    ?                    ?               C                    0

                    5/24/2014          300                    ?                    ?               A                    0

                    5/25/2014          452                    ?                    ?               C                    0

                    5/25/2014          300                    ?                    ?               A                    0

                    5/26/2014          452                    ?                    ?               C                    0

                    5/26/2014          300                    ?                    ?               A                    0

                    5/27/2014          452                    ?                    ?               C                    0

                    5/27/2014          300                    A                    B               B                    1

                    5/28/2014          452                    ?                    ?               C                    0

                    5/28/2014          300                    ?                    ?               B                    0

                    5/29/2014          452                    C                    A               A                    1

                    5/29/2014          300                    ?                    ?               B                    0

                    5/30/2014          452                    ?                    ?               A                    0

                    5/30/2014          300                    ?                    ?               B                    0

                    5/31/2014          452                    ?                    ?               A                    0

                    5/31/2014          300                    ?                    ?               B                    0

Let me know if this makes sense.