How to identify in-between steps based on timestamp

Analytics
Enthusiast

How to identify in-between steps based on timestamp

We have a list of steps with their 'processed datetime', so we can figure our their process sequence.

 

Now we want to indentify steps that start with '3010' only ('3010%') with all the steps prior to them, which is not difficult, just compare 'processd datatime'.  but how about if we want to identify the in-between prior steps for '3010' steps using teradata ql?

 

for example, assume '3010-1 step' is before '3010-2 step' (by comparing the process sequence), and we identify there are 6 steps prior to '3010-1 step', and 10 steps prior to '3010-2 step' including '3010-1 step' and its 6 prior steps. now we just want to indentify the steps between '3010-1 step' and '3010-2 step' and consider them as prior steps for '3010-2 step', so there are only 3 steps prior to '3010-2 step' (10-7=3). how can we achieve this thru queries? thanks 

 


Accepted Solutions
Junior Contributor

Re: How to identify in-between steps based on timestamp

There's no Group By for OLAP-functions, you need to add Partition By:

Over (PARTITION BY lotid ORDER BY RUNCOMPLETEDATETIME

1 ACCEPTED SOLUTION
12 REPLIES
Teradata Employee

Re: How to identify in-between steps based on timestamp

If you are just counting, you could use RESET WHEN. For example:

 

SELECT StepID, ProcessedDatetime, PriorStepsCount FROM

( SELECT StepID,

COUNT(*) OVER (ORDER BY ProcessedDatetime

RESET WHEN MAX(StepID) OVER (ORDER BY ProcessedDatetime ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) LIKE '3010%'

ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) as PriorStepsCount

) X

WHERE StepID LIKE '3010%'

Enthusiast

Re: How to identify in-between steps based on timestamp

thanks for reply Fred,

 

the intend is not for counting, but displaying in a certain structure. like:

step          Time sequence

1101-1         1

1102-2         2

1103-2         3

3010-1         4

2011-3         5

2011-2         6

3010-2         7

 

we want to display it as below

3010step  prior steps

3010-1    1101-1

3010-1    1102-2 

3010-1    1103-2 

3010-2    3010-1 

3010-2    2011-3

3010-2    2011-2 

how can we achieve this?

Enthusiast

Re: How to identify in-between steps based on timestamp

or maybe we can display it as below structure, just add 1 column for order based on original table, it also help:

step          Time sequence  order

1101-1         1                     1

1102-2         2                     1

1103-2         3                     1

3010-1         4                     2

2011-3         5                     3

2011-2         6                     3

3010-2         7                     4

 

any ideas? thanks very much

Enthusiast

Re: How to identify in-between steps based on timestamp

so basically, the question would be, how to based on the step sequece, add one column, and assign value 1 at start, when meet '3010%' step, then plus 1. kindly advice, thanks

         

Enthusiast

Re: How to identify in-between steps based on timestamp

so basically, the question would be, how to based on the step sequece, add one column, and assign value 1 at start, when meet '3010%' step, then plus 1. kindly advice, thanks

 

this is the raw queries for orgiginal data format:

 

select distinct a.STEPNAME, a.RUNCOMPLETEDATETIME from a

order by  a.STEPNAME, a.RUNCOMPLETEDATETIME;

 

we want to see the outputs like this format:

 

stepname     order

1101-1         1

1102-2         1

1103-2         1                  

3010-1         2                   

2011-3         3                   

2011-2         3                   

3010-2         4          

Junior Contributor

Re: How to identify in-between steps based on timestamp

Seems you need First/LastValue:

 

SELECT 
   -- assign the following 3010 step name to each row = assign the previous steps to each 3010 step
   First_Value(CASE WHEN a.STEPNAME LIKE '3010%' THEN stepname END IGNORE NULLS) -- only 3010 steps
   Over (ORDER BY RUNCOMPLETEDATETIME
         ROWS BETWEEN 1 Following AND Unbounded Following) AS "3010steps"        -- following 3010 step name
    ,a.STEPNAME, a.RUNCOMPLETEDATETIME
FROM a
QUALIFY "3010steps" IS NOT NULL  -- ?? don't kow if this is needed/wanted
ORDER BY a.RUNCOMPLETEDATETIME

 

 

Enthusiast

Re: How to identify in-between steps based on timestamp

thanks! this is for one of the proposal, and i confirm it works well. 

 

for the other output format, based on the step sequece, add one additional column named by order_number, and assign value 1 at start, when meet '3010%' step, then plus 1, for the next non-'3010%' steps, plus 1 and remain no change until meet the second '3010%' step, plus 1 again, till the end, . outputs like this format below, kindly advice, thank you

 

stepname     order number

1101-1         1

1102-2         1

1103-2         1                  

3010-1         2                   

2011-3         3                   

2011-2         3                   

3010-2         4     

Highlighted
Junior Contributor

Re: How to identify in-between steps based on timestamp

That's more complicated, you need to flag the first row of a group:

SELECT STEPNAME,
-- calculate the group number using a Cumulative Sum Sum(flag) Over (ORDER BY RUNCOMPLETEDATETIME ROWS Unbounded Preceding) +1 FROM ( SELECT a.STEPNAME, a.RUNCOMPLETEDATETIME,
-- flag the 1st row per group CASE WHEN Min(CASE WHEN stepname LIKE '3010%' THEN stepname end) Over (ORDER BY RUNCOMPLETEDATETIME ROWS 1 Preceding ) IS NOT NULL THEN 1 ELSE 0 END AS flag FROM a ) AS dt

 

 

Enthusiast

Re: How to identify in-between steps based on timestamp

that looks cool, and to confirm it works for me in some way, but another problem occur which is when I try to add group to all column, it will tell me  ' Failed [5481 : HY000] Ordered Analytical Functions not allowed in GROUP BY Clause.'  below is my full queries. you can see for the first column 'ID', each ID may have different sequece of steps, so I have to perform the group by to assign 'order_number' individualy for ID. any ideas how to fix it? should be a small change anywhere? thanks a lot!

 

with v1 as
(select distinct a.ID,  a.STEPNAME, a.RUNCOMPLETEDATETIME from a
group by lotid, stepname, RUNCOMPLETEDATETIME)

 

SELECT distinct lotid, STEPNAME, RUNCOMPLETEDATETIME,
Sum(flag)
Over ( ORDER BY lotid, RUNCOMPLETEDATETIME
ROWS Unbounded Preceding) +1 as order_number
FROM
(
SELECT lotid, STEPNAME, RUNCOMPLETEDATETIME,
CASE WHEN Min(CASE WHEN stepname LIKE any ('3010%','3001%','3000%','3120%') THEN stepname end)
Over (ORDER BY lotid, RUNCOMPLETEDATETIME
ROWS 1 Preceding ) IS NOT NULL
THEN 1
ELSE 0
END AS flag
FROM v1
) AS dt
group by 1,2,3,4;