Stored Procedure or Sql

UDA

Stored Procedure or Sql

I have one table in Teradata and I need to process all records from this table and create a new table.The business requirement is such that I need to sort the records based on two columns(for the ease of explanation,consider this as key) and for the same key combination I need to pick few columns from first record,few columns from the last record and and also aggregrate few columns from first to last record (aggregate in between records) and then write this as a single record in output file.

This is best illustrated with the help of an example

Card Origin Destination Origin Destination Amount
Number Time Time ID ID

ABC Time1 Time2 X12 Y78 10
ABC Time3 Time4 X34 Y65 20
ABC Time5 Time6 X56 Y98 30

DEF Time1 Time2 D12 E45 50
DEF Time3 Time4 D34 E78 20

GHI Time1 Time2 D45 E78 10
GHI Time3 Time4 X12 E34 30
GHI Time5 Time6 X34 E12 40
GHI Time7 Time8 X56 E45 50

Business Rule :- Difference between Time2 and Time3 should be less than 30 minutes and Difference between Time1 and Time6 should be less than 3 hours

The output table should have below details
For Card ABC, Origin Time.Origin ID of First record is picked, Destination time, Destination ID of last record is picked and Amount field is aggregated

Card Origin Destination Origin Destination Amount
Number Time Time ID ID

ABC Time1 Time6 X12 Y98 60
DEF Time1 Time4 D12 E78 70
GHI Time1 Time8 D45 E45 130

I am planning to write a stored procedure, but since Stored procedure will be sequential read and single record processing at a time, it will be very time consuming and would also not use the teradata parallelism.

Can this be done in a normal sql query or is there any better way of doing this other than writing a stored procedure?
2 REPLIES
Enthusiast

Re: Stored Procedure or Sql

Definitely SQL. Only use a stored procedure when there is some form of looping required and the code within the loop is based on changes in previous records.
If you are going to do this in a proc, you will have to loop through the whole file.

Your rules are not really clear (like what happens then there are more or less than 6 records and what if the difference between Time 2 and Time 3 is more than 30 mins,...)
but I suggest it will be something like:

Create Volatile Table CardTransits
(Card_Number Char(10) Not Null
, RowNumUp Smallint Not Null
, RowNumDn Smallint Not Null
, Origin_Time TimeStamp(0) Not Null
, Dest_Time TimeStamp(0) Not Null
, Origin_Id Char(3) Not Null
, Dest_Id Char(3) Not Null
, Total_Amt Integer Not Null)
Primary Index(Card_Number)
On Commit Preserve Rows
;

-- Number all the rows, get the total amount

Insert Into Card_Transits
(Card_Number
, RowNumUp
, RowNumDn
, Origin_Time
, Dest_Time
, Origin_Id
, Dest_Id
, Total_Amt )
Select
D1.Card_Number
, Row_Number() Over (Partition By Tbl.Card_Number
Order By Tbl.Origin_Time ASC)
, Row_Number() Over (Partition By Tbl.Card_Number
Order By Tbl.Origin_Time Desc)
, Tbl.Origin_Time
, Tbl.Dest_Time
, Tbl.Origin_Id
, Tbl.Dest_Id
, D1.Tot_Amt

From Tbl
Inner Join
(Select
Card_Number
, Sum(Amount) as Tot_Amt
From Tbl
Group By 1) D1
On Tbl.Card_Number = D1.Card_Number
;

-- identify the cards which meet time criteria

Create Volatile Table GoodCards
As (
Select Fst.Card_Number
From Card_Transits T1
Inner Join Card_Transits T2
On T1.Card_Number = T2.Card_Number
And T1.RowNumUp = 1
And T2.RowNumUp = 2
And T2.Origin_Time - T1.Dest_Time LT Interval(2) '30' Minute
-- Less than 30 minutes between first and second
Inner Join Card_Transits T3
On T1.Card_Number = T3.Card_Number
And T3.RowNumDn = 1
And T3.Dest_Time - T1.Origin_Time LT Interval(1) '3' Hour
)
WITH DATA
Unique Primary Index(Card_Number)
On COMMIT PRESERVE ROWS
;

Select
T1.Card_Number
, Max(CASE WHEN T1.RowNumUp = 1
THEN Origin_Time END) As Origin_Time
, Max(CASE WHEN T1.RowNumDn = 1
THEN Dest_Time END) As Dest_Time
, Max(CASE WHEN T1.RowNumUp = 1
THEN Origin_Id END) As Origin_Id
, Max(CASE WHEN T1.RowNumDn = 1
THEN Dest_Id END) As Dest_Id
, Max(Total_Amt)

From GoodCards T0
INNER JOIN Card_Transits T1
ON T0.Card_Number = T1.Card_Number
Group By 1
;

You can do it all in a single query, but it is more understandable this way, and support will hate you far less when there is an overnight problem.
If you want to put it in a view, get it working like this, then use derived tables instead of volatile.
Enthusiast

Re: Stored Procedure or Sql

Hi,
You can achieve this by using what is called a Partitioned Sum Window function. Use the the grouping key as "ABC", "..." etc and use the MIN and MAX functions to get your first and last times and use the SUM function to get your total amount sum per key. You can go through the documentation on how to write Partition Window functions using SQL. This would be your best bet and the entire result will be derived from one full table scan for all your data. The only constraint you may see is the Spool Space allocated to your user which may cause issues. Otherwise the query performance will be quite good too.