Blog

The best minds from Teradata, our partners, and customers blog about whatever takes their fancy.

turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

07-10-2011
04:17 PM

- Subscribe to RSS Feed
- Mark as New
- Mark as Read
- Bookmark
- Subscribe
- Email to a Friend
- Printer Friendly Page
- Report Inappropriate Content

07-10-2011
04:17 PM

When Flat SQL looping logic involves complex processing of data elements from multiple tables, the loops can almost always be transformed into Set SQL with derived tables. By eliminating looping logic, the performance benefits of parallel processing can be exploited.

Derived tables provide an easy technique to copy logic that is expressed in a cursor or a simple select statement in Flat SQL and paste it into Set SQL as part of the set logic. Expressions like this need to be modified slightly to enable joining of the derived tables to other tables in the Set SQL statement. Sometimes a derived table created in this way is so simple that it can be eliminated from the Set SQL by rewriting it as a join to a base table. In other cases, as the following example demonstrates, the derived table performs a necessary function that cannot be streamlined further.

This is an example of a PL/SQL procedure that uses data from more than one table to make decisions and compute an updated quantity for certain rows in a target table. Other rows in the target table are deleted.

`DECLARE`

CURSOR C1 IS

SELECT RowID, LLID, QtyOrd, DistQtyOrd, ShipQty, Unit_Price, DistID

FROM UNINVOICED

WHERE PER_KEY = :PerKey;

DIST_AMT NUMBER(22,5);

QTY_REC NUMBER(22,5);

AP_AMT NUMBER(22,5);

BEGIN

FOR UN_LIST IN C1 LOOP

` -- SUM Invoice Dist`

SELECT SUM(DECODE( SIGN(AcctgDt - :END_DATE)

,1,0,

NVL(QtyInvoiced,0)))

INTO AP_AMT

FROM INVOICE_DIST

WHERE DistID = UN_LIST.DistID;

` -- SUM RCV Txn`

SELECT SUM(NVL(RCT.DocQty,0))

INTO QTY_REC

FROM RCV_TRANSACTIONS RCT

WHERE RCT.TXN_TYPE IN ('RECEIVE','MATCH')

AND RCT.TXN_DATE < :END_DATE + 1

AND PO_LLID = UN_LIST.LLID;

` TOT_QTY := QTY_REC;`

` IF (TOT_QTY > AP_AMT) THEN`

IF (UN_LIST.ShipQty = 0) THEN

DIST_AMT := UN_LIST.Unit_Price

* ( (TOT_QTY - AP_AMT)

* (UN_LIST.DistQtyOrd

/ UN_LIST.QtyOrd) );

IF (DIST_AMT != 0) THEN

UPDATE UNINVOICED

SET func_curr_dist_amount = DIST_AMT

WHERE RowID = UN_LIST.RowID;

ELSE

DELETE FROM UNINVOICED

WHERE RowID = UN_LIST.RowID;

END IF;

ELSE

DELETE FROM UNINVOICED

WHERE RowID = UN_LIST.RowID;

END IF;

END IF;

` END LOOP;`

END;

To run this procedure efficiently on Teradata, the procedural logic has to be converted to Set SQL. Begin by locating the first insert, update or delete statement. At the end of this procedure is an UPDATE UNINVOICED statement. Following are two Delete statements, but the Update statement has to be handled first.

Note that there is only one cursor in this procedure, and that cursor selects from the table that is being Updated and Deleted. The normal approach for translating Insert statements is to copy and paste all necessary cursors into derived table expressions in the Set SQL. For Update and Delete statements it is not usually necessary to copy the cursor on the table being updated, because Flat SQL tends to use that cursor only as a placeholder for doing a single-row update or delete. The goal in Set SQL is to avoid single-row updates and deletes.

Under certain conditions the DIST_AMT column in the UNINVOICED table is updated with values that come from three sources: the UNINVOICED table itself (by way of the Cursor C1) and sums from two other tables. In the Set SQL, these two sums can be expressed as derived tables. Each of these sums is computed within the loop on the Cursor C1, based on values that are in the current row of the UNINVOICED table. Therefore, we have a way to join UNINVOICED to the two derived tables in the Set SQL.

The first derived table summarizes INVOICE_DIST:

` (SELECT SUM(DECODE( SIGN(AcctgDt - :END_DATE)`

,1,0,

NVL(QtyInvoiced,0)))

INTO AP_AMT

FROM INVOICE_DIST

WHERE DistID = UN_LIST.DistID) SumInvoice

Note that the derived table needs an alias. A term like SumInvoice helps to remind the reader what this does. The DECODE expression can be simplified with a CASE operation as follows:

` (SELECT SUM ( ( case`

when AcctgDt > :END_DATE then 0

else ( coalesce ( QtyInvoiced ,0 ) )

) end )

INTO AP_AMT

FROM INVOICE_DIST

WHERE DistID = UN_LIST.DistID) SumInvoice

There are a couple of things in this expression that need to be modified before it can be used as a derived table in Set SQL. First, a derived table cannot select INTO a host variable; rather, the result of this sum is going to be used in a higher expression, so it must be aliased to something that can be referenced outside the derived table. Second, the DistID in "UN_LIST" is not available for a predicate within the derived table; therefore, DistID must be added to the select list in this derived table so it can be used for a join in the outer expression.

A useable form of the derived table would look like this:

`(SELECT SUM ( ( case`

when AcctgDt > :END_DATE then 0

else ( coalesce ( QtyInvoiced, 0 ) )

) end )

,DIST_ID

GROUP BY DIST_ID

FROM INVOICE_DIST ) SumInvoice ( AP_AMT, DIST_ID )

Now AP_AMT is available for use in the outer expression. The other calculated value needed for the Update statement is TOT_QTY. TOT_QTY comes from QTY_REC, which comes form another summarization:

` -- SUM RCV Txn`

SELECT SUM(NVL(RCT.DocQty,0))

INTO QTY_REC

FROM RCV_TRANSACTIONS RCT

WHERE RCT.TXN_TYPE IN ('RECEIVE','MATCH')

AND RCT.TXN_DATE < :END_DATE + 1

AND PO_LLID = UN_LIST.LLID;

As with AP_AMT, this can be expressed as a derived table in this form :

`(SELECT SUM ( ( coalesce(RCT.DocQty, 0) ) )`

,PO_LLID

GROUP BY PO_LLID

FROM RCV_TRANSACTIONS

WHERE RCT.TXN_TYPE IN ( 'RECEIVE' , 'MATCH' )

AND RCT.TXN_DATE < :END_DATE + 1

) RcvTxn ( QTY_REC, PO_LLID )

Here PO_LLID will be used as a column to join to UNINVOICED.

Putting the derived tables together and adding the join predicates to the Update statement produces this:

`UPDATE UNINVOICED UN_LIST`

FROM

(

SELECT SUM ( ( case

when AcctgDt > :END_DATE then 0

else ( coalesce ( QtyInvoiced, 0 ) )

) end )

,DIST_ID

GROUP BY DIST_ID

FROM INVOICE_DIST

) SumInvoice ( AP_AMT, DIST_ID )

`,(`

SELECT SUM ( ( coalesce(RCT.DocQty, 0) ) )

,PO_LLID

GROUP BY PO_LLID

FROM RCV_TRANSACTIONS

WHERE RCT.TXN_TYPE IN ( 'RECEIVE' , 'MATCH' )

AND RCT.TXN_DATE < :END_DATE + 1

) RcvTxn ( QTY_REC, PO_LLID )

`WHERE SumInvoice.DistID = UN_LIST.DistID`

AND RcvTxn.PO_LLID = UN_LIST.LLID

There is a condition on the target table which is expressed in the cursor in original flat logic: WHERE PER_KEY = :PerKey - this must be added to the final predicate. There are also conditions under which the update is to take place, and these too must be expressed in the predicate.

Finally, note that there is a complex computation for func_curr_dist_amount. This expression must be copied into the SET clause of the Update statement.

Putting all this together produces:

`UPDATE UNINVOICED UN_LIST`

FROM

(

SELECT SUM ( ( case

when AcctgDt > :END_DATE then 0

else ( coalesce ( QtyInvoiced, 0 ) )

) end )

,DIST_ID

GROUP BY DIST_ID

FROM INVOICE_DIST

) SumInvoice ( AP_AMT, DIST_ID )

`,(`

SELECT SUM ( ( coalesce(RCT.DocQty, 0) ) )

,PO_LLID

GROUP BY PO_LLID

FROM RCV_TRANSACTIONS

WHERE RCT.TXN_TYPE IN ( 'RECEIVE' , 'MATCH' )

AND RCT.TXN_DATE < :END_DATE + 1

) RcvTxn ( QTY_REC, PO_LLID )

`WHERE SumInvoice.DistID = UN_LIST.DistID`

AND RcvTxn.PO_LLID = UN_LIST.LLID

` AND UN_LIST.PER_KEY = :PerKey`

AND RcvTxn.QTY_REC > SumInvoice.AP_AMT

AND UN_LIST.ShipQty = 0

AND UN_LIST.Unit_Price * ( RcvTxn.QTY_REC - SumInvoice.AP_AMT )

* ( UN_LIST.DistQtyOrd / UN_LIST.QtyOrd ) <> 0

`SET func_curr_dist_amount =`

UN_LIST.Unit_Price * ( RcvTxn.QTY_REC - SumInvoice.AP_AMT )

* ( UN_LIST.DistQtyOrd / UN_LIST.QtyOrd )

Now that the Set SQL statement is complete, it can be scanned for ways to simplify it. Sometimes the logic in case operations and where clauses can be reduced to shorter statements. In this example, we already simplified the one complex Decode function. Another way to simplify SQL is to eliminate simple derived tables, but these derived tables are not that simple: they are performing necessary aggregate functions. This statement cannot be streamlined any further.

Meanwhile, back to the procedure at hand, there are two more SQL statements that need to be handled. An examination of the logic shows that Delete statements are executed whenever either ShipQty is not 0 or DIST_AMT is 0. All the other data sources, logic and predicates are the same. So this requires a Delete statement that is similar to the Update statement but with a slightly different predicate:

`DELETE FROM UNINVOICED U`

WHERE U.PER_KEY = :PerKey

AND U.LLID IN (

Select LLID

FROM UNINVOICED UN_LIST

,(

SELECT SUM ( ( case

when AcctgDt > :END_DATE

then 0

else ( coalesce ( QtyInvoiced, 0 ) )

) end )

,DIST_ID

GROUP BY DIST_ID

FROM INVOICE_DIST ) SumInvoice ( AP_AMT, DIST_ID )

` ,(`

SELECT SUM ( ( coalesce(RCT.DocQty, 0) ) )

,PO_LLID

GROUP BY PO_LLID

FROM RCV_TRANSACTIONS

WHERE RCT.TXN_TYPE IN ( 'RECEIVE' , 'MATCH' )

AND RCT.TXN_DATE < :END_DATE + 1

) RcvTxn ( QTY_REC, PO_LLID )

` WHERE SumInvoice.DistID = UN_LIST.DistID`

AND RcvTxn.PO_LLID = UN_LIST.LLID

` AND RcvTxn.QTY_REC > SumInvoice.AP_AMT`

` AND ( UN_LIST.ShipQty <> 0`

OR UN_LIST.Unit_Price * ( RcvTxn.QTY_REC - SumInvoice.AP_AMT )

* ( UN_LIST.DistQtyOrd / UN_LIST.QtyOrd ) = 0

)

)

It may seem inefficient that this set logic has to issue two DML statements, an Update and a Delete, which results in two scans of the UNINVOICED table. The flat logic is designed to scan it just once. However, Teradata will execute these two Set SQL statements many times faster than it would execute the given procedure with its cursor-based loop. That is the power of massive parallel processing, which breaks each task into tens, hundreds or thousands of smaller pieces that can all be executed concurrently, and thus reduces the time to process large amounts of data by orders of magnitude.

The more complex flat logic is, the harder it is to visualize at first glance just how it could be expressed in Set SQL. The simple approach is to start developing the set logic from the flat logic with a step-by-step process as shown here. Once the Set SQL is complete, it is good practice to compare the Flat SQL and Set SQL side by side. This not only helps the developer verify that the two logics accomplish the same thing, it also helps train the mind to more rapidly conceive solutions in set logic.

In a future post we will examine an example of flat logic that is so complex it seems to defy the step-by-step approach to the translation to set logic. Sometimes, one has to just study the code until there is sufficient understanding of its intent and then rewrite it in Set SQL from scratch.

2 Comments

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.