Converting Complex Procedure Logic to Set SQL Using Derived Tables

Blog
The best minds from Teradata, our partners, and customers blog about whatever takes their fancy.
Teradata Employee

When Flat SQL looping logic involves complex processing of data elements from multiple tables, the loops can almost always be transformed into Set SQL with derived tables. By eliminating looping logic, the performance benefits of parallel processing can be exploited.

Derived tables provide an easy technique to copy logic that is expressed in a cursor or a simple select statement in Flat SQL and paste it into Set SQL as part of the set logic.  Expressions like this need to be modified slightly to enable joining of the derived tables to other tables in the Set SQL statement.  Sometimes a derived table created in this way is so simple that it can be eliminated from the Set SQL by rewriting it as a join to a base table.  In other cases, as the following example demonstrates,  the derived table performs a necessary function that cannot be streamlined further.

This is an example of a PL/SQL procedure that uses data from more than one table to make decisions and compute an updated quantity for certain rows in a target table.  Other rows in the target table are deleted.

DECLARE

  CURSOR C1 IS

  SELECT RowID, LLID, QtyOrd, DistQtyOrd, ShipQty, Unit_Price, DistID

  FROM   UNINVOICED

  WHERE  PER_KEY = :PerKey;

DIST_AMT        NUMBER(22,5);

QTY_REC         NUMBER(22,5);

AP_AMT          NUMBER(22,5);

BEGIN

  FOR UN_LIST IN C1 LOOP

    -- SUM Invoice Dist

    SELECT SUM(DECODE( SIGN(AcctgDt - :END_DATE)

                       ,1,0,

                       NVL(QtyInvoiced,0)))

      INTO   AP_AMT

      FROM   INVOICE_DIST

      WHERE  DistID = UN_LIST.DistID;

    -- SUM RCV Txn

    SELECT SUM(NVL(RCT.DocQty,0))

      INTO QTY_REC

      FROM RCV_TRANSACTIONS RCT

      WHERE RCT.TXN_TYPE IN ('RECEIVE','MATCH')

      AND   RCT.TXN_DATE < :END_DATE + 1

      AND PO_LLID = UN_LIST.LLID;

    TOT_QTY := QTY_REC;

    IF (TOT_QTY > AP_AMT) THEN

      IF (UN_LIST.ShipQty = 0) THEN

        DIST_AMT := UN_LIST.Unit_Price

                     * ( (TOT_QTY - AP_AMT)

                     * (UN_LIST.DistQtyOrd

                        / UN_LIST.QtyOrd) );

        IF (DIST_AMT != 0) THEN

          UPDATE UNINVOICED

          SET    func_curr_dist_amount = DIST_AMT

          WHERE  RowID = UN_LIST.RowID;

        ELSE

          DELETE FROM  UNINVOICED

          WHERE  RowID = UN_LIST.RowID;

        END IF;

      ELSE

          DELETE FROM  UNINVOICED

          WHERE  RowID = UN_LIST.RowID;

      END IF;

    END IF;

  END LOOP;

END;

To run this procedure efficiently on Teradata, the procedural logic has to be converted to Set SQL.  Begin by locating the first insert, update or delete statement.  At the end of this procedure is an UPDATE UNINVOICED statement.  Following are two Delete statements, but the Update statement has to be handled first.

Note that there is only one cursor in this procedure, and that cursor selects from the table that is being Updated and Deleted.  The normal approach for translating Insert statements is to copy and paste all necessary cursors into derived table expressions in the Set SQL.  For Update and Delete statements it is not usually necessary to copy the cursor on the table being updated, because Flat SQL tends to use that cursor only as a placeholder for doing a single-row update or delete.  The goal in Set SQL is to avoid single-row updates and deletes.

Under certain conditions the DIST_AMT column in the UNINVOICED table is updated with values that come from three sources: the UNINVOICED table itself (by way of the Cursor C1) and sums from two other tables.  In the Set SQL, these two sums can be expressed as derived tables.  Each of these sums is computed within the loop on the Cursor C1, based on values that are in the current row of the UNINVOICED table.  Therefore, we have a way to join UNINVOICED to the two derived tables in the Set SQL.

The first derived table summarizes INVOICE_DIST:

    (SELECT SUM(DECODE( SIGN(AcctgDt - :END_DATE)

                       ,1,0,

                       NVL(QtyInvoiced,0)))

      INTO   AP_AMT

      FROM   INVOICE_DIST

      WHERE  DistID = UN_LIST.DistID) SumInvoice

Note that the derived table needs an alias.  A term like SumInvoice helps to remind the reader what this does.  The DECODE expression can be simplified with a CASE operation as follows:

    (SELECT SUM ( ( case

                   when AcctgDt > :END_DATE then 0

                   else ( coalesce ( QtyInvoiced ,0 ) )

                 ) end )

      INTO   AP_AMT

      FROM   INVOICE_DIST

      WHERE  DistID = UN_LIST.DistID) SumInvoice

There are a couple of things in this expression that need to be modified before it can be used as a derived table in Set SQL.  First, a derived table cannot select INTO a host variable; rather, the result of this sum is going to be used in a higher expression, so it must be aliased to something that can be referenced outside the derived table.  Second, the DistID in "UN_LIST" is not available for a predicate within the derived table; therefore, DistID must be added to the select list in this derived table so it can be used for a join in the outer expression.

A useable form of the derived table would look like this:

(SELECT SUM ( ( case

               when AcctgDt > :END_DATE then 0

               else ( coalesce ( QtyInvoiced, 0 ) )

             ) end )

       ,DIST_ID

GROUP BY DIST_ID

FROM INVOICE_DIST ) SumInvoice ( AP_AMT, DIST_ID )

Now AP_AMT is available for use in the outer expression.  The other calculated value needed for the Update statement is TOT_QTY.  TOT_QTY comes from QTY_REC, which comes form another summarization:

    -- SUM RCV Txn

    SELECT SUM(NVL(RCT.DocQty,0))

      INTO QTY_REC

      FROM RCV_TRANSACTIONS RCT

      WHERE RCT.TXN_TYPE IN ('RECEIVE','MATCH')

      AND   RCT.TXN_DATE < :END_DATE + 1

      AND PO_LLID = UN_LIST.LLID;

As with AP_AMT, this can be expressed as a derived table in this form :

(SELECT SUM ( ( coalesce(RCT.DocQty, 0) ) )

        ,PO_LLID

GROUP BY PO_LLID

FROM RCV_TRANSACTIONS

WHERE RCT.TXN_TYPE IN ( 'RECEIVE' , 'MATCH' )

 AND  RCT.TXN_DATE < :END_DATE + 1

) RcvTxn ( QTY_REC, PO_LLID )

Here PO_LLID will be used as a column to join to UNINVOICED.

Putting the derived tables together and adding the join predicates to the Update statement produces this:

UPDATE UNINVOICED UN_LIST

FROM

 (

SELECT SUM ( ( case

               when AcctgDt > :END_DATE then 0

               else ( coalesce ( QtyInvoiced, 0 ) )

             ) end )

       ,DIST_ID

GROUP BY DIST_ID

FROM INVOICE_DIST

) SumInvoice ( AP_AMT, DIST_ID )

,(

SELECT SUM ( ( coalesce(RCT.DocQty, 0) ) )

        ,PO_LLID

GROUP BY PO_LLID

FROM RCV_TRANSACTIONS

WHERE RCT.TXN_TYPE IN ( 'RECEIVE' , 'MATCH' )

 AND  RCT.TXN_DATE < :END_DATE + 1

) RcvTxn ( QTY_REC, PO_LLID )

WHERE SumInvoice.DistID = UN_LIST.DistID

 AND  RcvTxn.PO_LLID = UN_LIST.LLID

There is a condition on the target table which is expressed in the cursor in original flat logic: WHERE  PER_KEY = :PerKey - this must be added to the final predicate.  There are also conditions under which the update is to take place, and these too must be expressed in the predicate.

Finally, note that there is a complex computation for func_curr_dist_amount.  This expression must be copied into the SET clause of the Update statement.

Putting all this together produces:

UPDATE UNINVOICED UN_LIST

FROM

 (

SELECT SUM ( ( case

               when AcctgDt > :END_DATE then 0

               else ( coalesce ( QtyInvoiced, 0 ) )

             ) end )

       ,DIST_ID

GROUP BY DIST_ID

FROM INVOICE_DIST

) SumInvoice ( AP_AMT, DIST_ID )

,(

SELECT SUM ( ( coalesce(RCT.DocQty, 0) ) )

        ,PO_LLID

GROUP BY PO_LLID

FROM RCV_TRANSACTIONS

WHERE RCT.TXN_TYPE IN ( 'RECEIVE' , 'MATCH' )

 AND  RCT.TXN_DATE < :END_DATE + 1

) RcvTxn ( QTY_REC, PO_LLID )

WHERE SumInvoice.DistID = UN_LIST.DistID

 AND  RcvTxn.PO_LLID = UN_LIST.LLID

 AND  UN_LIST.PER_KEY = :PerKey

 AND  RcvTxn.QTY_REC > SumInvoice.AP_AMT

 AND  UN_LIST.ShipQty = 0

 AND  UN_LIST.Unit_Price * ( RcvTxn.QTY_REC - SumInvoice.AP_AMT )

      * ( UN_LIST.DistQtyOrd / UN_LIST.QtyOrd )  <> 0

SET func_curr_dist_amount =

    UN_LIST.Unit_Price * ( RcvTxn.QTY_REC - SumInvoice.AP_AMT )

      * ( UN_LIST.DistQtyOrd / UN_LIST.QtyOrd )

Now that the Set SQL statement is complete, it can be scanned for ways to simplify it.  Sometimes the logic in case operations and where clauses can be reduced to shorter statements.  In this example, we already simplified the one complex Decode function.  Another way to simplify SQL is to eliminate simple derived tables, but these derived tables are not that simple: they are performing necessary aggregate functions.  This statement cannot be streamlined any further.

Meanwhile, back to the procedure at hand, there are two more SQL statements that need to be handled.  An examination of the logic shows that Delete statements are executed whenever either ShipQty is not 0 or DIST_AMT is 0.  All the other data sources, logic and predicates are the same.  So this requires a Delete statement that is similar to the Update statement but with a slightly different predicate:

DELETE FROM UNINVOICED U

WHERE U.PER_KEY = :PerKey

 AND  U.LLID IN (

    Select LLID

    FROM UNINVOICED UN_LIST

    ,(

    SELECT SUM ( ( case

            when AcctgDt > :END_DATE

            then 0

            else ( coalesce ( QtyInvoiced, 0 ) )

            ) end )

           ,DIST_ID

    GROUP BY DIST_ID

    FROM INVOICE_DIST ) SumInvoice ( AP_AMT, DIST_ID )

    ,(

    SELECT SUM ( ( coalesce(RCT.DocQty, 0) ) )

            ,PO_LLID

    GROUP BY PO_LLID

    FROM RCV_TRANSACTIONS

    WHERE RCT.TXN_TYPE IN ( 'RECEIVE' , 'MATCH' )

     AND  RCT.TXN_DATE < :END_DATE + 1

    ) RcvTxn ( QTY_REC, PO_LLID )

    WHERE SumInvoice.DistID = UN_LIST.DistID

     AND  RcvTxn.PO_LLID = UN_LIST.LLID

     AND  RcvTxn.QTY_REC > SumInvoice.AP_AMT

     AND ( UN_LIST.ShipQty <> 0

           OR  UN_LIST.Unit_Price * ( RcvTxn.QTY_REC - SumInvoice.AP_AMT )

              * ( UN_LIST.DistQtyOrd / UN_LIST.QtyOrd )  = 0

         )


)

It may seem inefficient that this set logic has to issue two DML statements, an Update and a Delete, which results in two scans of the UNINVOICED table.  The flat logic is designed to scan it just once.  However, Teradata will execute these two Set SQL statements many times faster than it would execute the given procedure with its cursor-based loop.  That is the power of massive parallel processing, which breaks each task into tens, hundreds or thousands of smaller pieces that can all be executed concurrently, and thus reduces the time to process large amounts of data by orders of magnitude.

The more complex flat logic is, the harder it is to visualize at first glance just how it could be expressed in Set SQL.  The simple approach is to start developing the set logic from the flat logic with a step-by-step process as shown here.  Once the Set SQL is complete, it is good practice to compare the Flat SQL and Set SQL side by side.  This not only helps the developer verify that the two logics accomplish the same thing, it also helps train the mind to more rapidly conceive solutions in set logic.

In a future post we will examine an example of flat logic that is so complex it seems to defy the step-by-step approach to the translation to set logic.  Sometimes, one has to just study the code until there is sufficient understanding of its intent and then rewrite it in Set SQL from scratch.

2 Comments
Great blog !! If i am not missing something, there seems to be a typo in the original PL/SQL code.
SELECT SUM(DECODE( SIGN(AcctgDt - :END_DATE)
,1,0,
NVL(QtyInvoiced,0)))
INTO AP_AMT
FROM INVOICE_DIST
WHERE DistID = UN_LIST.DistID;
END IF;

There is no no matching IF statement for the END IF; Sorry, if i am missing something.

Teradata Employee
No, that's right - it was a scrap left over from the original code on which this example is based. I've corrected it, and I hope no one was confused by that. Thank you for catching it, and thank you for the compliment!