Maximizing the full potential of Teradata's Massive Parallel Processing requires Set Processing.
Updating thousands of rows one row at a time in Teradata is like fetching a gallon of water with a spoon. Serial, or row-at-a-time-based, processes cannot be parallelized, but Teradata can split set-based processes into dozens or hundreds of concurrent processes, each operating on a subset of the data. Enabling this parallelism can reduce processing time by orders of magnitude - from hours to seconds. We call this Set Processing, and its use allows the Teradata engine to make its greatest impact on throughput and performance.
Designing a procedure that processes a whole set of data at a time rather than one record at a time requires the ability to think in terms of set logic. Set logic involves expressions and operations that apply to sets of data rather than to a single item of data. In the card game, Euchre, the first task is to "take all the cards numbered 2 to 8 from the deck." That is a set logic expression. On the other hand, "Pick up the deck; take a card from the deck; look at the card; if it is numbered between 2 and 8 set it aside; take another card; etc." is procedural logic. This is the kind of logic that is needed for reading and processing a table with a cursor loop, processing one row at a time, and basically treating the table as if it were a flat file. I refer to this sort of procedural logic as "flat logic."
SQL is the language of set logic in database applications, but SQL can also support flat logic with the use of cursors. Flat SQL is cursor-based SQL; Set SQL operates only on sets of data. Flat SQL puts the processing logic in the program; Set SQL moves the processing logic to the database system, where the optimizer will decide the most efficient path.
Using the Euchre example and thinking of the deck as a table, the Set SQL expression might be, "Delete from deck where card_number between 2 and 8." Isn’t that easy? Flat SQL would require the declaration of a cursor such as "Select * from deck;" then a loop on the cursor with a statement like "If card-number >= 2 and card-number <= 8 then delete from deck where current of cursor." Isn't that harder? (Oh, wait, I got a syntax error on compile....)
The Set SQL tells the database system what to do, but the Flat SQL has to provide a method for doing it. The Set SQL executes on several Teradata processors at once. The Flat SQL executes on just one processor at a time.
Converting a program from flat logic means rewriting the program logic in Set SQL language, which requires a shift in mindset to set logic and a re-orientation toward Set SQL language. Flat logic is ingrained in a programmer from years of experience; however, most people find that Set SQL is easier to write and maintain after some practice. This blog will provide guidance in migrating code to Set SQL and developing new Set SQL processes.
In the next installment we will look at a simple approach that you can use to translate logic and nested loops into Set SQL. Then we will move on to more complex examples.