Teradata Light-Weight Redistibutions

The best minds from Teradata, our partners, and customers blog about relevant topics and features.
Teradata Employee

A new feature in Teradata Database 15.0 gives the optimizer the choice to bypass all-AMP row redistributions under certain circumstances. This can be a good thing for queries that are redistributing just a few rows and the system has many AMPs. Without this feature all row redistributions, no matter of what size, would require that an AMP worker task be spawned and started up on each AMP in preparation for receiving rows.


What is Row Redistribution?

Rows are redistributed in preparation for a join when rows to be joined are located on different AMPs. What is actually redistributed are the spool file rows (the temporary intermediate files) that represent the columns that have been selected from the base table. The value of the join column in that spool file goes through the hashing algorithm and the AMP that is pointed to by the hash bucket (output from the hashing algorithm) is where an individual spool row is sent.














Rows are also redistributed during the global phase of aggregation. After each AMP has performed a local sort and aggregation on its data, it will send its subtotals to the AMP that will be performing grand totals. The work of calculating grand totals is shared across all AMPs, with each AMP working on a subset of the GROUP BY values.


Which portion of the work is performed by each AMP is determined by hashing the value of the GROUP BY column.  The output of the hashing algorithm will point to a single AMP.  In the example below all subtotals for CA on all AMPs go to AMP 0, all subtotals for FL on all AMPs go to AMP 1, and so forth.  With that approach all AMPs will be working in parallel on the global aggregation step.















Change with Light Weight Redistribution

Instead of spawning a receiver AMP worker task on each AMP in preparation for the row redistribution, the optimizer now has the option to suspend the start up of dedicated receiver tasks. Only the AMPs that actually will receive a row will have receiver tasks started up.


In order for this to take place, the sender must be a single-AMP. In considering the graphic below, if a query performs single-AMP primary index access on a NUPI, say an order-item table, then joins to the Product table light-weight redistribution may be used.  


The graphic below comes with these assumptions:

  • Access to the Order-Item table is by NUPI (single-AMP access)
  • More than one row is selected and these selected rows will be joined to the Product table
  • Those rows must be redistributed in order to join to the Product table
  • 5% or less of the AMPs are involved in receiving rows














Other characteristics of light-weight redistribution include:

  • The receiver tasks will be released as soon as the row has been received, rather than holding the AMP worker task until the step completes
  • The receiver tasks will be of the MessageWorkType01 category, just as in normal row redistribution
  • The optimizer goes through some calculations on selectivity and confidence before it decides to use light-weight redistributions (statistics collections need to be in place)
  • Even if the optimizer chooses light-weight redistribution, the AMP code may switch to traditional redistributions if during the process it find the number of rows being redistributed is higher than expected

Not only will light-weight redistributions free up AMP worker tasks for other work, it adds efficiency to the query that is executing. By removing the need to spawn all-AMP receiver steps up front, it bypasses the setup and tear-down of buffer structures on each AMP.  And if there is a congested AMP in the system, there is less likelihood of it causing delays.


Database Query Log

If the optimizer chooses to use light-weight redistribution for a particular step, this will be recorded in DBQStepTbl. In 15.0 the step table has a new field named  “DBQLStepTbl.SSRReceiverCount”.   If that field contains zero that means there was no light weight redistribution for that step.  If it is greater than zero, then it will tell you how many AMPs were involved in the light weight redistribution activity.


There is an internal DBS Control setting that controls the upper limit on the percent of AMPs that will receive rows and still qualify for light-weight redistribution. That upper limit is currently set at 5%.


Explain plan

The explain plan will indicate that light-weight redistribution was used by referencing “few AMPs” in the redistribution description in the step text:


1.) First, we do a single-AMP RETRIEVE step from CAB.A by way of the primary index "CAB.A.c1 = 1" with a residual condition of ("NOT (CAB.A.c2 IS NULL)") into Spool 2 (group_amps), which is redistributed by the hash code of (CAB.A.c2) to few AMPs. Then we do a SORT to order Spool 2 by row hash. The size of Spool 2 is estimated with low confidence to be 2 rows (42 bytes). The estimated time for this step is 0.01 seconds.


1 Comment


Thaks for sharing...keep blogging.