Restricting Table Operators By Map

Blog
The best minds from Teradata, our partners, and customers blog about relevant topics and features.
Teradata Employee

A new capability is available in 16.10 that will allow you to specify which map a table operator will use when it executes. This capability is activated by adding an "EXECUTE  MAP = map-name" clause in the definition of the table operator, or in the SQL used to execute the table operator.  This posting describes this capability and its value when executing table operators.

 

The map that you specify can be any type of map: contiguous or sparse. Only the AMPs that belong to that map will be used to support the execution of the table operator. The new capability is only available for table operators, not for other types of functions, and not for standard SQL statements. For a foreign server, the map applies to the table operators associated with the foreign server.

 

Why Would I Want to Do This?

Here's an example of when this option might be useful.

 

Specifying EXECUTE MAP causes the table operator to only execute only on the AMPs in the named map. If you have a table operator that you suspect will use a lot of resource, including memory, you can damper down the impact on other active work by only running the table operator on a subset of AMPs, rather than all the AMPs in the configuration.

To keep the resource usage even across all AMPs, consider a sparse map.  We don't usually think about using sparse maps on work that will use a lot of resource.  And certainly, a single-AMP sparse map would be a poor-performing choice for any table operator.

 

But what about that second sparse map that all systems (16.10 and above) come with by default? The one-AMP-per-node sparse map.  If you specify a one-AMP or even or two-AMP-per-node sparse map in your CREATE FUNCTION statement, then the table operator will be spread across all nodes, but you've reined in the parallelism on each node down to one AMP or two AMPs per node.  That will tend to reduce demand for memory and CPU on each node when the table operator is running.

 

Here's how a SCRIPT table operator would look using this technique:

 

SELECT * FROM SCRIPT(

ON tab1

SCRIPT_COMMAND('cat')

RETURNS('*')

DELIMITER(',')

EXECUTE MAP = TD_4AmpSparseMap_4Node

) AS a;

 

How It Works

A table operator reads from either a table or a spool. If its reading from a table, the map that table belongs to will be used for the read activity, as defined by the table definition.  But any spool file that is created will be distributed only to the AMPs that are part of the map specified in the function definition.

 

Here's an Explain that illustrates the table operator processing is happening in the sparse map:

 

EXPLAIN

SELECT * FROM SCRIPT(

ON tab1

SCRIPT_COMMAND('cat')

RETURNS('*')

DELIMITER(',')

EXECUTE MAP = TD_4AmpSparseMap_4Node

) AS a;

 

 ***Success: Activity Count = 19

 

Explanation                                                                        

---------------------------------------------------------------------------------  

  1) First, we lock FSK.tab1 in TD_MAP1 for read on a reserved RowHash  

     to prevent global deadlock.  

  2) Next, we lock FSK.tab1 in TD_MAP1 for read.  

  3) We do an all-AMPs RETRIEVE step in TD_MAP1 from FSK.tab1 by way of  

     an all-rows scan with no residual conditions into Spool 1 (used to  

     materialize view, derived table, table function or table operator  

     TblOpInputSpool) (all_amps), which is redistributed randomly to  

     all AMPs in TD_4AmpSparseMap_4Node.  The size of Spool 1 is  

     estimated with low confidence to be 4 rows (132 bytes).  

  4) We do a single-AMP RETRIEVE step from Spool 1 (Last Use) by way of  

     an all-rows scan executing table operator TD_SYSFNLIB.SCRIPT in  

     TD_4AmpSparseMap_4Node with a condition of ("(1=1)") into Spool 3  

     (group_amps), which is built locally on that AMP.  The size of  

     Spool 3 is estimated with low confidence to be 4 rows (132 bytes).  

     The estimated time for this step is 0.16 seconds.  

  5) Finally, we send out an END TRANSACTION step to all AMPs involved  

     in processing the request.  

  -> The contents of Spool 3 are sent back to the user as the result of  

     statement 1.  

 

Privileges

If you want to use a MAPS clause when you create table operators, you must have been granted the map that you plan to use.  If you have created the map yourself, you must have been granted CREATE MAP privileges.  After you have created that map, anyone that wants to use the map must be granted the map. 

This gives you control over who can use a sparse map that you have created.

 

Application DBAs may wish to create their own maps just for table operator processing, and they would then be the one to issue the GRANT MAP map-name to others, as needed. Or the lead DBA could create the map then grant it to the application DBA.

 

You can specify this new statement within the SQL that executes the table operator.  However, if the EXECUTE MAP clause is included in the table operator definition, it will not also be needed in the SQL statements.

 

Conclusion

Table operators can be resource-intensive.  One technique for managing such table operators is to ensure that they execute only on a subset of AMPs in the configuration.  Sparse maps that are defined on one or more AMPs per node are well-suited to both parallelize the activity across all nodes and at the same time limit the level of resources demanded from the nodes.