Introducing Sparse Maps in Teradata 16.10

Teradata Employee

Sparse maps are one of the new functionalities available in the MAPS Architecture feature that is part of the Teradata Database 16.10 release. All Teradata deployment platforms will support the use of sparse maps as soon as the platform gets onto 16.10 software.

 

Sparse maps are simple to understand, easy to use, and can provide a benefit when a very small table is frequently or repetitively accessed. Basically, sparse maps allow you to move very small tables onto one or a few AMPs on the system, rather than thinly spreading the table’s rows across all AMPs.

 

Most sites have a few (or many) very small tables with fewer rows than AMPs in the configuration. Having to perform an all-AMP operation every time those types of tables are read involves some level of activity across all AMPs.  Some of those AMPs will be wasting their resources because they have no rows.  That can add up, especially on systems that include a large number of AMPs.  

 

This posting discusses key points to keep in mind before you begin moving your small tables into sparse maps.

 

Default Sparse Maps

The first thing to be aware of is this: You don’t have to create sparse maps on your Teradata system. You get two sparse maps by default as soon as you get on 16.10:  A single-AMP sparse map and a multi-AMP sparse map.  The multi-AMP sparse map will include one AMP for each of the nodes in the parent contiguous map’s configuration.

 

Each of the two sparse maps comes with a self-describing name. I’ve broken the sparse map name into its sub-parts so you can better understand what the sparse map name is telling you:

  • “TD_” - This first part of an internally-created map name tells you this object has been created by the Teradata Database, rather than a user. All internally-created map names start with the “TD_” string.
  • “nAmp” - This next string in the internal map name identifies the number of AMPs being used in this particular sparse map. An internally created 1-AMP sparse map, for example, will always have a map name that starts with “TD_1Amp”. A multi-AMP sparse map on a 4-node system will start with “TD_4Amp”.
  • “SparseMap” – This string that comes next in the sparse map name identifies the object as a sparse map. It will be the same string for all sparse maps.
  • “_nNodes” – This final string in the sparse map name identifies the number of nodes in the configuration that is the parent of the sparse map.

The default 1-AMP sparse map on a 4-node system is named:

TD_1AmpSparseMap_4Nodes  

 

The default multi-AMP sparse map on a 4-node system is named:

TD_4AmpSparseMap_4Nodes
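
The naming convention described above can be captured in a few lines of Python. This is only an illustrative sketch: the helper names `default_sparse_map_name` and `parse_sparse_map_name` are made up for this example and are not part of any Teradata tool.

```python
import re

# Shape of the internally created sparse map names described above:
# TD_<n>AmpSparseMap_<n>Nodes
_SPARSE_MAP_NAME = re.compile(r"TD_(\d+)AmpSparseMap_(\d+)Nodes")


def default_sparse_map_name(amp_count: int, node_count: int) -> str:
    """Build a default sparse map name from its AMP count and parent node count."""
    return f"TD_{amp_count}AmpSparseMap_{node_count}Nodes"


def parse_sparse_map_name(name: str) -> tuple[int, int]:
    """Return (amp_count, node_count) for a default sparse map name."""
    match = _SPARSE_MAP_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not a default sparse map name: {name!r}")
    return int(match.group(1)), int(match.group(2))


print(default_sparse_map_name(1, 4))                     # TD_1AmpSparseMap_4Nodes
print(parse_sparse_map_name("TD_4AmpSparseMap_4Nodes"))  # (4, 4)
```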

 

How Small Tables are Assigned to AMPs

Sparse maps are always created inside of a parent contiguous map. When you first get onto 16.10 the contiguous map that owns the default sparse maps will be TD_Map1, the map that covers all the AMPs in the current configuration.

 

It’s easier to understand how tables are assigned to AMPs in a sparse map if we consider 1-AMP sparse maps initially. If you have a table with fewer rows than AMPs, it could be a good candidate to move to a single-AMP sparse map. At the time you move it to the sparse map, the database will take the database name and table name and put them through a hashing algorithm. The output will determine which AMP within the parent contiguous map will support that particular small table’s rows. Other small tables can use the same 1-AMP sparse map, but under usual conditions will be assigned to different AMPs, based on their database/table name combination.

 

A single one-AMP sparse map is designed in a way to easily support many small tables without overloading one AMP in the system. By spreading out the table-to-AMP assignments across the AMPs in the parent contiguous map, the likelihood that one single AMP will be burdened with more processing activity is reduced.
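
To make the idea of name-based assignment concrete, here is a minimal Python sketch. It is not the database’s real algorithm, which this post does not describe: SHA-256 is used purely as a stand-in hash, and the function name `assign_amp`, the table names, and the 40-AMP system are all made up for illustration.

```python
import hashlib


def assign_amp(database: str, table: str, amp_count: int) -> int:
    """Pick one AMP (0 .. amp_count - 1) from the database and table name.

    SHA-256 is a stand-in; the hash the database really uses is not
    described in this post.
    """
    key = f"{database}.{table}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Reduce the first 8 bytes of the digest to an AMP number.
    return int.from_bytes(digest[:8], "big") % amp_count


# Hypothetical small tables sharing one 1-AMP sparse map on a 40-AMP system.
tables = [("Sales", "Promotion"), ("Sales", "Region"), ("HR", "JobCode")]
for database, table in tables:
    print(database, table, "-> AMP", assign_amp(database, table, 40))
```

The same database/table name always lands on the same AMP, while different names tend to scatter across the AMPs of the parent map, which is the property the paragraph above relies on.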

 

 SparseMapTables.jpg

 

 

Which Tables Belong in Sparse Maps

I’d suggest you initially focus only on moving really small tables into your default single-AMP sparse map. Look for tables with fewer rows than you have AMPs. And at first consider only small tables that are frequently accessed. When you move a 5-row Promotion table into a single-AMP sparse map you’ll be saving some system resources (AMP worker tasks, CPU, IO) every time you access that table. Because all the rows will now be on one AMP, you’ve got a single data block to access instead of one data block per AMP. In addition, fewer AMP worker tasks will be used system-wide (although in most cases this is for a short period of time).

 

Another big win: if you are using the Promotion table in a high-volume tactical application, you can expect more consistent response times when it’s in a single-AMP sparse map, since only one AMP is touched and a single block is read. This reduces potential elapsed time variance caused by one AMP or another being delayed during a read step. If your workload management setup runs all single-AMP requests at a high “tactical” priority, you’ve also given a big priority boost to any request that only accesses this Promotion table.

 

Moving tables into sparse maps can be done through the new Viewpoint MAPs Manager portlet. There is also a mover stored procedure that comes with the 16.10 version of the database that will move tables for you without having to log on to Viewpoint. The nice thing about the Viewpoint portlet is that it can find all the small tables for you and recommend the ones that belong in either the 1-AMP or the multi-AMP sparse maps. That could simplify the effort to identify and move small tables at your site.

 

Where to Be Cautious

The one thing that can hurt you when moving tables into sparse maps is the table size itself. You risk degraded performance if you move a table that is not small enough into a sparse map, because you could be creating a new processing bottleneck. This is especially true for 1-AMP sparse maps.

 

For that reason it is recommended that only tables 128KB or smaller in size be considered as candidates for 1-AMP sparse maps. If you have a table larger than 128KB that you are considering for a multi-AMP sparse map, the table should be no larger than 128KB * number of AMPs in the sparse map (in other words, each AMP in the sparse map will have about 128KB of data).

 

This size recommendation refers to the perm size of the small table minus table header overhead. Only the actual “data” minus header overhead is under consideration here.
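
The sizing rule above is simple arithmetic, and a short Python sketch may make it easier to apply. The function names and the sample sizes are made up for illustration, and the table header overhead is simply passed in as a number here rather than computed.

```python
KB = 1024
PER_AMP_LIMIT_BYTES = 128 * KB  # guideline from the text: about 128KB per AMP


def data_bytes(perm_bytes: int, header_bytes: int) -> int:
    """Perm size minus table header overhead, never below zero."""
    return max(perm_bytes - header_bytes, 0)


def fits_sparse_map(perm_bytes: int, header_bytes: int, amps_in_map: int) -> bool:
    """True if the table's data stays within 128KB * AMPs in the sparse map."""
    return data_bytes(perm_bytes, header_bytes) <= PER_AMP_LIMIT_BYTES * amps_in_map


# Made-up sizes: 300KB of perm space, 20KB of which is table header overhead.
perm, header = 300 * KB, 20 * KB
print(fits_sparse_map(perm, header, 1))  # False: 280KB exceeds 128KB
print(fits_sparse_map(perm, header, 4))  # True: 280KB is within 4 * 128KB = 512KB
```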

 

The Viewpoint portlet will identify small tables that qualify using these recommendations. In addition, there is a view that comes with the 16.10 database that will provide you a list of candidate tables that are suited for sparse maps, and do this space-related calculation for you, including the subtraction of the table header overhead. That view is named: TDMaps.TableToSparseMapSizingV.

 

There is one other thing to watch that is related to size: Small tables can become big tables. When you moved a table into a sparse map it may have met the guidelines just fine. But over time it may have grown. The Viewpoint portlet makes it easy to periodically check if your small tables are still good candidates for the sparse maps you have moved them to. Or you can rerun the view above at convenient times to make sure there aren’t any looming performance issues associated with sparse map usage.
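
As a rough illustration of such a periodic check, the Python sketch below flags tables that have grown past the 128KB-per-AMP guideline. It does not reproduce what the portlet or the view actually does; the function name `outgrown_tables`, the table names, and the sizes are all made up, and the sizes are assumed to already exclude table header overhead.

```python
PER_AMP_LIMIT_BYTES = 128 * 1024  # guideline from the text: about 128KB per AMP


def outgrown_tables(tables: dict[str, tuple[int, int]]) -> list[str]:
    """Return names of tables whose data no longer fits their sparse map.

    `tables` maps a table name to (data_bytes, amps_in_sparse_map), where
    data_bytes already excludes table header overhead.
    """
    return sorted(
        name
        for name, (data_bytes, amps) in tables.items()
        if data_bytes > PER_AMP_LIMIT_BYTES * amps
    )


# Made-up inventory: sizes in bytes, AMP counts of the sparse map in use.
inventory = {
    "Sales.Promotion": (6_000, 1),
    "Sales.Region": (200_000, 1),   # grew past 128KB on a 1-AMP map
    "HR.JobCode": (400_000, 4),     # still within 4 * 128KB
}
print(outgrown_tables(inventory))  # ['Sales.Region']
```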

 

One last word of caution. Don’t assume empty tables all belong in sparse maps.  Some empty tables may be empty by day, but used for ETL processing by night.   Staging tables would fall into that category.  You could set yourself up for a bad surprise during the load window if you moved such a table into a 1-AMP sparse map.

 

You’ll have a higher degree of success managing sparse maps and small tables if you clean up your databases first. Try to find and drop small or empty tables that are hanging around unnecessarily and that no longer have a purpose. 

 

2 Comments
Enthusiast

Hi Carrie, 

 

How is data skew treated in a sparse map, whether I have a single-AMP or a multi-AMP sparse map?

 

Thanks,

Monoranjan

Teradata Employee

Monoranjan,

 

Because only very small tables are recommended for placement in sparse maps, and because different small tables are placed on different AMPs in the same sparse map, skew is not expected to be an issue. For example, there is likely to be only a single data block representing a small table on a single-AMP sparse map, and the CPU time to process those few rows in the data block will be very very low. While testing sparse map performance where we had several very small tables in single AMP sparse maps, we did not see any perceivable skew because of that, as the duration of the step that accessed the very very small table was so short compared to the other work going on in the query.

 

However, if you have an application with high concurrency where repetitive queries are accessing a very small table in a sparse map, you might see slightly more activity on that AMP. But that would still be preferable to having the table spread across all AMPs, because you have eliminated the same repetitive access on most of the AMPs.

 

Suggest you try it out and see if you detect any issues. It’s quick and easy to move a very small table to a sparse map, and then back again.

 

Thanks, -Carrie