The MultiLoad utility, whether run as the legacy stand-alone utility or as the Teradata Parallel Transporter Update Operator, is designed to transfer large amounts of data from a client to the Teradata Database. To maximize throughput, MultiLoad allows the administrator to load data through more than one session.
The question this blog posting asks and then answers is this: If a MultiLoad job uses a low number of sessions, will that reduce the number of AMP worker tasks (AWTs) the job requires? In other words, is there a correlation between the number of sessions started by the load utility job and the number of AWTs it requires?
Most of the resources of a system are consumed by requests running in workloads that were defined by the administrator, workloads that are managing user-initiated work. However, some small percentage of resources, mostly CPU, is utilized by what we call internal workloads. This posting will help you understand what kinds of activities are running in those internal workloads, and some of the reasons why utilization there can and will fluctuate.
Throttles are a workload management technique for controlling concurrency in a data warehouse. When a throttle is an active part of a TASM or TIWM ruleset, a counter is kept of how many requests under its control are currently running. When a request is ready to execute but would push this counter past the throttle’s defined limit, that request is placed in the delay queue.
It is not unusual for a request to be under the control of more than one throttle, so when you are analyzing throttle impact after the fact, it can be difficult to know which throttle was responsible for the delay action. There is a field in DBQLogTbl named TDWMRuleID that can aid you in making that determination.
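The counter-and-delay-queue behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not Teradata's implementation: the `Throttle` class, its method names, and the first-in, first-out release order are all assumptions made for the example.

```python
from collections import deque


class Throttle:
    """Toy model of a concurrency throttle (hypothetical, for illustration)."""

    def __init__(self, limit):
        self.limit = limit          # maximum number of concurrently running requests
        self.active = 0             # counter of requests currently running
        self.delay_queue = deque()  # requests waiting for a free slot

    def submit(self, request_id):
        """Run the request if a slot is free, otherwise delay it."""
        if self.active < self.limit:
            self.active += 1
            return "running"
        self.delay_queue.append(request_id)
        return "delayed"

    def complete(self):
        """A running request finished; release the oldest delayed request, if any."""
        self.active -= 1
        if self.delay_queue:
            released = self.delay_queue.popleft()
            self.active += 1
            return released
        return None


throttle = Throttle(limit=2)
states = [throttle.submit(q) for q in ("q1", "q2", "q3")]
released = throttle.complete()
```

With a limit of 2, the third request is delayed until one of the first two completes.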
With the Teradata MAPS feature, you can expand the hardware configuration and choose to postpone the redistribution of tables from the old AMPs to the new AMPs. This delay in moving rows to the new AMPs can provide a significant reduction in downtime during the expansion window. Postponing table redistributions is enabled by allowing multiple hash maps (the old hash map that covered the previous configuration and the new hash map for all of the AMPs in the new configuration) to co-exist at the same time. It will be up to the administrator to decide when to move tables into the new, larger map.
The move of a table from the old map to the new map is accomplished by a new type of ALTER TABLE command. When an ALTER TABLE statement that includes a “MAPS = map_name” clause is issued, that ALTER TABLE will be processed similarly to an INSERT-SELECT.
The INSERT-SELECT that moves a table from one map to a different map is essentially doing the same thing as a standard INSERT-SELECT. However, there are a few differences, which will be highlighted in this posting.
Sparse maps are one of the new capabilities available in the MAPS Architecture feature that is part of the Teradata Database 16.10 release. All Teradata deployment platforms will support the use of sparse maps as soon as the platform is running 16.10 software.
Sparse maps are simple to understand, easy to use, and can provide benefit when a very small table is frequently or repetitively accessed. Basically, sparse maps allow you to move very small tables onto one or a few AMPs on the system, rather than thinly spreading the table’s rows across all AMPs.
Most sites have a few (or many) very small tables with fewer rows than AMPs in the configuration. Having to perform an all-AMP operation every time those types of tables are read involves some level of activity across all AMPs. Some of those AMPs will be wasting their resources because they have no rows. That can add up, especially on systems that include a large number of AMPs.
This posting discusses key points to keep in mind before you begin moving your small tables into sparse maps.
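To make the "fewer rows than AMPs" point concrete, here is a small Python sketch that counts how many AMPs end up with no rows when a tiny table is hashed across the whole system. It is a stand-in only: the functions `amp_for` and `empty_amps` are hypothetical, and SHA-256 modulo the AMP count is an assumption used in place of Teradata's actual row hashing and hash maps.

```python
import hashlib


def amp_for(key, num_amps):
    """Map a key to an AMP number using a stand-in hash (not Teradata's row hash)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_amps


def empty_amps(keys, num_amps):
    """Count the AMPs that own no rows for the given set of keys."""
    occupied = {amp_for(k, num_amps) for k in keys}
    return num_amps - len(occupied)


# A 10-row table spread over 100 AMPs versus the same table on 2 AMPs.
idle_on_full_map = empty_amps(range(10), num_amps=100)
idle_on_small_map = empty_amps(range(10), num_amps=2)
```

Ten rows can occupy at most ten AMPs, so at least 90 of the 100 AMPs in the first case take part in an all-AMP read while holding nothing.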
I came across an interesting behavior related to Teradata workload management. This behavior is reflected in both TASM and TIWM. It has to do with using throttles to delay all queries that include access to a certain table (or set of tables) at times when those tables are being dropped and recreated.
Statistical information is vital for the optimizer when it builds query plans. But collecting statistics can involve time and resources. By understanding and combining several different statistics gathering techniques, users of Teradata can find the correct balance between good query plans and the time required to ensure adequate statistical information is always available.
Teradata Database workload management offers a new feature, starting in Teradata Database 15.10, for managing the throttle delay queue. It’s called “Prioritized Delay Queue”. This posting looks at how the Prioritized Delay Queue feature works and some things to keep in mind if you begin to use it. The intent of this enhancement is to ensure that higher priority work can be released from the delay queue ahead of lower priority work.
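The general idea of releasing higher priority work first can be sketched with a priority heap in Python. This is a minimal illustration under stated assumptions, not the feature's actual ordering rules: the `PrioritizedDelayQueue` class is hypothetical, a lower number is assumed to mean higher priority, and ties are assumed to be broken by arrival order.

```python
import heapq
import itertools


class PrioritizedDelayQueue:
    """Toy delay queue that releases by priority, then by arrival order."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker: earlier arrivals first

    def delay(self, request_id, priority):
        """Queue a request; a lower priority number is treated as more important."""
        heapq.heappush(self._heap, (priority, next(self._arrival), request_id))

    def release(self):
        """Release the most important waiting request, or None if the queue is empty."""
        if not self._heap:
            return None
        _, _, request_id = heapq.heappop(self._heap)
        return request_id


queue = PrioritizedDelayQueue()
queue.delay("batch_load", priority=3)
queue.delay("tactical_lookup", priority=1)
queue.delay("monthly_report", priority=3)
release_order = [queue.release() for _ in range(3)]
```

In this model the tactical request leaves the queue first even though it arrived second, and the two equal-priority requests leave in arrival order.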
The ResUsageSAWT table logs detailed information about AMP worker tasks (AWTs). Because AWTs are a finite resource, most sites keep an eye on how close they are to running out.
ResUsageSAWT reports in-use and max AWT counts for each of the 16 message work types on each AMP at the end of each logging interval. It also includes a column, InuseMax, that reflects the maximum number of AWTs in combination that were in use at any one time during the log period. This posting is intended to clear up any confusion over what InuseMax offers, when it’s useful, and when other AWT metrics are more suitable.
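The difference between per-work-type maximums and a combined maximum can be shown with a short Python sketch. The sample data, the work type labels, and the `summarize` function are hypothetical, and the sketch assumes in-use counts are observed at discrete points within one logging interval; it does not describe how the database actually collects these values.

```python
def summarize(samples):
    """Return (per-work-type maximums, maximum of the combined in-use count)."""
    work_types = sorted({wt for sample in samples for wt in sample})
    per_type_max = {
        wt: max(sample.get(wt, 0) for sample in samples) for wt in work_types
    }
    # The combined maximum is taken over moments in time, not over work types.
    combined_max = max(sum(sample.values()) for sample in samples)
    return per_type_max, combined_max


# Hypothetical in-use AWT counts at three moments within one logging interval.
samples = [
    {"new_work": 10, "spawned_work": 2},
    {"new_work": 4, "spawned_work": 9},
    {"new_work": 6, "spawned_work": 5},
]
per_type_max, combined_max = summarize(samples)
sum_of_maxima = sum(per_type_max.values())
```

Here the per-type maximums add up to 19, but no more than 13 AWTs were ever in use at the same moment, because the two work types peaked at different times. Adding up per-type maximums can therefore overstate the peak that a combined value such as InuseMax is meant to capture.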