The MultiLoad utility, whether run as the legacy stand-alone utility or as the Teradata Parallel Transporter Update Operator, is designed to transfer large amounts of data from a client to the Teradata Database. To maximize throughput, MultiLoad allows the administrator to load data through more than one session.
The question this blog posting asks and then answers is this: If a MultiLoad job uses a low number of sessions, will that reduce the number of AMP worker tasks (AWTs) the job will require? In other words, is there a correlation between number of sessions started up by the load utility job and the number of AWTs it requires?
How Many AWTs
During the Acquisition phase, any MultiLoad job, regardless of how many sessions it uses, will require a sender AWT and a receiver AWT on each AMP. The sender AWT is a work00 work type; it accepts the rows from the client and redistributes them to the correct AMP based on PI value. The receiver AWT is a work01 work type; it receives rows that have been sent from other AMPs, consolidates them, and writes them to disk.
You can assume each MultiLoad job will require two AWTs per AMP during the Acquisition phase, and one AWT per AMP during the Apply phase.
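The arithmetic behind that rule of thumb can be sketched in a few lines of Python. This is only an illustration of the counts stated above; the function name `awts_for_job` and its parameters are hypothetical and not part of any Teradata tool. The session count is accepted as an argument purely to show that it never enters the calculation.

```python
# AWTs held per AMP in each phase, as stated in the text above.
AWTS_PER_AMP = {"acquisition": 2, "apply": 1}


def awts_for_job(num_amps, num_sessions, phase):
    """Total AWTs one MultiLoad job holds across the system in a phase.

    num_sessions is validated but otherwise unused: the AWT count
    depends only on the number of AMPs and the phase.
    """
    if num_amps <= 0 or num_sessions <= 0:
        raise ValueError("num_amps and num_sessions must be positive")
    if phase not in AWTS_PER_AMP:
        raise ValueError("phase must be 'acquisition' or 'apply'")
    return AWTS_PER_AMP[phase] * num_amps


# A hypothetical 100-AMP system: 4 sessions or 100 sessions, same AWT demand.
few = awts_for_job(100, 4, "acquisition")
many = awts_for_job(100, 100, "acquisition")
print(few, many, awts_for_job(100, 4, "apply"))
```

Whatever session count is passed in, the Acquisition-phase total stays at two AWTs per AMP, and the Apply-phase total at one per AMP.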
Sessions, Active and Inactive
If the number of sessions is equal to the number of AMPs in the configuration, then each AMP will be associated with one of the sessions and will play an active role in accepting data from the client. If the number of sessions is less than the number of AMPs in the configuration, some AMPs will not be actively involved with a client session. Even in those cases, each AMP will still have a sender AWT set up.
Some of the MultiLoad sessions may show up as “inactive” through Viewpoint Query Session at any point in time. “Inactive” in this context means that the session/AMP is waiting for data from the client. “Active” means the AMP is processing data sent down from the client to that AMP. A particular session may switch between active and inactive many times during a load job, especially if a high number of sessions was specified. A client machine (and/or the network) may not be powerful enough to keep all sessions/AMPs busy at the same time. Because the client sends data round-robin across all sessions, a different set of sessions/AMPs is likely to be active at any given point.
If only a few sessions are being used by the MultiLoad job, it is more likely that all will be active most of the time. Whether sessions are active or inactive or a combination, all AMPs will hold onto their sender and receiver AWTs for the duration of the Acquisition phase.
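To make the active/inactive pattern concrete, here is a toy model in Python. It is a simplification for illustration, not a description of how the MultiLoad client is implemented: it assumes the client can keep only a fixed number of buffers in flight at once (the hypothetical `in_flight` parameter) and hands them to sessions in strict round-robin order.

```python
def active_sessions(step, in_flight, num_sessions):
    """Sessions that have a buffer in flight at a given step.

    Toy assumption: at each step the client has `in_flight` buffers
    outstanding, assigned to consecutive sessions in round-robin order.
    """
    if in_flight <= 0 or num_sessions <= 0:
        raise ValueError("in_flight and num_sessions must be positive")
    busy = min(in_flight, num_sessions)
    return {(step + i) % num_sessions for i in range(busy)}


# Many sessions, weak client: the active set keeps moving.
for step in range(3):
    print("8 sessions, step", step, sorted(active_sessions(step, 2, 8)))

# Few sessions, same client: every session is active at every step.
for step in range(3):
    print("2 sessions, step", step, sorted(active_sessions(step, 2, 2)))
```

With eight sessions and a client that can only feed two at a time, most sessions are inactive at any moment and the active pair shifts from step to step. With two sessions, both are always busy. In neither case does the model say anything about AWTs, which are held on every AMP throughout the Acquisition phase regardless.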
What Actually Happens During the Acquisition Phase
When a MultiLoad job is in the Acquisition phase, these things take place:
MultiLoad and MAPS
When the table being updated by a MultiLoad job resides in a map that covers only a subset of the AMPs in the configuration, only AMPs in that map will be used for the load activity. For example, if you are loading into a table located in TD_Map1, and TD_Map1 covers only half the AMPs in the current configuration, then only the AMPs in TD_Map1 will be used to determine the default number of sessions and to do the work that supports the load job. Under those conditions, only the AMPs in TD_Map1 will require a sender and a receiver AMP worker task to be active during the Acquisition phase.
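The effect of a map on the AWT footprint can be sketched the same way. The Python below is a hypothetical illustration: the AMP ids, the eight-AMP configuration, and the function name `acquisition_awts_by_amp` are assumptions made for the example, with TD_Map1 modeled simply as the set of AMP ids it covers.

```python
def acquisition_awts_by_amp(all_amps, map_amps):
    """Map each AMP id to the AWTs it holds for one job during Acquisition.

    AMPs inside the table's map hold a sender and a receiver AWT (2);
    AMPs outside the map hold none for this job (0).
    """
    all_amps = list(all_amps)
    map_amps = set(map_amps)
    unknown = map_amps - set(all_amps)
    if unknown:
        raise ValueError(f"map contains AMPs not in the configuration: {sorted(unknown)}")
    return {amp: (2 if amp in map_amps else 0) for amp in all_amps}


# Hypothetical 8-AMP system where TD_Map1 covers the first four AMPs.
td_map1 = range(4)
footprint = acquisition_awts_by_amp(range(8), td_map1)
print(footprint)
print("total AWTs:", sum(footprint.values()))
```

In this example the job holds eight AWTs in total during Acquisition, all of them on the four AMPs inside the map, rather than sixteen spread across the whole configuration.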