Number of MultiLoad Sessions vs. Number of AMP Worker Tasks

Teradata Employee

The MultiLoad utility, whether run as the legacy stand-alone utility or as the Teradata Parallel Transporter Update Operator, is designed to transfer large amounts of data from a client to the Teradata Database. To maximize throughput, MultiLoad allows the administrator to load data through more than one session.

 

The question this blog posting asks and then answers is this: If a MultiLoad job uses a low number of sessions, will that reduce the number of AMP worker tasks (AWTs) the job requires? In other words, is there a correlation between the number of sessions started by the load utility job and the number of AWTs it requires?

 

How Many AWTs

 

During the Acquisition phase, any MultiLoad job, regardless of how many sessions it uses, will require a sender AWT and a receiver AWT on each AMP. The sender AWT is a work00 work type; it accepts the rows from the client and redistributes them to the correct AMP based on PI value. The receiver AWT is a work01 work type; it receives rows that have been sent from other AMPs, consolidates them, and writes them to disk.

 

You can assume each MultiLoad job will require two AWTs per AMP during the Acquisition phase, and one AWT per AMP during the Apply phase.
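
As a rough illustration of that rule of thumb, the sketch below adds up the AWTs in use on each AMP for a mix of concurrent MultiLoad jobs. The function name and the simple additive model are assumptions made for illustration, not a Teradata API. Note that the session count does not appear anywhere in the calculation.

```python
# Rule of thumb from the text: AWTs held on each AMP by one MultiLoad job.
AWTS_PER_AMP = {"acquisition": 2, "apply": 1}


def awts_per_amp(jobs_by_phase: dict[str, int]) -> int:
    """AWTs in use on each AMP for a mix of concurrent MultiLoad jobs.

    Hypothetical helper: assumes each job's AWTs simply add up.
    The number of sessions per job is deliberately not an input.
    """
    return sum(AWTS_PER_AMP[phase] * count
               for phase, count in jobs_by_phase.items())


# Three jobs in Acquisition and two in Apply: 3*2 + 2*1 AWTs per AMP.
print(awts_per_amp({"acquisition": 3, "apply": 2}))
```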

 

Sessions, Active and Inactive

 

If the number of sessions is equal to the number of AMPs in the configuration, then each AMP will be associated with one of the sessions and will play an active role in accepting data from the client. If the number of sessions is less than the number of AMPs in the configuration, some AMPs will not be actively involved with a client session. Even in those cases, each AMP will still have a sender AWT set up.

 

Some of the MultiLoad sessions may show up as “inactive” through Viewpoint Query Session at any point in time. “Inactive” in this context means that the session/AMP is waiting for data from the client. “Active” means the AMP is actively processing data sent down from the client to that AMP.  A particular session may be changing between active and inactive many times during a load job, especially if a high number of sessions was specified.  A client machine (and/or the network) may not be powerful enough to keep all sessions/AMPs busy at the same time.  Because the client sends data round-robin across all sessions, a different set of sessions/AMPs is likely to be active at a given point.

 

If only a few sessions are being used by the MultiLoad job, it is more likely that all will be active most of the time. Whether sessions are active or inactive or a combination, all AMPs will hold onto their sender and receiver AWTs for the duration of the Acquisition phase.
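
The toy simulation below may help make this concrete. It assumes, purely for illustration, that session i is attached to AMP i; how sessions are really assigned to AMPs is not described here. The client sends messages round-robin across its sessions, so only the AMPs with a session see client messages, yet every AMP is recorded as holding its two Acquisition-phase AWTs.

```python
from itertools import cycle, islice


def simulate_round_robin(num_amps: int, num_sessions: int, num_messages: int):
    """Toy model of a client sending messages round-robin across sessions.

    Assumption (illustrative only): session i is attached to AMP i.
    """
    session_amps = list(range(min(num_sessions, num_amps)))
    messages_per_amp = {amp: 0 for amp in range(num_amps)}

    # The client cycles through its sessions, one message at a time.
    for amp in islice(cycle(session_amps), num_messages):
        messages_per_amp[amp] += 1

    # Per the text, every AMP holds a sender and a receiver AWT
    # for the whole Acquisition phase, with or without a session.
    awts_held = {amp: 2 for amp in range(num_amps)}
    return messages_per_amp, awts_held


messages, awts = simulate_round_robin(num_amps=8, num_sessions=2, num_messages=10)
print(messages)  # only AMPs 0 and 1 receive client messages
print(awts)      # all 8 AMPs still hold 2 AWTs each
```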

 

What Actually Happens During the Acquisition Phase

 

When a MultiLoad job is in the Acquisition phase, these things take place:

  • All sender AWTs are initially set to the inactive state
  • The client sends multiple rows in a message to a sender AWT
  • The sender AWT that receives rows from the client performs these steps:
    • Sets its state to active
    • Unpacks the rows in the message and converts them to internal format
    • Sends converted rows to the correct AMP’s receiver AWT, based on the hash code of the row
    • Sets its state to inactive
    • Sends a response to the client and waits for the next message
  • Meanwhile, receiver AWTs process received rows independently
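
The steps above can be sketched as a small Python model. Everything in it is hypothetical: the class and function names, the "PI value|payload" message format, and the use of CRC32 as a stand-in for the row hash, which is not Teradata's hashing algorithm. Receiver AWTs are reduced to plain lists that collect the rows routed to each AMP.

```python
import zlib


def target_amp(pi_value: str, num_amps: int) -> int:
    """Pick the owning AMP for a row. CRC32 is only a stand-in hash."""
    return zlib.crc32(pi_value.encode("utf-8")) % num_amps


class SenderAWT:
    """Toy sender AWT: accepts a client message and redistributes its rows."""

    def __init__(self, amp_id: int, receivers: list[list[tuple[str, str]]]):
        self.amp_id = amp_id
        self.receivers = receivers   # one row buffer per AMP's receiver AWT
        self.state = "inactive"      # all senders start out inactive

    def handle_message(self, message: list[str]) -> str:
        self.state = "active"
        for raw in message:
            # "Unpack and convert": split the assumed "pi|payload" format.
            pi_value, payload = raw.split("|", 1)
            amp = target_amp(pi_value, len(self.receivers))
            self.receivers[amp].append((pi_value, payload))
        self.state = "inactive"
        return "ok"                  # response sent back to the client


receivers = [[] for _ in range(4)]
sender = SenderAWT(amp_id=0, receivers=receivers)
response = sender.handle_message(["alice|1", "bob|2", "carol|3"])
print(response, sender.state, [len(rows) for rows in receivers])
```

In this model a row can land on any AMP, including ones that never receive a client message, which is why every AMP needs a receiver AWT regardless of the session count.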

 

MultiLoad and MAPS

 

When the table being updated by a MultiLoad job resides in a map that covers only a subset of the AMPs in the configuration, only AMPs in that map will be used for the load activity. For example, if you are loading into a table located in TD_Map1, and TD_Map1 covers only half the AMPs in the current configuration, then only the AMPs in TD_Map1 will be used to determine the default number of sessions and to do the work that supports the load job. Under those conditions, only the AMPs in TD_Map1 will require a sender and a receiver AMP worker task to be active during the Acquisition phase.
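
A minimal sketch of that last point, using a hypothetical helper and made-up AMP numbers: AMPs inside the table's map hold two AWTs during Acquisition, and AMPs outside it hold none for this job.

```python
def acquisition_awts(all_amps: list[int], map_amps: list[int]) -> dict[int, int]:
    """AWTs one MultiLoad job holds on each AMP during Acquisition.

    Hypothetical helper: AMPs in the table's map hold a sender and a
    receiver AWT (2); AMPs outside the map hold none (0).
    """
    in_map = set(map_amps)
    return {amp: 2 if amp in in_map else 0 for amp in all_amps}


# An 8-AMP system where the table's map covers AMPs 0 through 3.
print(acquisition_awts(list(range(8)), [0, 1, 2, 3]))
```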

3 Comments
Enthusiast

Hi Carrie,

 

I have a doubt here. Although this occurred with FASTLOAD, I would still like to ask, as I am told that session allocation is much the same in FASTLOAD and MULTILOAD. In one of my FASTLOAD scripts, I set sessions to 20, and the number of AMPs in my system was 48. After the run, the FastLoad log showed that only 5 sessions were acquired. As far as I can see, the table has data across all 48 AMPs. Is it possible that a DBS Control parameter or a throttle defined in TASM can override this number of sessions?

 

My expectation was that if I don't declare any '.session' then it should acquire 48 (+2) sessions, and if I declare .session 20, then it should run with 20 sessions. Please enlighten us with your thoughts.

 

Thanks,

Dip

Teradata Employee

Dip,

 

Starting in 13.10, control over the number of sessions used by FastLoad and MultiLoad has been moved inside the database. You can modify the session rules through Workload Designer if you do not like the defaults you are getting. When session rules are active, the session parameter on the load script is ignored.

 

The following blog posting explains how session management works.  

 

https://community.teradata.com/t5/Blog/Utility-Session-Management-It-s-Inside-the-Database-in-Terada...

 

In addition, it will help to clarify this behavior if you read the section on Utility Session Rules in the TASM orange book.

 

Thanks, -Carrie

Enthusiast

Thank you so much Carrie for this clear explanation.