Number of AMP Worker Tasks used by Archive Utility

Teradata Employee

Have you ever wondered how many AMP worker tasks (AWT) were actually being used during an archive or a restore?  You're not alone.  Is it one per session?  Could it be one per AMP?  Here's how it works.

Backup

For the backup part, Arcmain always uses 1 AWT per AMP.  If you have fewer sessions than the number of AMPs, not all AMPs will use their AWTs at the same time.  When one AMP has completed its work, a different AMP will start its part of the dump and use an AWT at that time.  Total AMP worker tasks will be 1 per AMP, but concurrent AMP worker tasks in use at any point in time will equal the number of sessions.
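
To make the arithmetic concrete, here is a minimal sketch in Python. The 1-AWT-per-AMP and 1-concurrent-AWT-per-session rules come from the paragraph above; the specific AMP and session counts are hypothetical.

```python
# Hypothetical example: AWT usage for an Arcmain backup.
# Rule from the text: every AMP uses 1 AWT over the life of the dump,
# but only as many AWTs are active at once as there are sessions.

amps = 180       # number of AMPs on the system (hypothetical)
sessions = 20    # archive sessions allocated to the job (hypothetical)

total_awts_used = amps                 # each AMP dumps its data once
concurrent_awts = min(sessions, amps)  # AMPs working at any one moment

print(f"Total AWTs used over the job: {total_awts_used}")
print(f"Concurrent AWTs at any point: {concurrent_awts}")
```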

Restore

You probably care less about AWT usage on the restore side, because it's a less common operation.  But in case you do a restore, AMP worker tasks are used differently in the loading phase compared to the build phase of the restore operation.

Here's how AWTs are used for the loading phase of the restore (a rough estimation sketch follows the list):

  • For a restore to the same configuration, there will be 1 AWT per session.
  • For a restore to a different configuration, there will be 1 AWT per session + 3 AWTs per AMP (for Teradata 13.0 or later).
  • For a restore to a different configuration, there will be 1 AWT per session (for Teradata 12.0 or earlier).
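
As a rough illustration only (not an official formula), the loading-phase rules above can be written as a small Python helper; the per-session and per-AMP constants and the 13.0 cutoff come from the bullets, while the function itself and the sample inputs are hypothetical.

```python
# Sketch of the loading-phase AWT rules above (hypothetical helper,
# not a Teradata API).

def loading_phase_awts(sessions: int, amps: int,
                       same_config: bool, db_version: float) -> int:
    """Estimate AWTs used during the loading phase of a restore."""
    if same_config:
        return sessions              # 1 AWT per session
    if db_version >= 13.0:
        return sessions + 3 * amps   # 1 per session + 3 per AMP
    return sessions                  # 12.0 or earlier: 1 AWT per session

# Example: 20 sessions, 180 AMPs, different configuration, Teradata 13.0
print(loading_phase_awts(20, 180, same_config=False, db_version=13.0))  # 560
```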

Here's how AWTs are used for the build phase of the restore (again, a sketch follows the list):

  • For a restore to the same configuration, there will be 1-2 AWTs per AMP per table.
  • For a restore to a different configuration, there will be 1-3 AWTs per AMP per table (for Teradata 12.0 or earlier).
  • For a restore to a different configuration, there will be 1-2 AWTs per AMP per table (for Teradata 13.0 or later).
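
And the build-phase ranges, in the same hedged style; the (low, high) pairs are taken straight from the bullets, and the helper name and example inputs are hypothetical.

```python
# Sketch of the build-phase AWT ranges above (hypothetical helper).
# Returns the (low, high) range of AWTs per AMP per table.

def build_phase_awts_per_amp_per_table(same_config: bool, db_version: float):
    if same_config:
        return (1, 2)   # same configuration: 1-2 AWTs
    if db_version >= 13.0:
        return (1, 2)   # different configuration, 13.0 or later: 1-2 AWTs
    return (1, 3)       # different configuration, 12.0 or earlier: 1-3 AWTs

# Example: different configuration on Teradata 12.0
low, high = build_phase_awts_per_amp_per_table(False, 12.0)
print(f"{low}-{high} AWTs per AMP per table")   # prints: 1-3 AWTs per AMP per table
```
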
11 Comments
Enthusiast
Carrie, I have an interesting observation about a particular job we had here to back up a 65 TB non-fallback table, and I am wondering whether you can provide more insight from the AWT perspective. This backup uses 30 parallel streams with 20 sessions on each stream (600 sessions). The performance impact on the system is most visible near the end of the job. The backup runs pretty fast and steady, but during the last hour and a half I believe the job's throughput slows down, while the other workload on the system experiences a much more visible delay/slow execution. Can you help explain why? The system's configuration is a coexisting 5500/5555.
Teradata Employee
Unfortunately, I am not the right person to help you debug archive performance. There are many factors involved when an archive is taking place, such as the distribution of AMPs (yours is a coexistence system), the number of sessions, the network configuration, the type of archive being used, the priority of the archive job, and what else is active on the system at the same time. If there is a surge of activity from the non-archive users, this could have an impact.

I can tell you that the AMP worker tasks used by the archive will be acquired before the job gets under way, and so I don't think they will be a factor in what is happening.

There is a really good orange book called Local AMP Backup, and there is a Teradata Education Network webinar on Best Practices for BAR you could check out. Or you can post your question on the various Teradata forums to see if others with experience in this area can provide you with some ideas.

Thanks, -Carrie
Enthusiast
Carrie, thanks for the information. I know that traditionally the priority scheduler did not regulate the data sessions for arcmain; those data sessions kept getting knocked down by SQL sessions. After TDWM, is this still the case?

I know what our problem is today: the BAR system is only connected to half of the nodes (the older nodes from before the coexistence expansion). I am just guessing that arcmain probably consumes the data local to the PE first, and the data remote from the PE is consumed later. So when the backup job reaches its last portion, almost all of the data comes from nodes that are not directly connected, and therefore the BYNET is swamped. Is this an illusion or not?
Teradata Employee
The database part of Arcmain always has been, and continues to be, under the control of the priority scheduler. There used to be an issue with TASM when it first became available, where Arcmain jobs did not respect TASM classification rules. I think in that case they all ran in the $M priority.

That has since been fixed, I believe in Teradata 12.0, so you should be able to classify arcmain jobs just fine today.

Hello Carrie, can you help me understand how indexes are used in real time? What is their purpose? Please advise.
Teradata Employee
Please see the Teradata Database Design manual, Chapter 10 and Chapter 11, for detailed information on using indexes in the Teradata database.

There is also a class offered by Teradata Education Network that discusses indexes:

http://developer.teradata.com/database/training/teradata-indexes-how-they-work-when-to-use-them

And there is an article by Alison on this topic at:

http://developer.teradata.com/database/articles/indexes-too-much-of-a-good-thing

Thanks, -Carrie
I am going to check Chapter 10 and Chapter 11 for this. Thanks for the advice.

Enthusiast
Hi Carrie,

If the TD database restarts, normally we can check the software_event_log view in DBC. Is there any other source to check it, or can we find out about the restart using the log files residing on the node?
Teradata Employee
Sorry, but I don't have a good answer for that question, as it's out of my area of experience.

As an alternative, try posting your question on the Teradata Forum or on one of the other sites where you can ask Teradata questions of other Teradata users.
Hi Carrie,

I have a question regarding PDCR jobs and their effect on backup speeds. We have a job that archives two DBQL tables from the PDCRDATA database, with the current PDCR installation on 13.02. The problem we are facing is that while archiving all other tables/databases we get a speed above 150 MB/s, but this job runs at about half that speed, around 80 MB/s. An argument has been made that a PDCR script that runs every 10 minutes causes the slowness, but that script doesn't deal with these two specific tables, so there isn't any blocking involved either... Any suggestions for finding out what the problem could be?
Teradata Employee

I am not familiar with the PDCRDATA database, so unfortunately, I can't offer any specific suggestions. Backup performance is generally influenced by the size of the data being archived and the medium to which the archive is being written. You could also look at things like network capabilities.

Thanks, -Carrie