Putting Your Penalty Box on the SLG Tier

Teradata Employee

If you’ve got TASM for your workload management, and are demoting your very resource-intensive (potentially disruptive) queries into a penalty box workload in Timeshare Low, and that’s not working out as well as you’d like, here’s another approach to think about.  

 

Move the penalty box workload to Tier 1 of the SLG Tier, give it a very low allocation percent, and run it there all by itself.

 

Why Timeshare May Not Be Enough

 

Timeshare workloads are at the bottom of the hierarchy, and they divide up all the CPU that goes unused higher up and flows down to the bottom level. It is unpredictable how much CPU requests running in any of the four Timeshare access levels will receive.  In combination they should receive, at a minimum, whatever the global weight is for the Remaining control group on the lowest SLG Tier above Timeshare. But if workloads in the SLG Tier are not consuming their entire allocation, Timeshare workloads could get a lot more.

 

You can look at the global weight of your Timeshare workloads using the System Workload Report. But that only tells you the percent of CPU each defined workload will receive if all workloads are fully active, and all workloads are using their full allocation of CPU.  That is not always the case. Most SLG Tier workloads are given allocations sized for peak processing, so outside of peak periods they leave part of that allocation unused, and Timeshare workloads often receive more than their global weight specifies.

 

One characteristic of Timeshare reduces its attractiveness for use as a penalty box: higher concurrency in any Timeshare access level means a higher percent of Timeshare resources will go to that access level.  So if your penalty box in Timeshare Low has 20 queries and there is only one query each in Timeshare Top, High, and Medium, Timeshare Low will get the majority of Timeshare resources.  Concurrency boosts CPU share in Timeshare.  (Read the Priority Scheduler orange book for more detail on this point.)
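
To make the concurrency effect concrete, here is a small Python sketch of the 20-versus-1 example above. It assumes each active request in Top, High, Medium, and Low is weighted 8:4:2:1; those ratios are not stated in this post, so treat them (and the function name `timeshare_shares`) as illustrative assumptions and check the orange book for your release.

```python
# Assumed per-request weighting of the four Timeshare access levels.
# These ratios are an assumption for illustration; they are not given in this post.
ACCESS_LEVEL_RATIO = {"Top": 8, "High": 4, "Medium": 2, "Low": 1}


def timeshare_shares(active_requests):
    """Approximate each access level's fraction of whatever CPU reaches Timeshare.

    active_requests maps an access level name to its number of active requests.
    Each level's weight is its per-request ratio times its concurrency.
    """
    weights = {level: ACCESS_LEVEL_RATIO[level] * count
               for level, count in active_requests.items()}
    total = sum(weights.values())
    if total == 0:
        return {level: 0.0 for level in weights}
    return {level: weight / total for level, weight in weights.items()}


# The example from the text: 20 penalty box queries in Low, one query in each other level.
shares = timeshare_shares({"Top": 1, "High": 1, "Medium": 1, "Low": 20})
for level, share in shares.items():
    print(f"{level:<6} {share:.1%}")
```

Under those assumed ratios, Low ends up with 20 of the 34 weight units, a bit under 59% of Timeshare CPU, even though each individual Low request is weighted lowest.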

 

Why the SLG Tier?

 

The SLG Tiers were intended to support higher priority workloads, the kind of work that requires a predictable level of resources and is not considered background work. You can define five SLG Tiers, but many sites use only one or two, with several workloads on each.  

 

Each SLG Tier workload is given an allocation percent. The allocation percent can be as high as 90% (not a good idea) or as low as 0.1% (for things you REALLY want to slow down).

 

There are several characteristics of an SLG Tier that you need to understand and that are relevant to the situation we are discussing:

 

  1. Workloads on SLG Tier 1 will generally receive CPU that reflects their allocation percent, while the allocation percentages on the lower SLG Tiers will be diluted by the percent given to the Remaining control group of the tier above. The System Workload Report translates the SLG Tier workload percentages into a global weight number that represents the actual percent of resources each workload will be entitled to.
  2. Contrary to Timeshare, concurrency does not increase the allocated CPU. The more queries in an SLG Tier workload, the less resource each gets.
  3. If any one workload is not using its allocation, or is inactive, other workloads on the same SLG Tier can use that unused resource before the resource flows to the next lower tier.
  4. CPU that flows into Timeshare but cannot be used will flow up and be made available to the lowest SLG Tier level first. Only if workloads on the lowest SLG Tier cannot use that resource, will it flow to the next level up, and so forth.  
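
The first two points can be sketched in a few lines of Python. This is a simplified model, not the scheduler's actual algorithm: it assumes every workload is fully active, ignores any tier above the SLG Tiers, and uses made-up workload names and allocations. The functions `global_weights` and `per_query_share` are hypothetical helpers written for this illustration.

```python
def global_weights(tiers):
    """Estimate global weights from per-tier allocation percents.

    tiers is a list of dicts, SLG Tier 1 first, each mapping a workload
    name to its allocation percent on that tier. Returns a tuple of
    (global weight per workload, percent left to flow into Timeshare).
    """
    flowing = 100.0  # percent of the system reaching the current tier
    weights = {}
    for tier in tiers:
        allocated = sum(tier.values())
        if allocated > 100.0:
            raise ValueError("allocations on one tier cannot exceed 100%")
        for name, pct in tier.items():
            # A workload's share of what reaches its tier is its global weight.
            weights[name] = flowing * pct / 100.0
        # The Remaining control group passes the rest to the tier below.
        flowing *= (100.0 - allocated) / 100.0
    return weights, flowing


def per_query_share(global_weight, concurrency):
    """Point 2: the workload's weight is split across its active queries."""
    return global_weight / concurrency


# Hypothetical setup: two workloads on Tier 1, one on Tier 2.
weights, timeshare_floor = global_weights([
    {"WL-A": 40.0, "WL-B": 20.0},   # Tier 1: Remaining is 40%
    {"WL-C": 50.0},                 # Tier 2: 50% of the 40% that flowed down
])
print(weights, timeshare_floor)
print(per_query_share(weights["WL-C"], 10))
```

In this made-up setup WL-C is allocated 50% on Tier 2 but has a global weight of only 20%, because just 40% of the system reaches Tier 2. With 10 concurrent queries, each one is working from roughly a 2% share.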

 

How Would a Penalty Box on the SLG Tier Work?

 

Considering the characteristics we just looked at, here’s a general way you could set up a penalty box on the SLG Tier. Be warned that this is only a general approach, and it may not be a good match for your particular setup and flow of work.  It’s always recommended that you consult with your Teradata Professional Services support personnel before making any big changes to your priority scheduler setup.  And definitely try it out on your test system first.

 

Move your penalty box workload to SLG Tier 1 and give it a low allocation; 2% is a good starting point. The Remaining control group on SLG Tier 1 will be set with an allocation of 98%, indicating how much resource will flow to the next tier below.

 

If you have only been using SLG Tier 1, move all the workloads previously on SLG Tier 1 to SLG Tier 2. SLG Tier 2 will now be receiving 98% of the system resources, rather than 100%, a slight difference. If there are workloads in other lower SLG tiers, they should be moved down one SLG Tier level as well.
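
As a rough check on how small that difference is, the arithmetic can be worked through in Python. The workload names and their 40% and 25% allocations are hypothetical, and the model assumes every workload is fully active; only the 2% penalty box allocation comes from the text above.

```python
def weights_after_move(former_tier1, penalty_box_pct):
    """Global weights once the former Tier 1 workloads sit on Tier 2.

    former_tier1 maps workload name to its allocation percent, which is
    unchanged by the move. Only the share of the system reaching the
    tier changes: 100% before, (100 - penalty_box_pct)% after.
    """
    reaching_tier2 = 100.0 - penalty_box_pct
    return {name: pct * reaching_tier2 / 100.0
            for name, pct in former_tier1.items()}


# Hypothetical workloads that used to be on SLG Tier 1.
former_tier1 = {"Interactive": 40.0, "Web": 25.0}

after = weights_after_move(former_tier1, penalty_box_pct=2.0)
for name, before_pct in former_tier1.items():
    print(f"{name}: {before_pct:.1f}% before, {after[name]:.1f}% after")
```

With these numbers a workload allocated 40% drops to a global weight of 39.2%, and one allocated 25% drops to 24.5%. The penalty box itself is entitled to 2% when everything is busy.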

 

When Timeshare has unused resources, they will flow to the lowest SLG Tier first, which will be SLG Tier 2 or whatever tier is carrying your other non-penalty box workloads. That gives those workloads a boost in allocation when Timeshare is less busy, while making it more difficult for the penalty box workload on SLG Tier 1 to get the same benefit.

 

Conclusion

 

Hard Limits were often needed in SLES 10 priority scheduler to strongly contain workloads such as penalty boxes.  I do not recommend that you use SLES 11 hard limits with low percentages, as would be needed in a situation such as this.  In fact you may not actually need hard limits.  The SLES 11 priority scheduler is much more effective and granular in controlling the allocation of CPU, which you can see especially when the system is busy. 

 

One last point: The Penalty-Box-on-the-SLG-Tier approach is not going to prevent increased resource being made available to penalty box queries at times when the system is being lightly used.  If there are spare resources, SLES 11 priority scheduler will attempt to ensure they are used.  This is a good thing—why throw that resource away?  What it is expected to prevent is a situation where penalty box queries take resources away from other higher priority workloads, even though those higher priority workloads are entitled to that resource based on setup definitions and are ready and able to consume it.  

4 Comments
Enthusiast

Wonderful! I am having the exact same problem. My low tier Timeshare workloads are sometimes taking too much CPU and concurrency, and this solution might work out for me. Thanks Carrie. -Suhail

KN
Enthusiast

Thanks Carrie for the detailed info. In fact I moved my Penalty WL to the SLG tier about a year back. Within SLG we have tier 1 and tier 2.

I need to check whether the Penalty WL is in SLG tier 2 or 1. If I find it in 2, then I would rather move it to SLG tier 1.

 

Thanks

KN

Teradata Employee

KN,

 

Thanks for letting me know about your experiences.  I'd be interested in what the global weight (and the allocation percent) is for your penalty box workload when you look at it in the System Workload Report, if you can share that information.

 

If you end up moving it to SLG Tier 1 from SLG Tier 2, just remember to move all workloads that were in Tier 1 down to Tier 2, so the penalty box workload is all by itself in Tier 1.

 

Regards, -Carrie

KN
Enthusiast

Hi Carrie,

 

Basically, our design has the interactive and web-based workloads on SLG Tier 1, whereas SLG Tier 2 holds only the Penalty Box workload, which is set to 0.1% of the global weight.

 

Thanks

KN