If you use TASM for workload management and demote your very resource-intensive (potentially disruptive) queries into a penalty box workload in Timeshare Low, and that's not working out as well as you'd like, here's another approach to think about.
Move the penalty box workload to Tier 1 of the SLG Tier, give it a very low allocation percent, and run it there all by itself.
Why Timeshare May Not Be Enough
Timeshare workloads are at the bottom of the hierarchy, and they divide up whatever CPU goes unused higher up and flows down to the bottom level. It is unpredictable how much CPU the requests running in any of the four Timeshare access levels will receive. In combination they should receive at a minimum whatever the global weight is for the Remaining control group on the lowest SLG Tier above Timeshare. But if workloads in the SLG Tier are not consuming their entire allocation, Timeshare workloads could get a lot more.
You can look at the global weight of your Timeshare workloads using the System Workload Report. But that only tells you the percent of CPU each defined workload will receive if all workloads are fully active and all of them are using their full allocation of CPU. That is not always the case. Most SLG Tier workloads have allocations sized for peak processing, so outside of peak periods part of that allocation goes unused. As a result, Timeshare workloads often receive more than their global weight specifies.
One characteristic of Timeshare reduces its attractiveness for use as a penalty box: higher concurrency in any Timeshare access level means a higher percent of Timeshare resources will go to that access level. So if your penalty box in Timeshare Low has 20 queries and there is only one query each in Timeshare Top, High, and Medium, Timeshare Low will get the majority of Timeshare resources. Concurrency boosts CPU share in Timeshare. (Read the Priority Scheduler orange book for more detail on this point.)
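To put rough numbers on the 20-query example, here is a small Python sketch. It assumes each active request is weighted by its access level in an 8:4:2:1 ratio (Top:High:Medium:Low) and that an access level's share of Timeshare CPU is proportional to that weight times its number of active requests. The function name and the simple proportional model are my own illustration, not something taken from the scheduler itself, so treat the output as an approximation.

```python
# Assumed per-request weights for the four Timeshare access levels (8:4:2:1).
ACCESS_LEVEL_WEIGHT = {"Top": 8, "High": 4, "Medium": 2, "Low": 1}


def timeshare_shares(active_requests):
    """Estimate each access level's fraction of Timeshare CPU.

    active_requests maps an access level name to the number of requests
    concurrently active in that level. This is a simplified model: share is
    proportional to (per-request weight) x (number of active requests).
    """
    weighted = {
        level: ACCESS_LEVEL_WEIGHT[level] * count
        for level, count in active_requests.items()
    }
    total = sum(weighted.values())
    if total == 0:
        return {level: 0.0 for level in active_requests}
    return {level: weight / total for level, weight in weighted.items()}


# The scenario from the text: 20 penalty box queries in Low, one query elsewhere.
shares = timeshare_shares({"Top": 1, "High": 1, "Medium": 1, "Low": 20})
for level, share in shares.items():
    print(f"{level}: {share:.1%}")
```

Under these assumptions the 20 queries in Low end up with roughly 59% of Timeshare CPU, even though each individual Low query is weighted at one-eighth of the Top query.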
Why the SLG Tier?
The SLG Tiers were intended to support higher priority workloads, the kind of work that requires a predictable level of resources and is not considered background work. You can define five SLG Tiers, but many sites use only one or two, with several workloads on each.
Each SLG Tier workload is given an allocation percent. The allocation percent can be as high as 90% (not a good idea) or as low as 0.1% (for things you REALLY want to slow down).
There are several characteristics of an SLG Tier you need to understand that are relevant to the situation we are discussing:
How Would a Penalty Box on the SLG Tier Work?
Considering the characteristics we just looked at, here’s a very “general” way you could set up a penalty box on the SLG Tier. Be warned that I am sharing a very “general” approach and it may not be a good match for your particular setup and flow of work. It’s always recommended that you consult with your Teradata Professional Services support personnel before making any big changes to your priority scheduler setup. And definitely try it out on your test system first.
Move your penalty box workload to SLG Tier 1 and give it a low allocation; 2% is a good starting point. The Remaining control group on SLG Tier 1 will be set with an allocation of 98%, indicating how much resource will flow to the next tier below.
If you have only been using SLG Tier 1, move all the workloads previously on SLG Tier 1 to SLG Tier 2. SLG Tier 2 will now be receiving 98% of the system resources, rather than 100%, a slight difference. If there are workloads in other lower SLG tiers, they should be moved down one SLG Tier level as well.
When Timeshare has unused resources, they will flow to the lowest SLG Tier first, which will be SLG Tier 2 or whatever tier is carrying your other non-penalty box workloads. That gives those workloads a boost in allocation when Timeshare is less busy, while making it more difficult for the penalty box workload on SLG Tier 1 to get the same benefit.
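The arithmetic behind these steps can be sketched in a few lines of Python. This is a simplified model that assumes every workload is fully active and using its whole allocation, that each allocation percent applies to whatever resource flows into that tier, and that the Remaining percent on a tier is 100 minus the sum of its allocations. The workload names and the Tier 2 percentages below are hypothetical, chosen only for illustration.

```python
def global_weights(tiers):
    """Compute approximate global weights for SLG Tier workloads.

    tiers is a list ordered from SLG Tier 1 downward; each entry maps a
    workload name to its allocation percent on that tier.

    Returns (weights, timeshare_floor), where weights maps each workload to
    its percent of total system resource, and timeshare_floor is the percent
    left over for Timeshare after the lowest SLG Tier.
    """
    flowing_in = 100.0  # percent of system resource entering the current tier
    weights = {}
    for tier in tiers:
        allocated = sum(tier.values())
        if allocated > 100.0:
            raise ValueError("allocations on one tier cannot exceed 100%")
        for workload, percent in tier.items():
            weights[workload] = flowing_in * percent / 100.0
        # The Remaining control group passes the rest down to the next level.
        flowing_in = flowing_in * (100.0 - allocated) / 100.0
    return weights, flowing_in


# Hypothetical setup: the penalty box alone on Tier 1 at 2%, and two
# made-up workloads that were moved down to Tier 2.
tiers = [
    {"PenaltyBox": 2.0},
    {"Tactical": 40.0, "Reporting": 30.0},
]
weights, timeshare_floor = global_weights(tiers)
for workload, weight in weights.items():
    print(f"{workload}: {weight:.1f}%")
print(f"Minimum flowing to Timeshare: {timeshare_floor:.1f}%")
```

With these made-up numbers, the Tier 2 workloads give up very little (40% of 98 rather than 40% of 100), while the penalty box is held to about 2% of the system whenever everything above is busy.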
Conclusion
Hard Limits were often needed in SLES 10 priority scheduler to strongly contain workloads such as penalty boxes. I do not recommend that you use SLES 11 hard limits with low percentages, as would be needed in a situation such as this. In fact you may not actually need hard limits. The SLES 11 priority scheduler is much more effective and granular in controlling the allocation of CPU, which you can see especially when the system is busy.
One last point: The Penalty-Box-on-the-SLG-Tier approach is not going to prevent increased resource being made available to penalty box queries at times when the system is being lightly used. If there are spare resources, SLES 11 priority scheduler will attempt to ensure they are used. This is a good thing—why throw that resource away? What it is expected to prevent is a situation where penalty box queries take resources away from other higher priority workloads, even though those higher priority workloads are entitled to that resource based on setup definitions and are ready and able to consume it.