The SLES 11 priority scheduler implements priorities and assigns resources to workloads based on a tree structure. The priority administrator defines workloads in Viewpoint Workload Designer and places the workloads on one of several different available levels in this hierarchy. On some levels the admin assigns an allocation percent to the workloads, on other levels not.
How does the administrator influence who gets what? How do the tier level and the presence of other workloads on the same tier affect what resources are actually allocated? What happens when some workloads are idle and others are not?
This posting gives you a simple explanation of how resources are shared in SLES 11 priority scheduler and what happens when one or more workloads are unable to consume what they have been allocated.
Conceptually, resources flow from the top of the priority hierarchy through to the bottom. Workloads near the top of the hierarchy will be offered all the resources they are entitled to receive first. What they cannot use, or what they are not entitled to, will flow to the next level in the tree. Workloads at the bottom of the hierarchy will receive resources that either cannot be used by workloads above them, or resources that workloads above them are not entitled to.
What does “resources a workload is entitled to” mean?
Tactical is the highest level where workloads can be placed in the priority hierarchy. A workload in tactical is entitled to a lot of resources, practically all of the resources on the node if it is able to consume that much. However, tactical workloads are intended to support very short, very highly-tuned requests, such as single-AMP queries, or few-AMP queries. Tactical is automatically given a very large allocation of resources to boost its priority, so that work running there can enjoy a high level of consistency. Tactical work is expected to use only a small fraction of what it is entitled to.
If recommended design approaches have been followed, the majority of the resources that flow into the tactical level will flow down to the level below. If you are on an Active EDW platform, the next level down will be SLG Tier 1. If you are on an Appliance platform, it will be Timeshare.
SLG Tiers are intended for workloads that have a service level goal: their requests have an expected elapsed time, and that elapsed time is critical to the business. Up to five SLG Tiers may be defined, although one, or maybe two, are likely to be adequate for most sites. Multiple workloads may be placed on each SLG Tier. The figure below shows an example of what SLG Tier 1 might look like.
Looking back at the priority hierarchy figure shown first, note that the tactical tier and each SLG Tier include a workload labeled “Remaining”. That workload is created internally by priority scheduler. It doesn’t have any tasks or use any resources. Its purpose is to connect to and act as a parent to the children in the tier below. The Remaining workload passes unused or unallocated resources from one tier to another.
The administrator assigns an allocation percent to each user-defined workload on an SLG Tier. This allocation represents a percent of resources the workload is entitled to from among the resources that flow into the tier. If 80% of the node resources flow into SLG Tier 1, the Dashboard workload (which has been assigned an allocation of 15%) is entitled to 12% of the node resources (80% of 15% = 12%).
The Remaining workload on an SLG tier is automatically assigned an allocation that is derived by summing all the user-defined workload allocations on that tier and subtracting that sum from 100%. Remaining in the figure above gets an allocation of 70% because 100% - (15% + 10% + 5%) = 70%. Remaining’s allocation of 70% represents the percent of the resources that flow into SLG Tier 1 that the tiers below are entitled to. You will be forced by Workload Designer to always leave some small percent to Remaining on an SLG Tier so work below will never be in danger of starving.
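The arithmetic in the last two paragraphs can be sketched in a few lines of Python. This is only an illustration, not anything priority scheduler exposes: the function names are hypothetical, the 80% inflow is the example figure used above, and the assignment of 10% to WebApp1 and 5% to WebApp2 is inferred from the sum shown for Remaining.

```python
def remaining_allocation(workload_allocations):
    """Percent of the tier's inflow left for the Remaining workload."""
    total = sum(workload_allocations.values())
    # Workload Designer always leaves something for Remaining,
    # so the user-defined allocations must sum to less than 100%.
    if not 0 <= total < 100:
        raise ValueError("user-defined allocations must sum to less than 100%")
    return 100 - total


def node_entitlement(inflow_percent, allocation_percent):
    """Percent of node resources a workload is entitled to."""
    return inflow_percent * allocation_percent / 100


# Assumed SLG Tier 1 layout (WebApp1/WebApp2 percentages are inferred).
tier1 = {"Dashboard": 15, "WebApp1": 10, "WebApp2": 5}
inflow = 80  # percent of node resources flowing into SLG Tier 1 (example value)

remaining = remaining_allocation(tier1)
entitlements = {name: node_entitlement(inflow, pct) for name, pct in tier1.items()}
entitlements["Remaining"] = node_entitlement(inflow, remaining)

for name, pct in entitlements.items():
    print(f"{name}: {pct:.1f}% of node resources")
```

Under these assumptions, Dashboard is entitled to 12% of node resources, and the tiers below are entitled, through Remaining, to 70% of whatever flows into SLG Tier 1.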
An assigned allocation percent could end up providing a larger level of node resources than a workload ever needs. Dashboard may only ever consume 10% of node resources at peak processing times. Or there may be times of day when Dashboard is not active. In either of those cases, unused resources that were allocated to one workload will be shared by the other user-defined workloads on that tier, based on their percentages. This is illustrated in the figure below.
Note that what the Remaining workload is entitled to remains the same. The result of Dashboard being idle is that WebApp1 and WebApp2 receive higher run-time allocations. Only if the two of them are not able to use that spare resource will it go to Remaining and flow down to the tiers below.
Unused resources on a tier are offered to sibling workloads (workloads on the same tier) first. What is offered to each is based on the ratio of their individual workload allocations. WebApp1 gets offered twice as much unused resource originally intended for Dashboard as WebApp2, because WebApp1 has twice as large a defined allocation.
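Sibling sharing can be modeled roughly as follows. This sketch makes simplifying assumptions that are not part of priority scheduler itself: a workload is either fully idle or fully active, the active siblings can consume everything they are offered, and the WebApp1 and WebApp2 allocations of 10% and 5% are inferred from the example. The function name is hypothetical.

```python
def runtime_allocations(allocations, idle):
    """Redistribute the allocations of idle workloads to active siblings.

    The unused percent is offered in proportion to each active sibling's
    defined allocation. Remaining is not part of `allocations`, so what
    it is entitled to is left unchanged.
    """
    active = {name: pct for name, pct in allocations.items() if name not in idle}
    unused = sum(pct for name, pct in allocations.items() if name in idle)
    active_total = sum(active.values())
    if active_total == 0:
        # No active sibling can use the spare resource; it would go to
        # Remaining and flow down to the tiers below.
        return {}
    return {name: pct + unused * pct / active_total for name, pct in active.items()}


tier1 = {"Dashboard": 15, "WebApp1": 10, "WebApp2": 5}
shares = runtime_allocations(tier1, idle={"Dashboard"})
print(shares)
```

With Dashboard idle, its 15% is split 2:1 between WebApp1 and WebApp2, so in this model their run-time allocations become 20% and 10% of the resources flowing into the tier.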
Priority scheduler uses the same approach to sharing unused resources if the tiers below cannot use what flows to them. The backflow that comes to an SLG tier from the tier below will be offered to all active workloads on the tier, proportional to their allocations. However, this situation would only occur if Timeshare workloads were not able to consume the resources that flowed down to them. All resources flow down to the base of the hierarchy first. Only if they cannot be used by the workloads at the base will they be available to other workloads to consume. Just as in SLES 10 priority scheduler, no resource is wasted as long as someone is able to use it.
Timeshare is a single level in the hierarchy that is expected to support the majority of the work running on a Teradata platform. The administrator selects one of four access levels when a workload is assigned to Timeshare: Top, High, Medium and Low. The access level determines the level of resources that will be assigned to work running in that access level's workloads. Each access level comes with an access rate that determines the actual contrast in priority among work running in Timeshare. Top has an access rate of 8, High 4, Medium 2, and Low 1. Access rates cannot be altered.
Priority Scheduler tells the operating system to allocate resources to the different Timeshare requests based on the access rate of the workload they have been classified to. This happens in such a way that any Top query will always receive eight times the resources of any Low query, four times the resources of any Medium query, and two times the resources of any High query.
This contrast in resource allocation is maintained among queries within Timeshare no matter how many are running in each access level. If there are four queries running in Top, each will get 8 times the resource of a single query in Low. If there are 20 queries in Top, each will get 8 times the resource of a single query in Low. In this way, high concurrency in one access level will not dilute the priority differences among queries active in different access levels at the same time.
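One way to picture this is as a weighted share per query, where the weight is the access rate. The sketch below is a simplified model, not the scheduler's actual mechanism: it assumes every active query can consume whatever it is offered, and the function name is hypothetical. Only the access rates come from the text above.

```python
from fractions import Fraction

ACCESS_RATES = {"Top": 8, "High": 4, "Medium": 2, "Low": 1}


def per_query_share(active_queries):
    """Fraction of Timeshare resources offered to EACH query, by access level.

    `active_queries` maps an access level to the number of queries
    currently running in it.
    """
    total_weight = sum(ACCESS_RATES[level] * count
                       for level, count in active_queries.items())
    if total_weight == 0:
        return {}
    return {level: Fraction(ACCESS_RATES[level], total_weight)
            for level, count in active_queries.items() if count > 0}


few = per_query_share({"Top": 4, "Low": 1})
many = per_query_share({"Top": 20, "Low": 1})

# The Top-to-Low contrast per query is the same in both cases.
print(few["Top"] / few["Low"], many["Top"] / many["Low"])
```

Each query's absolute share shrinks as concurrency grows, but the ratio between a Top query and a Low query stays at 8 whether there are four Top queries or twenty.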
When using SLES 11 priority scheduler, the administrator can influence the level of resources assigned to various workloads by several means. The tier (or level) in the priority hierarchy where a workload is placed will identify its general priority. If a workload is placed on an SLG Tier, the highest SLG Tier will offer a more predictable level of resources than the lowest SLG Tier.
The allocation percent given to SLG Tier workloads will determine the minimum percent those workloads will be offered. How many other workloads are defined on the same SLG Tier and their patterns of activity and inactivity can tell you whether sibling sharing will enable a workload to receive more than its defined allocation.
Workloads placed in the Timeshare level may end up with the least predictable stream of resources, especially on a platform that supports SLG Tiers that use more at some times and less at others. This is by design, because Timeshare work is intended to be less critical and not generally associated with service levels. When there is low activity above Timeshare in the hierarchy, more unused resources will flow into Timeshare workloads. But if all workloads above Timeshare are consuming 100% of their allocations, Timeshare will get less.
However, there is always an expected minimum amount of resources you can count on Timeshare receiving. This can be determined by looking at the allocation percent of the Remaining workload in the tier just above. That Remaining workload is the parent of all activity that runs in Timeshare, so whatever is allocated to that Remaining will be shared across Timeshare requests.
You can route more resources to Timeshare, should you need to do that, by ensuring that the SLG Tier Remaining workloads that are in the parent chain above Timeshare in the tree have adequate allocations associated with them. (To accomplish this you may need to reduce some of the allocation percentages of the user-defined workloads on the various SLG Tiers.) What Timeshare is entitled to, based on the Remaining workloads above it, is honored by the SLES 11 priority scheduler in the same way as the allocation of any other component higher up in the tree is honored. But since Timeshare is able to get all the unused resources that no one else can use, it is likely that Timeshare workloads will receive much more than they are entitled to most of the time.
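As a rough way to estimate that minimum, the Remaining allocations in the parent chain can be multiplied together, following the same "percent of what flows into the tier" arithmetic used for SLG Tier workloads. The sketch below rests on that assumption; the 80% inflow into SLG Tier 1, the second SLG Tier, and its Remaining values are hypothetical numbers chosen for illustration.

```python
def timeshare_floor(inflow_percent, remaining_by_tier):
    """Estimate the percent of node resources Timeshare is entitled to.

    `inflow_percent` is what flows into the first SLG Tier, and
    `remaining_by_tier` lists the Remaining allocation (in percent) of
    each SLG Tier from top to bottom.
    """
    floor = inflow_percent
    for remaining in remaining_by_tier:
        floor = floor * remaining / 100
    return floor


# Two SLG Tiers with Remaining allocations of 70% and 60% (hypothetical).
before = timeshare_floor(80, [70, 60])

# Reducing user-defined allocations on the second tier raises its
# Remaining from 60% to 75%, which raises the Timeshare floor.
after = timeshare_floor(80, [70, 75])

print(f"before: {before:.1f}%  after: {after:.1f}%")
```

This estimates only the entitlement. As noted above, Timeshare will usually receive more than this whenever workloads higher in the tree leave resources unused.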