This article describes how Database Administrators can use Teradata Active System Management (TASM) to manage queries that execute against tables in Teradata Data Labs. This allows analysts to obtain the information they need without negatively impacting production applications.
Although this article focuses on TASM management, the general workload management guidelines can apply to other Teradata platform workload management strategies, not only ones running TASM.
A data lab is a separate dedicated space within a production data warehouse for agile development of new analytic queries that can combine personal, ad hoc, or temporary data with production data. The Teradata Data Lab product provides Viewpoint portlets to assist and automate the operations in doing this "sandboxing" in production and is intended for use by all the users of data labs.
Refer to the Teradata Data Lab release article for more information on the Teradata Data Lab product.
By its nature as an environment that fosters quick, agile development of new queries and new data, Data Labs provides the analyst with the opportunity to gain new insight. While these insights may yield great benefits for the corporation, it is up to the DBA to make sure that these untested queries that may be running against skewed lab tables do not negatively impact production operation of the data warehouse. TASM provides the DBA with the tools to balance the needs of the Data Lab Analyst and the normal production workloads.
Before you can manage a Data Lab, you need to know how Data Labs works under the covers. The following diagram helps explain some of the new concepts in Teradata Data Labs.
As a DBA, you will likely be responsible for creating a few Lab Groups. A Lab Group is a container for one or more labs. For example, you might create a Lab Group named
Finance for use by the finance department. When you create this lab group, you will be prompted for a prefix name. This is the key to creating TASM rules. In the Lab Group Setup screen shot below, you will see that we have created a prefix of
findpt. This prefix will be prepended to the name of every lab that is created in this lab group. For example, if a user creates a lab named
FY2011, then the Data Labs portlet will create a corresponding database named
findptFY2011. This will be the database that contains all of the tables created in this lab.
Since we know the prefix that will be applied to all databases in a lab group, we can create TASM classification criteria that will match all queries that target tables in this database. If you are using Teradata 13.0 or higher this will be easy because classification criteria can contain wildcards. For previous releases, you will have to create a classification criteria for each Data Lab.
Modification of your TASM ruleset should not be taken lightly. TASM controls how resources are applied to all your workloads. If you are not comfortable with using TASM, you may want to consider contacting Teradata Professional Services. They have people who are experts in Data Labs and TASM that can provide analysis, advice, and training specific to your needs.
In order to know where TASM rules are needed, we need to collect data about how the Data Labs are being used. We can do that by enabling logging of all queries to the Database Query Log (DBQL). The Performance Management Guide describes how to enable and utilize the Query Log. After query logging has been enabled for a week or two, you can use the Viewpoint Query Spotlight portlet to determine if there are specific users or tables that might need special attention. In the meantime, let’s take a look at how to create a workload restricts the impact of queries running in our Finance Lab Group.
When working with TASM, we always want to be able to revert our changes if they cause unexpected resource shifts. So we will start by cloning the active ruleset. If the current active ruleset is
Production.v6, then we would choose
Clone from the drop down menu associated with
Production.v6. In the new cloned ruleset, change the ruleset name to
In order to prevent Data Labs queries from impacting your production workloads, it may be necessary to have them run at a lower priority. To do this we need to create a new workload. In Workload Designer, click on your cloned ruleset and navigate to the Workloads page. Then click on the
Create Workload button to create a workload.
Give the workload an appropriate name and description. Then choose an Enforcement Priority. In this example, the
Background priority has been selected so that queries that run in this workload will run at the lowest default priority.
Next, we need to define a classification criteria that will match queries running against the tables in the Finance Lab Group. Click on the
Classification tab, set the criteria type to
Target, and click
Add. The following dialog will be displayed.
We want our workload to match all queries running against tables in Data Labs that are members of the Finance Lab Group, so we create a classification criteria that matches all queries that target a table in a Data Lab database whose name starts with
Congratulations, you have defined a workload for the Finance Lab Group. To put your changes into effect, navigate up to the top page in the Workload Designer portlet and choose
Make Active from the drop down menu associated with your new cloned ruleset. Once the ruleset is activated, all queries that match the classification criteria will run at the priority associated with the Background Enforcement Priority.
Note: if you find that some queries that run against Data Labs in the Finance Lab Group are being classified to other workloads, you may need to adjust the Workload Evaluation Order. To do this, navigate to the
Workloads page, select the
Evaluation Order tab, and move your Lab Group workload up or down as needed.
The next step that you might want to take is to place a limit on the number of concurrent queries that can execute against the tables in a lab group. This will help to even out the load on the system. For example, if one of your users has several very long queries to run and decides to start them all before leaving for lunch, a TASM workload throttle can reduce the impact on other workloads by preventing them from all running at the same time.
Let’s create a throttle for the Finance lab group. Navigate back to the
Workloads page and click on your workload’s name to drill down to the detail page. Click on the
Throttles tab to configure a throttle.
In this example, we have limited the number of concurrent queries that can run in this workload to two. In addition, when the system is in the
Critical state, only one concurrent query will be permitted.
That’s all you need to do to create a throttle. You’ll need to activate your ruleset again before the throttle will take effect.
By combining the simplified sandbox management provided by Teradata Data Lab with the workload management features of TASM and Workload Designer, we safely permit analysts to extract additional value from production data by using new queries and combining production data with data in the data lab.