This talks about the use of the Data Warehouse Query Driver TdBench in relation to:
Where to Get it: TdBench V7.2 runs under Linux and has a version that includes TPC-DS and TPC-H benchmarks. There is also a downloadable Excel Reporting template and a set of 4 videos demonstrating its features at:http://downloads.teradata.com/download/tools/tdbench7-2
Often when people hear the term “Benchmark”, they associate it with a “speeds & feeds” contest between 2 or more vendors to win the business from a company.
TdBench was developed to enable our benchmark analysts to be more productive in simulating legacy environments, from coordinating multiple streams of queries with different SLA’s to injecting data loading activities or other events into the test scenarios. Queries and ETL activities can be coordinated within the test and multiple tests can be put into a single TdBench script with Linux commands between to manipulate files, views and tables for the next test. TdBench:
It automatically tracks interfering workload, TASM rules in effect, Teradata release, and node/amp configuration. When it comes time to pull results for the presentation, the TdBench test metadata helps to select the error-free formal test runs from the dozens (or hundreds) of tests that may have been executed.
Sometimes, a company is building a new application, so instead of converting legacy, a new data model is still evolving and queries are just being constructed. Then the "Benchmark" is executed within the “Proof of Concept”. This may be competitive or non-competitive. As part of the development activity, a query driver such as TdBench is valuable for tracking the changes in performance of various queries and workloads as the data model evolves. Descriptions of the run conditions and observations can be recorded into each test’s metadata. The TdBench linkage to DBQL and PDCR allows investigation of performance changes which sometimes are a result of using the wrong views or tables, or forgetting to recreate an AJI after a PDM redesign. Each query can be labeled with comments in the SQL or query band and using the TdBench analysis views, you can investigate:
An increasing number of customers are asking for Migration Benchmarks to ensure:
For these benchmarks, queries and other processing is selected from one or more critical periods of production. Data is migrated to the target platform and the queries are rerun. Since these benchmarks typically use live or copies of current data, they are a “one time use” benchmark since queries reference dates and keys that may not exist in the future.
TdBench analysis tools can be used to compare overall performance, identify queries that have changed the most based on run time, CPU, I/O, or parsing time, and drill down to look at the query execution steps, and objects reference on each platform to determine the cause of the change. TdBench runs on Linux and can define multiple DBMS platforms. The same test definitions can be run on each platform. The TdBench scripts can even run a set of tests on one server, then use the "SERVER' command to switch platforms run the same tests on the other.
One of the concerns that operations has when software is upgraded or maintained is whether the changes will have adverse impact on critical SLA’s. For example, as Teradata’s optimizer has evolved, each release has delivered higher performance for the overall workload. There are situations where a few queries perform worse. This could be because some statistics were not collected and the prior release’s optimizer fell into a “lucky guess”, or because the optimizer is now spending more time optimizing a short running complex SQL query than it took to run on the prior release. Unfortunately, sometimes there are bugs which affect some creative use of Teradata features that affect a few customers but not others.
Teradata is moving toward quarterly software releases which may make it more critical to quickly and productively assess the impact of moving to a new software release. For this type of benchmark, there may be multiple platforms such as a large production machine, a machine with deep data history, and one or more test and development platforms. Since this will be run multiple times on different sized platforms, a different approach to development is required.
The tests can initially be run on a test and development machine before and after maintenance events to validate the software release with the customer’s specific workload. The tests can be run again on a production machine during a maintenance window before and after restarting the system with the new release to validate that the performance on that platform is suitable. TdBench’s analysis tools allow you to make a quick Go/No Go decision on leaving the new release in production. The tools can then be used identify queries that have changed the most based on run time, CPU, I/O, or parsing time, and drill down to look at the query execution steps, and objects referenced.
A very simple benchmark that many shops run from Viewpoint is the “Canary Query” which should be a light-weight query to validate overall system availability and performance. At one customer site, we found that as part of their set of “Canary Queries”, they were also testing out the performance of their applications to detect changes based on data growth, updates to applications, or changes to the PDM. We found that they were using 14% of their system capacity with these queries running every 15 minutes.
To reduce impact to their users, we recommended splitting out the Application Testing and only run very light weight queries from Viewpoint. The Canary Queries are best executed by Viewpoint and it has great portlets for analyzing changes in performance.
We recommended to the above company to create an “Application Benchmark” to run on weekends with those queries designed to assess the performance impacts of data growth, application query changes and PDM changes.
This is a good candidate for running under TdBench because it assigns a RunID for each execution, and its analysis views are able to parse out query names from comments in the query text or from a query band to trace the performance changes of the test workload or of each query over time. This type of test runs selection queries against live data, but executes data modifying queries against copies of data or volatile tables.
As the Data Warehouse evolves to support a larger and more complex workload and one which is critical to the business, it is important that necessary maintenance and upgrade activities do not impact the user community. Benchmarking is an important tool to assess the impact of each change. But it isn't enough to clock a thousand queries before and after and look at the total minutes. The benchmarking approach needs to assume that you will need to get down to looking at a specific query or even a single step to recognize that an AJI wasn't there or that statistics were needed on a particular column among thousands of columns for the new release.
DBQL, Resusage, and Viewpoint are valuable tools for capturing data about system performance. PDCR and Viewpoint are great tools for looking at system growth and historical performance anomalies. TdBench is a tool that:
TdBench is available for download at:http://downloads.teradata.com/download/tools/tdbench7-2
The benchmark is currently used by the world largest "New Money" company to evaluate Teradata 15.10 upgrade impact. It's a game changer in evaluating upgrade risk, impact & regressions.
Every Teradata site shall consider deploying this great tool!