Query Driver setup and use

Extensibility
Extensibility covers the mechanisms by which you, as the user or developer, can extend the functionality of the Teradata Database, for example with the use of User Defined Functions, or UDFs.
Teradata Employee

Query Driver setup and use

New Release!!

TdBench V7.2 is now available for Linux at http://downloads.teradata.com/download/tools/tdbench7-2

TdBench V5 for MS-DOS Installation

This section describes how to setup and use the Teradata Benchmark Query Driver. The setup activities are:

  1. Configure Teradata with a benchmark control user or database with appropriate rights
  2. Unzip TdBenchV5 files
  3. Customize TdBenchProfile.bat for your benchmark
  4. Run TdBenchSetup.bat to create worker logonids, tables, views and macros
  5. Convert benchmark queries to macros
  6. Use TdBench.bat or TdBench_Window.bat + TdBenchRun.bat to run tests
  7. Using views to report results
  8. Construct mixed workload tests using .lst files
  9. Construct additional tests using .btq files for testing under BTEQ
  10. Construct additional tests using .bat files

1: Configure Teradata

A benchmark control logon ID will need to be created with 1 GB of current perm and an amount of spool and temp that will allow you to execute any of the queries in the benchmark. 4 TB should be sufficient during the benchmark. (If running the demo version of TD under VMware, see Appendix B for considerations). The logon name should be suffixed with the string "_Benchmark" since this is used as a parsing marker in some of the reporting views. Example of a name: Demo_Benchmark. You may have benchmarks for different purposes, so you could create a separate xxx_Benchmark database for each.

Hint: If you aren’t familiar with granting rights, do what you can at proceed to Step 4 and it will create the statements you will need to execute or get a system administrator to execute logged on as DBC or equivalent.

The benchmark control user needs the following rights:

  • Simply grant every right on benchmark control user (a.k.a. benchmark database for the demo version of Teradata) to the benchmark control user with grant option. The specific rights needed:
    • CREATE/DROP USER on itself (to allow it to define workers)
    • EXECUTE PROCEDURE, CREATE PROCEDURE, DROP PROCEDURE on itself
    • SELECT WITH GRANT ACCESS on itself to allow views to be created
  • Grant access to the benchmark control user to others who will be controlling the benchmark. You could easily grant full control to those users so they could create views, macros, and additional benchmark workers later. At a minimum:
    • EXECUTE, EXECUTE PROCEDURE to those that will be controlling the benchmark
    • SELECT to those that will be analyzing the benchmark results.
  • Grant the benchmark control user (and benchmark database for demo version of Teradata) access to system and application resources:
    • SELECT, EXECUTE WITH GRANT ACCESS on DBC (to support reporting views and procedure to toggle dbql)
    • EXECUTE FUNCTION WITH GRANT ACCESS on SYSLIB (to support stored procedures that cancel workers)
    • MONRESOURCE, MONSESSION, ABORTSESSION, SETSESSRATE, SETRESRAT (to manage logging)
    • SELECT WITH GRANT ACCESS to all application databases included in benchmark (to allow macros in the control user to access application database content).

Bottom Line: It is highly recommended that all DBQL tracking is turned on for every user with the possible exception of tests using TPump where summary mode may be appropriate for the user performing that update, and then, only turning on summary after you have gotten an initial check of execution.

The setup in Step 4 will put views in the xxx_Benchmark database for reporting on:

  • the detailed query executions for a RunID (RptTestDetail)
  • a summary of average and maximum executions of queries (RptTestSummary), and
  • an overview of all tests in the benchmark including execution counts and errors (RptBenchDetail).

You should also use CTL to turn on ResUsage logging. There are views defined in the xxx_Benchmark database for:

  • ResUsageSawt (RptSawt) – Data specific to AMP worker tasks. Use this when you want to monitor the utilization of the AWTs and determine if work is backing up because the AWTs are all being used.
  • ResUsageSpma (RptSpma) -  System-wide node information provides a summary of overall system utilization incorporating the essential information from most of the other tables.
  • ResUsageSvpr (RptVproc) - Data specific to each virtual processor and its file system.

Data can be extracted from these tables using the views by specifying the RunID instead of having to provide specific ranges for TheDate and TheTime. It is highly recommended that ResUsageSpma be turned on as a minimum with a collection and logging interval of 30 seconds to provide sufficient granularity of system load detail while individual queries are executing.

Finally, all of the nodes on your Teradata platform should have their clocks synchronized. If not, reporting and analysis will be difficult or nearly impossible. The start/stop boundaries for a test maintained in the TestTracking table will miss sessions that initiated on other nodes and logged a different time in DBQL for queries.  To validate, log onto Linux on one of the nodes and issue:

psh date

That should indicate all times on all nodes are the same. If not, you can either run the script /tmp1/ntp.setup on the control node to cause the nodes to synchronize their time to the SWS or AWS for your system. It can also be synchronized manually. When the system is relatively idle and there are no applications relying on the database clock running, you can issue:

psh date +%T –s “hh:mm:ss”

Note: a change like this may cause an anomaly in the ResUsage reporting and should be noted to your performance analyst. 

2: Unzip TdBenchV5 files in Windows

When you unzip the zip file, there will be a TdBenchV5 directory created containing a setup subdirectory. All of the batch scripts needed to run the query driver are located in the TdBenchV5 directory. Other directories will be created when you run setup. 

3: Customize TdBenchProfile.bat for your benchmark

You will need to edit the TdBenchProfile.Bat file with information about the server, databases, control user, passwords, and other options. The script contains instructions for modifications, however the following should be noted:

  • The windows set command assigns values to variables as long as there are no spaces surrounding the = sign.
    • The following is valid: Set TdBenchPwd=secret
    • The following is not valid: Set TdBenchPwd  =  secret
  • The password for the control user will also be used for all of the worker sessions.
  • The control user (TdBenchUser) would typically be the same as the benchmark database (TdBenchDb) when running on a non-VMware version of Teradata. On VMware, the control user must be dbc and the benchmark database something else, e.g. Demo_Benchmark. (See Appendix B for details of running under VMware)
  • The creation of worker Logon ID's is controlled by TdBenchUserPrefix and by TdBenchUserTypes. See description below on Using Worker Queues.
  • When lots of sessions are logged on simultaneously, there can be an unusual workload created on the system. The Control user can be told to delay the start of the test for an interval to give time for worker sessions to logon. At the end of that interval it will populate the queue table(s) for the test and all workers will start running at the same time. The TdBenchStartDelay can be set to a number of seconds needed for all workers to logon.

Use of Worker Queues

This query driver supports 2 modes of operation:

  • Worker Queue: Multiple worker sessions pull the next SQL command to run from a queue table. This has low overhead (less than .01 second of cpu) and easy, flexible operation allowing the number of worker sessions to be varied without having to change scripts. With queue tables, when you initiate a test, if you specify how long the test should run, a command will be added to each queue table to repopulate them so the workers keep working until the control user aborts their session. However, the delay of waiting for the next command from Teradata and the additional processing load on the client server adds about 1/4 second between query executions meaning that very short queries being counted in a fixed period test will have an impact in the number of query completions within the period. There are a number of options with the queues:
    • One queue table with multiple workers. The order of the queries can be designed so that the initiation of queries mimics a production environment. On one benchmark, an application was written to put queries into the queue at the same relative time they occurred in production.
    • Named queue tables associated with different classes of workers.  For example: you might establish a queue for light, medium and heavy queries, and set a number of workers per queue that represent different assumptions on the number in your production environment
    • One queue table for each session.  This is slightly more laborious to set up, but has been done when trying to convert sets of scripts from other benchmarks.
  • BTEQ Scripts: Multiple scripts are created for multiple bteq sessions. You may need to use this if you have a Sql script consisting of multiple DDL statements such as create table statements for intermediate results in a reporting job. (However, you could convert such a script to a stored procedure).  This mode is recommended for fixed period tests with very short queries (< 1-2 seconds) or in tests with a large number of concurrent sessions. Select this mode if you plan to have more than 30 concurrent sessions per CPU core running queries under 5 seconds.. Unfortunately, with this mode, building the scripts will require you to repeat the sql command 10's, 100's or 1000's of times to keep a concurrency test running for a fixed period of time. However, during execution, the client PC load consists of reading a line and executing it.

The TdBenchSetup will use the settings of TdBenchUserPrefix and TdBenchUserTypes to generate worker sessions and queue tables. This will establish worker LogonIDs that can be used in either mode, and the generated queue tables may be helpful in early testing if you eventually need to move to BTEQ Scripts mode. Example of parameters:

   set TdBenchUserPrefix=Demo
set TdBenchUserTypes=Hvy:2 Med:5 Tac:10

would define Demo_Hvy001, Demo_Hvy002, Demo_Med001 to Demo_Med005, and 10 tactical logon IDs. 

4: Run TdBenchSetup.bat

After you have edited TdBenchProfile.bat and saved it, this can be run either by double clicking it from Windows Exporer or double clicking the TdBench_Window and issuing the command: TdBenchSetup

This command does the following:

  • Validates logon and password for the control user
  • Validates the rights for the benchmark Database, DBC, syslib, and your application databases. If there are missing rights, the necessary statements you need to execute will be displayed and saved to a file.
  • Determines the spool and temp of the benchmark database. All workers get this same allocation.
  • Creates worker logonids with the TdBenchPwd and same spool and temp of the control user
  • Creates subdirectories under TdBenchV5.

The directories under TdBenchV5 are:

  • Queries - put queries here, one per file with a filename appropriate as a macro name
  • Scripts - put .sql, .btq, .bat, .lst files here that define each test.
  • Logs - will contain one directory for each run's output files
  • Setup - Contains item(s) used by TdBenchSetup.bat.

Testing TdBench installation

You can then test out the query driver by running TdBench.bat and answering the questions as follows:

  • It will ask for the number of the file to run, Sample.SQL should be: 1  … and press Enter
  • It will ask for the number of worker sessions, type: 1 … and press Enter
  • Enter all or part of the logonID to use: just press Enter to use the default
  • Enter the tile of this run, say "Initial test run of tdbench" … and press Enter
  • Specify run time in seconds…, type: 0  … and press Enter (for unlimited run)

Several additional DOS windows will open, one for a control session that will open briefly and one for the worker session, the worker will complete the sample.sql and logoff. The output from the test will in a subdirectory under the Logs directory.

You can then run a short fixed period test with a number of workers, using the Sample.SQL selection as above, this time answer "2" for the number of workers and for the time in seconds, enter “10”. You will see 2 worker sessions logon in addition to the control user. The control user will sleep for 10 seconds. In addition to the Sample.SQL statements, periodically you will see a worker session repopulate the queue table. At the end of the test, the control user will force off the workers. The output from the test will be in another subdirectory under the Logs directory.

5: Convert Benchmark Queries to Macros

It is much easier to analyze repeated query executions in DBQL if each query is given a unique identifier. The usage of the queue tables limits SQL to single line SQL statements and has been arbitrarily limited to 4000 bytes. The batch script TdBenchMakeMacros.bat will create a macro out of each file, using the file name as the macro name and will create a starting script in the scripts subdirectory that will execute all macros. Note that each file must contain valid queries (e.g. no create table statements).

To run, double click on TdBench_Window to bring up a command window and issue:

TdBenchMakeMacros queries\*.txt

... if you used .txt for the file suffix, or any other suffix you prefer. The search for files will remove the file extension when the macro name is created. This will also create the files: logs\MakeMacros_Vn.log and scripts\TestAll_Vn.sql where "n" is incremented each time the script is run. Note that it creates the macros with the REPLACE DDL command, so it isn't maintaining online versions of the macros.

WARNING: if you make changes to macros directly to Teradata and then rerun this batch command, it will eliminate your changes.

You may want to change macros to have parameters for dates, products, customers, districts, etc so that when executed repeatedly, different parameters can be used to minimize database caching of data pages. 

6: Using TdBench.bat or TdBench_Window.bat + TdBenchRun.bat To Run Tests

The tests are run under the control of Windows batch scripts in a DOS command window. There are 2 ways to run a test:

  • TdBench.bat provides a simple prompted dialog for selecting the test to run, number of sessions, and the logon Id name or name pattern (e.g. Demo_Med%) to be used for the test. You may double click TdBench.bat under Windows Explorer to execute it. After it prompts for the parameters, it will run TdBenchRun.bat.
  • TdBenchRun.bat may be executed directly by first executing TdBench_Window.bat by double clicking from Windows Explorer and then issuing the TdBenchRun command with at least the name of the script file to be used. Example: TdBenchRun scripts\TestAll_V2.sql

    Up to 5 parameters may be provided for TdBenchRun:
    1. Script name. Don’t forget to include the script\ subdirectory
    2. Number of sessions, with the default being 1
    3. Logon ID name or Name Pattern (e.g. Demo_Hvy%); default is the control user
    4. Queue table name, default is QueryQueue.
    5. Repeat count – used only for SQL files to specify multiple copies of the file in the queue. Default is 1. Note: if you put a .run file= as the command in the queue and use the repeat, BTEQ will only repeat the first SQL command in the .run file and then execute the balance of the file only once.

Examples:

TdBenchRun  Scripts\TestAll_V2.sql
# Run above script in 1 session under the benchmark control user

TdBenchRun Scripts\TestAll_V2.sql 5
# Run above script in 5 sessions under the benchmark control user

TdBenchRun Scripts\TestAll_V2.sql 10 Demo_Tac%
# Run above script in 10 sessions with the users beginning with Demo_Tac

For either execution method, if the number of sessions is greater than 1 and a single logon ID is specified, then that logon ID will be used to create the number of sessions specified. If a Logon Pattern is given (ending with %, such as Demo_Tac%) the ordered list of qualifying logon ID’s will be re-used as many times as it takes to get the requested sessions started.

For either execution method, you will be asked to provide:

  1. A descriptive title of the conditions of the run (e.g. you just collected statistics, added an index, put on a different set of workload management settings, or most often, that is a rerun after fixing …)
  2. The duration of the test in seconds. If you want the test to run forever, enter zero. Otherwise enter the number of seconds for the test (e.g. 600 for 10 minutes, 1800 for 30 minutes, 3600 for 1 hour, etc). This will cause the control session to sleep for that period of time, then update the TestTracking table with the end of test and cancel all sessions associated with the test. When you enter zero for the duration, each worker session needs to update the end of test in TestTracking table with the current timestamp such that the last one will provide the real end of test.

After you enter both pieces of information, the query driver will

  1. Logon to Teradata once to determine logon IDs, clean out the QueryQueue table(s), populate QueryHold table which is used to hold queries in the database for populating/repopulating QueryQueue table(s), and get the next available RunID.
  2. Initiate a control session in a separate DOS window that will wait for the specified TdBenchStartDelay, update the TestTracking table with the new RunID, description, and number of sessions, and populate the queue tables from QueueHold table.  If you specified a number of seconds for the test greater than zero, it will sleep for that number of seconds, then update end of test in the TestTracking table and force off all sessions tracked in TestJobTracking table.
  3. After the control session is initiated, all worker sessions are logged on which will show up as separate DOS window(s).
    • If you are using .sql script files, those worker sessions will insert their SessionID into the TestJobTracking, then put a read up against the QueueTable associated with that group of sessions. If you specified TdBenchMonitor=yes in the profile, you will see as each command starts executing. You may want to reduce overhead by turning off the monitoring of statements. When a fixed work test is run, a number of .QUIT statements will be put into the queue to cause all sessions to logoff when they are done.
    • If you are using .btq script files, the logon will be executed by the Query driver based on the user name or user name pattern specified. The top of the script should contain:

      .set retlimit 10
      .set retcancel on
      database xxxxxxx_Benchmark;
      exec TdBenchLogon;

      to limit output and to insert their session ID into the TestJobTracking, and if you are going to specify zero for the run duration (unlimited time), the final statement in the BTEQ script should be:

      exec TdBenchStop;
      .quit

      to record that session’s version of when the whole test ended. While you may specify multiple sessions, each will run the same script but may have different logonids and each instance will create its own DOS window.
    • If you are using .bat script files, only 1 session is started. To record the end of the test, put the following into your batch file:

      echo exec %TdBenchDB%.tdbenchstop; | bteq .logon %TdBenchServer%/%TdBenchUser%,%TdBenchPwd%; > %2 2>>&1

    • If you are using .lst parameter files, each line may initiate .sql, .btq, or .bat files as described above. (See later section on mixed workloads)

7: Using Views to Report on Test Results

The Query Driver maintains a table called TestTracking in the benchmark database which contains starting and ending timestamps for each run. This table is used by various reporting views to select DBQL and Resusage data for a given RunId. DBQL only writes data to disk by default every 10 minutes. The DBQLFLUSHRATE can be changed in DBSCONTROL, but it is easier to turn logging off and on for one user or all users to force data to disk. This can be done under Teradata Administrator.

As alternative, a stored procedure is provided in the Benchmark database named LogToggleProc. This will issue:

End Query Logging on xxx_Benchmark;
Begin Query Logging on xxx_Benchmark with All;

Benchmark Reports from DBQL

Detailed Query Execution:  shows each statement execution

Select * from RptTestDetail where RunId = nnn order by starttime;

Summary Report: shows count, average/max run time, cpu usage by query

Select * from RptTestSummary where RunId = nnn order by stmt;

Benchmark Summary: shows count of queries, executions, cpu by runid and error code

Select * from RptBenchSummary order by RunId;
-- or
Select * from RptBenchSummary order by RunId where ErrorCode in (0, 3158);

Benchmark Reports from Resusage

Summary of Node Statistics: from ResuageSPMA – 1 minute before to 1 after test

Select * from RptSPMA where RunId = nnn order by thetime, nodeid;

Summary of Amp Worker Tasks: from ResusageSAWT - 1 minute before to 1 after test

Select * from RptSAWT where RunId = nnn order by thetime, nodeid;

Summary of VPROC activity: from ResusageSVPR - 1 minute before to 1 after test

Select * from RptVPROC where RunId = nnn order by thetime, nodeid;

Summary of Logons: from Logonoff – To ensure you are spreading logons around

select * from RptLogonoff where RunId = nnn order by logtime;

8: Constructing mixed workload tests with .LST files

The lines in the .LST files are basically the same set of parameters used by TdBenchRun’s command line, but allowing you to set up multiple tests with different numbers of users and queues and/or BTQ and BAT. Those parameters are:

  1. Name of a .sql, .btq, or .bat file relative to the TdBenchV5 directory. Be sure to include Scripts\
  2. Count of sessions. Note that this must be a number, such as 1
  3. Prefix for logon statement or the word: none
  4. Name of the query queue or the word: none
  5. Repeat count or the number 1. Note that this must be a number. If you specify a number, only the first line in a script will be repeated that number of times.

For example:

Scripts\Test04.Light.sql  20  Demo_Usr_TAC%  Demo_Benchmark.QueryQueueTac  1
Scripts\Test04.Medium.sql 10 Demo_Usr_MED% Demo_Benchmark.QueryQueueHvy 1
Scripts\Test04.Heavy.sql 3 Demo_Usr_HVY% Demo_Benchmark.QueryQueueHvy 1
Scripts\Test04.Update.bat 1 Demo_Usr_Upd01 none 1

This would create a test with a total of 33 query users and 1 update session. (Note that none of the parameters are actually used for the .bat file, however, all should be specified). 

9: Constructing scripts with BTQ files

The .btq files are run as-is. There is no modification, however the querydriver will logon the session using the logon ID(s) specified in the execution of TdBenchRun or in the .lst file. When a count is specified, the same script will be run in a corresponding number of sessions. The following lines should be at the top of the .btq script:

.set retlimit 10
.set retcancel on
database xxxxxxx_Benchmark;
exec TdBenchLogon;

The RetLimit/RetCancel will limit the amount of data returned to the client so your measurements can focus on the database performance. The TdBenchLogon will put a record of this session into TestJobTracking so it can be forced off for fixed period tests. It will also document the current timestamp, logon, and session ID from the DBMS which allows you to trace logging in the DBMS back to the client PC. (This is especially true for time which may be different time zone or even set slightly incorrectly on the client PC or DBMS server). Note that BTQ and BAT files will delay starting by 4 seconds plus the time set for TdBenchStartDelay which defaults to 10 seconds. The worker sessions for the SQL files delay start by TdBenchStartDelay, so the BTQ and BAT files start processing a bit less than 4 seconds later. 

The last lines in the script should be:

exec TdBenchStop;
.quit

If you have a test running an unlimited amount of time, the TdBenchStop will update TestTracking’s ActualStopTime. When multiple sessions are running concurrently, each may set the ActualStopTime, but the last one updating will be the basis of the test reporting. On a fixed period test, the ActualStopTime is updated by the control user calling the procedure TdBenchStopAll which updates the ActualStopTime and then forces off all sessions recorded in TestJobTracking

10. Constructing scripts with batch files

Batch files are often used to control a sequence of utilities, such as a FastLoad to a staging table and a BTEQ session to insert from staging to core. The query driver provides minimal support for batch file execution, other than assigning a Run ID and marking the starttime. To get the stop time set from the batch file, add the following line to your script:

echo exec %TdBenchDB%.tdbenchstop; | bteq .logon %TdBenchServer%/%TdBenchUser%,%TdBenchPwd%; > %2 2>>&1

Parameters that are passed to the batch file:

  • %1 – RunID as a 4 digit number with leading zeros
  • %2 – Log file, relative to the query driver directory, e.g. Logs/Run0001/yourfile.bat1.log
  • %3 – Session count (from command line or .lst file)
  • %4 – User logon or logon search pattern
  • %5 – Queue Table (or the word: none)
  • %6 – Repeat count (or what ever you want, it isn’t checked)
  • %7 – Line number of the batch file in a .lst file. This would be 1 if the batch file is the only thing run

Note that for %2, the log file name is made up of your batch file name, plus a number starting at 1 and increased for every session started and a .log suffix. You can use the following expressions to get just part of that string:

  • %~p2 - the path of the log file
  • %~n2 – the name of the log file without the extension
  • %~pn2 – The path and name of the log file without the extension

You can also use the variables from TdBenchProfile.bat:

%TdBenchServer%
%TdBenchUser%
%TdBenchPwd%
%TdBenchDb%

For example:

Echo  .logon %TdBenchServer%/%TdBenchUser%,%TdBenchPwd; > MyTempScript
Echo .Database %TdBenchDb; >> MyTempScript
type scripts/restofupdate.fld >> MyTempScript
fastload < MyTempScript > %2

Note that BTQ and BAT files will delay starting by 4 seconds plus the time set for TdBenchStartDelay which defaults to 10 seconds. The worker sessions for the SQL files delay start by TdBenchStartDelay, so the BTQ and BAT files start processing a bit less than 4 seconds later. 

Tags (2)