Advanced Concept for Workload Analyzer (TWA) - Workload Drill-Down


For those not already familiar with it, the Teradata Workload Analyzer (TWA) is one of the products affiliated with TASM (Teradata Active System Management). It guides the database administrator in defining workloads for Teradata by analyzing log data collected over a period of time and by offering recommendations on workload definitions, classification criteria, and service level goals. It also provides a migration feature from Priority Scheduler.

Analysis of an individual workload can be initiated by selecting the ‘Analyze Workload’ right-click option available on the workload report, or by clicking the ‘Analyze’ tree node option. To further refine the initial set of workloads into one or more additional workloads, TWA uses DBQL data for workload analysis. In theory, a workload can be sub-classified into multiple workloads through additional classification criteria. However, encouraging the user to sub-classify on any and all possible classification criteria would lead to confusion and many unnecessary workloads. Also, for operational performance reasons, there is a cap on the number of workloads a system can support.

Workload Analyzer therefore guides the DBA toward appropriate classification criteria. At any given point in the analysis, the user can choose correlation and distribution parameters from the drop-down lists and then click ‘Analyze’ to analyze the associated usage patterns. The DBA can drill deeper within a chosen cluster, or re-analyze by choosing different correlation and distribution parameters. Through trial and error and visualization, the user decides which parameters most effectively identify the request group to be isolated. This trial-and-error process is streamlined by showing the user distinct counts and distribution ranges without having to click ‘Analyze’, which can eliminate fruitless visualizations against, for example, a single user or a tight distribution.

The overall flow of the GUI can be better understood by looking at the flow chart below.

Flow Chart

This drill-down is a recursive process for deeper analysis on correlation and distribution parameters. If not satisfied with the current analysis parameters, the DBA can select more appropriate ones, guided by reviewing the distinct value counts and ranges for those other parameters.

For example, with respect to the distinct value counts, one particular workload could display the following characteristics:

  • User Name (24)
  • Applications (1)
  • Account Name (1)
  • Client addresses (2)
  • Queryband (3)
      • Function (3)
      • Urgency (1)
      • AggLevel (8)
  • Estimated Processing Time (0-1000 secs)
  • AMP Count (0-1)

In this example, the DBA knows that there is only 1 distinct application and 1 distinct account, and that all requests run at the same urgency, so trying to identify a correlation against different application, account, or urgency values would be wasted effort. However, the opportunity for correlation does exist with Users, Function, and AggLevel, and the DBA could pursue those options. The same applies to the distribution parameter ranges: an estimated processing time range from 0 to 1000 seconds suggests a large variation of requests is included in this workload, so the opportunity for identifying clusters is higher than if the estimated processing time range were simply 0-1 second.
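
As a rough illustration of the insight TWA surfaces here, a query along the following lines could produce similar distinct counts and ranges directly from DBQL. This is only a minimal sketch, not the query TWA issues; the account value 'A' and the date range are hypothetical, and the column names come from DBC.DBQLogTbl.

  SELECT COUNT(DISTINCT UserName)   AS DistinctUsers
       , COUNT(DISTINCT AppID)      AS DistinctApplications
       , COUNT(DISTINCT AcctString) AS DistinctAccounts
       , COUNT(DISTINCT ClientAddr) AS DistinctClientAddresses
       , MIN(EstProcTime)           AS MinEstProcTime   /* distribution range */
       , MAX(EstProcTime)           AS MaxEstProcTime
       , MIN(NumOfActiveAMPs)       AS MinAMPCount
       , MAX(NumOfActiveAMPs)       AS MaxAMPCount
  FROM DBC.DBQLogTbl
  WHERE AcctString = 'A'            /* hypothetical account classifying this workload */
    AND StartTime BETWEEN TIMESTAMP '2013-04-01 00:00:00'
                      AND TIMESTAMP '2013-04-30 23:59:59';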

The DBA can add clusters to the current workload for deeper analysis, or clusters can be split off into a new workload. The DBA can repeat this process until a good set of workloads is defined or all unassigned clusters have been assigned to workloads.

TWA uses an assigned/unassigned cluster concept. Each cluster (Accounts, Users, QueryBands) found during analysis is initially unassigned. Selected clusters become assigned after being added to the current workload for deeper analysis or after being split out into a new workload. Unassigned clusters remain available for subsequent action by the DBA if desired. TWA brings back all unassigned clusters if the same analysis parameter is clicked again, after displaying the informational message below.

Message

If unassigned clusters are not acted on by the DBA, the associated requests will be relegated to a different workload once the rule set changes are saved. For example, consider the following set of six workloads generated after the first level of analysis on Accounts, where Workload A is defined with classification Account=A:

The DBA decides to analyze workload A, which consumes 35% of the CPU. Based on some criteria (e.g., client user), he determines that one element should be isolated and treated differently from the other elements. He has two choices on how to do this: either split, or add classification to the existing workload.

If the user splits on the particular element, the result is a new workload, A2, with classification Account=A and Client User = xyz. A2 automatically has a higher evaluation order than the original workload A, to assure that requests from client user xyz execute within A2 while all other client users execute within A. The CPU distribution is divided between the old (A) and new (A2) workloads as follows:

Alternatively, if the user chose instead to add classification to the existing workload (so that the workload classification of A is now Account=A and Client User = xyz), the elements not chosen are designated “unassigned”, as depicted in the following diagram. If not further acted upon, they will end up executing within WD-Default, because no other workload exists that would capture requests with classification Account=A and NOT Client User = xyz.

To avoid accidental relegation of unassigned clusters to WD-Default or some other unexpected workload, it is suggested that drill-down probes begin their first analysis step using the Split option. Additional refinements should then be done using the “add classification to” option against that new workload, so that unassigned requests are relegated back to the original workload. (Example 2 below demonstrates this technique.)

The DBA can select correlation parameters (“Who” and “Where”) and distribution parameters (“What” and “Exception”) at each depth of analysis. The DBA can also review the workload by viewing the classification list after each depth of analysis and perform an Undo if needed. The Undo operation reverses any previous analysis step: it deletes the assigned clusters from the workload classification and brings them back as unassigned clusters for new add/split operations.

A Drill-down Example:

Here is a detailed example exploring the mechanics of the workload analysis functionality and the information it provides.

Step 1:

Select the “Analysis -> New Workload Recommendations” option. Define the DBQL inputs and the Account String category for the initial level of analysis.
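
For orientation before the Account String analysis, a query roughly like the following can show which account strings dominate CPU in the chosen DBQL window. It is only a sketch under an assumed date range, not what TWA runs internally.

  SELECT AcctString
       , COUNT(*)                        AS QueryCount
       , SUM(AMPCPUTime + ParserCPUTime) AS TotalCPU
  FROM DBC.DBQLogTbl
  WHERE StartTime BETWEEN TIMESTAMP '2013-04-01 00:00:00'   /* assumed range */
                      AND TIMESTAMP '2013-04-30 23:59:59'
  GROUP BY AcctString
  ORDER BY TotalCPU DESC;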

Assign all unassigned request groups to a new workload, WD_ABC.

At this point, the CPU distribution of all workloads (1) is shown below:

Step 2:

Select the ‘Analyze Workload’ right-click option for cluster analysis. The ‘Analyze Workload’ tabbed window shown below is displayed for selecting data filters for the analysis.

Step 3:

Step 4:

The ‘Current workload classification’ list will display a summarization of the classification:

The number adjacent to each classification criterion shows the total number of elements for correlation parameters, and the minimum and maximum values for distribution classification parameters. For example:

  • Account String (2) => A total of 2 accounts are classified.
  • Estimated Processing Time (0 – 200 secs) => The workload is classified for an Estimated Processing Time between 0 and 200 seconds.

The DBA can view the detailed classification and exception definitions by clicking the “View Classification” and “View Exception” buttons.

Step 5:

Select the appropriate correlation and distribution parameters. Note that 10 distinct Applications were found for the WD-ABC workload, making it a good candidate for deeper analysis. The following shows that analysis visualized:

Step 6:

Note that the ‘TWA’ application has significantly higher CPU/IO and average Estimated Processing Time than the other applications. It is selected for deeper analysis using the “Add Application clusters for deeper analysis” option on the right-click menu. The report and graph are updated for the remaining unassigned clusters, while ‘TWA’ is assigned to the current workload for deeper analysis.
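
For reference, the kind of per-application comparison behind this choice can be approximated outside the tool with something like the query below. It is a sketch only; the account values standing in for WD_ABC's classification are hypothetical.

  SELECT AppID
       , COUNT(*)                        AS QueryCount
       , SUM(AMPCPUTime + ParserCPUTime) AS TotalCPU
       , SUM(TotalIOCount)               AS TotalIO
       , AVG(EstProcTime)                AS AvgEstProcTime
  FROM DBC.DBQLogTbl
  WHERE AcctString IN ('ABC1', 'ABC2')   /* hypothetical accounts classifying WD_ABC */
  GROUP BY AppID
  ORDER BY TotalCPU DESC;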

The DBA can view the classification for the current workload by clicking the ‘Data Filter’ tab.

The CPU distribution of all workloads now looks like the following:

Note that if unassigned clusters are not added to the current workload or split into new workloads before the rule set is saved to the database, then all queries arriving for those unassigned clusters will execute as part of the default workload (WD-Default).

If the DBA selects Application again, all 9 unassigned clusters are brought back for subsequent operations (1 is already assigned to the current workload).

The user can then assign all remaining clusters using either the Add-to or Split options. Here the user selects all 9 unassigned clusters and splits them into a new workload called WD-Others.

The CPU distribution now looks as follows, with no unassigned requests.

Finally, the DBA saves the rule set to the database for activation.

A second Drill-down Example:

This example uses analysis against multiple QueryBand parameters to help identify and isolate various request clusters, or to provide additional granularity on request clusters. One initial workload consumed the vast majority of resources, and a more granular breakdown of that workload is desired. Long-running outliers were also noted in the analysis, and a goal is to classify these outliers into their own workload so that different workload management techniques can be applied to them.

Step 1:

Initial workloads are defined based on Account String using the auto-generate option.

The CPU distribution of all workloads looks as follows:

Workload WD-ADW-DS is selected for further analysis as it is consuming 92% of the total CPU.

Step 2:

Query band names and values are loaded in the ‘Analyze Workload’ window after clicking the ‘Analyze’ tree node. The system found a total of 5 distinct query band names. These query band names, with their distinct value counts, are loaded automatically into the QueryBand Filter list for analysis. Note, however, that this list is shown only when QueryBand is selected in the correlation parameter list.

  • QueryBand (5)
      • AggLevel (7)
      • Function (5)
      • Region (7)
      • TopTierApp (3)
      • Urgency (3)

There are several potentially good analysis candidates here, as indicated by the distinct counts. The user selects QueryBand as the correlation parameter and the query band name Function from the QueryBand filter list, then clicks “Perform Analysis”.

Step 3:

Below are the correlation and distribution reports and graphs. A total of 5 query band values are displayed for QueryBand Name = Function.

Step 4:

Notice a possible distinction with Function = MIN, which includes queries far lengthier than any other. Select the ‘MIN’ row, right-click, and split the ‘MIN’ query band value into a new workload, ‘WD_ADW_Outliers’. Splitting at this first step is what assures the unassigned clusters fall back into the original ‘WD-ADW-DS’ workload classification.
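
To give a flavor of the comparison being made, a rough DBQL query such as the one below contrasts Function = MIN requests with the rest. It is only a sketch: the account value 'ADW_DS' is hypothetical, and it assumes query band pairs appear as 'name=value;' text in DBC.DBQLogTbl.QueryBand.

  SELECT CASE WHEN QueryBand LIKE '%Function=MIN;%'
              THEN 'Function=MIN'
              ELSE 'other Function values' END AS FunctionGroup
       , COUNT(*)                        AS QueryCount
       , MAX(EstProcTime)                AS MaxEstProcTime
       , SUM(AMPCPUTime + ParserCPUTime) AS TotalCPU
  FROM DBC.DBQLogTbl
  WHERE AcctString = 'ADW_DS'            /* hypothetical account behind WD-ADW-DS */
  GROUP BY 1;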

Step 5:

The QueryBand Function = MIN is split into a new workload called WD_ADW_Outliers, and the remaining 4 functions are unassigned, falling back to WD-ADW-DS if the DBA takes no further action on them. The CPU distribution now looks as follows:

The correlation/distribution reports and graph are refreshed with the remaining 4 unassigned query band values for the next add/split operation.

Step 6:

Further drill-down will be performed on the newly split workload ‘WD_ADW_Outliers’.

Select ‘Analyze’ from ‘WD_ADW_Outliers’ workload tree node for further analysis.

Select TopTierApp from Queryband filter list and click ‘Perform Analysis’ button.

Step 7:

The longest-running queries all share not just Function = MIN but also the query band value TopTierApp = BODSS. Select the ‘BODSS’ TopTierApp query band value and add it for deeper analysis to the currently analyzed workload, ‘WD_ADW_Outliers’.
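
One rough way to confirm this outside the tool is to list the requests with the largest estimated processing time under the (hypothetical) ADW_DS account and Function = MIN, and inspect their query band strings. This is a sketch, not a TWA query.

  SELECT TOP 20 EstProcTime, UserName, AppID, QueryBand
  FROM DBC.DBQLogTbl
  WHERE AcctString = 'ADW_DS'            /* hypothetical account */
    AND QueryBand LIKE '%Function=MIN;%'
  ORDER BY EstProcTime DESC;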

The CPU distribution now looks as follows, with the 2 unassigned TopTierApp values relegated to the original WD-ADW-DS workload rather than becoming part of the WD_ADW_Outliers workload:

Step 8:

Analyzing WD_ADW_Outliers further, notice that the distinct count for every correlation parameter shows 1 (only 1 distinct value). This means that all requests in this workload come from the same combination of “who” parameters (Account ADW_DS, query band Function = MIN, and query band TopTierApp = BODSS). Only distribution parameters can now be used for further drill-down analysis. Estimated Processing Time is a good choice of distribution parameter, since its range for the current workload is wide (0.00 - 158.00 secs).

Select ‘None’ as correlation parameter and Estimated Processing Time as distribution parameter.

Click ‘Perform Analysis’ button to generate distribution graph.



Eight queries lie in the last bucket (bucket 10, range 142.20 - 158.00 secs); these are the long-running queries. Another 8 queries in the first bucket (range 0.00 - 15.80 secs) are comparatively short-running, with a wide gap shown across buckets 2 through 9.
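
The 10-bucket view can be approximated in SQL as sketched below, assuming the same hypothetical account and query band values and taking the 15.8-second bucket width from the observed 0-158 second range; this is an illustration, not what TWA executes.

  SELECT CASE WHEN EstProcTime >= 142.2 THEN 10
              ELSE CAST(EstProcTime / 15.8 AS INTEGER) + 1 END AS TimeBucket
       , COUNT(*)         AS QueryCount
       , MIN(EstProcTime) AS MinEstProcTime
       , MAX(EstProcTime) AS MaxEstProcTime
  FROM DBC.DBQLogTbl
  WHERE AcctString = 'ADW_DS'            /* hypothetical account */
    AND QueryBand LIKE '%Function=MIN;%'
    AND QueryBand LIKE '%TopTierApp=BODSS;%'
  GROUP BY 1
  ORDER BY 1;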

Our goal from the start was to isolate the long-running requests found within the ADW-DS workload and apply different workload management to them. By adding the last bucket to the ADW_Outliers workload, we achieve the workload definition needed to then apply those different workload management techniques. The user highlights the last bucket in the distribution report, selects ‘Add Estimated Processing Time clusters for deeper analysis’ to ‘WD_ADW_Outliers’, overrides the minimum estimated processing time to 30 seconds, and overrides the maximum estimated processing time to 999999 (essentially unlimited).

The CPU distribution now looks as follows, with the very short-running requests relegated back to the original WD-ADW-DS workload.

In summary, the workload classifications of interest within this example are now as follows:

WD-ADW-DS remains in its original form:

WD_ADW_Outliers is created with a higher evaluation order than WD-ADW-DS, as follows:

 

27 REPLIES
Enthusiast

Re: Advanced Concept for Workload Analyzer (TWA) - Workload Drill-Down

Lots of information listed here, Birendra. Many thanks for sharing.
While following the given example, for STEP 1, in TWA -> Analysis -> New Workload Recommendations, I chose DBQL and gave a start and end date, but I'm getting the following error:
Selected non-aggregate values must be part of the associated group.[62218]

I checked the DBQL log and found error code 3504, but I have no power over the TDWM analysis queries. My question is: why is this error coming up? What am I missing?

The TDWM user is properly created with 183 objects (we have TD 12).
Teradata Employee

Re: Advanced Concept for Workload Analyzer (TWA) - Workload Drill-Down

Let me know what TWA version (patch version number) you are using. What is the DBS version (i.e., the full DBS version)?
Teradata Employee

Re: Advanced Concept for Workload Analyzer (TWA) - Workload Drill-Down

Hi Karam,
This is a problem with TWA 12.0 running against the latest TD 12.0 database. TD 12.0 made some changes to aggregate queries which cause the TWA query to fail with a 3504 error.
We have fixed this issue in TWA 13.00.00.06 via DR 152136. Let me know if you can upgrade TWA to 13.0; otherwise, I can provide the TWA 12.0 efix for you. Note, however, that you need TWA 13.0 for the Nth-level drill-down feature.
Enthusiast

Re: Advanced Concept for Workload Analyzer (TWA) - Workload Drill-Down

Thanks again Birendra.
I was able to generate the Candidate Workload Report after installing TWA 13.10.
Excited to try your recommendations!
Enthusiast

Re: Advanced Concept for Workload Analyzer (TWA) - Workload Drill-Down

Hello,

I am new to TWA. I am using version 13.10 of TWA. Once I complete Step 1 (Log as DBQL, provide the date range, and category "Account String"), I get the error:

"No 'Account String' Information Available with the given Input parameters. [62218]".

Can you help me out here? I have enabled DBQL logging by executing the following statement:

BEGIN QUERY LOGGING ON ALL;

And I am also seeing data being populated in the DBC.DBQL* tables.

Teradata Employee

Re: Advanced Concept for Workload Analyzer (TWA) - Workload Drill-Down

This message is displayed only when TWA did not find any accounts in DBC.DBQLogTbl for the given date range. For example, if you logged DBQL data for, say, 04/28/2013 to 04/30/2013 but provided a DBQL input date range outside the logged range, say 05/01/2013 to 05/02/2013, then TWA will not find any accounts/users, etc. Please check in the DBC.DBQLogTbl table whether some rows were logged for the same date range you entered in the ‘Define DBQL Inputs’ dialog of TWA. Let me know if you still have the same problem.
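
For example, a quick check along these lines (only a sketch, using the dates from your post) shows whether any DBQL rows exist per day in the window you entered:

  SELECT CAST(StartTime AS DATE) AS LogDate
       , COUNT(*)                AS QueryCount
  FROM DBC.DBQLogTbl
  WHERE StartTime BETWEEN TIMESTAMP '2013-04-23 00:00:00'
                      AND TIMESTAMP '2013-04-30 23:59:59'
  GROUP BY 1
  ORDER BY 1;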

Enthusiast

Re: Advanced Concept for Workload Analyzer (TWA) - Workload Drill-Down

Hello,

I checked the DBQLogTbl table and found that I have entries from 23rd April onwards. So, instead of using 1st April - 30th April as the DBQL date range, I used the date range 23rd April - 30th April.

Even so, with any selected option (Account String, Users, Profile, Application), I am getting the error:

No "<Selected Option>" information available with the given Input parameters. [62218]

Does it check any specific tables in DBQL which might be empty? Because data is getting populated in the DBQLogTbl table.

Teradata Employee

Re: Advanced Concept for Workload Analyzer (TWA) - Workload Drill-Down

Hi,

There is no difference between the date range 1st April - 30th April and 23rd April - 30th April if data was logged only from 23rd April onward.

I suspect a problem with your data dictionary table sys_calendar.calendar. This table may not have been populated correctly when the DIP scripts were executed.

Please run the following query and see if all the dates from 1st April to 30th April are listed in the result set (30 rows).

SELECT calendar_date FROM sys_calendar.calendar
WHERE ( calendar_date BETWEEN '2013-04-01' AND '2013-04-30' )
  AND day_of_week IN (1,2,3,4,5,6,7)
ORDER BY calendar_date;

Please run the query below and see whether you get any DBQL rows as output.

SELECT StartTime, FirstRespTime,
       (extract(second from FirstRespTime) + (extract(minute from FirstRespTime) * 60.0) + (extract(hour from FirstRespTime) * 3600.0) + (86400.0 * (cast(FirstRespTime as Date) - cast(StartTime as Date)))) - (extract(second from StartTime) + (extract(minute from StartTime) * 60.0) + (extract(hour from StartTime) * 3600.0)) AS ResTime,
       NumSteps, NumResultRows, TotalIOCount, TotalCPUTime, UserName, d.CollectTimeStamp,
       d.QueryID, UserID, AcctString, AppID, LSN, ClientID, ClientAddr, QueryBand,
       ProfileID, d.ProcID, WDID, NumOfActiveAMPs, OpEnvID, SpoolUsage,
       HotAmp1CPU, HotAmp1IO, ZEROIFNULL(EstProcTime) AS TheEstProcTime,
       CAST(1 AS INTEGER) AS QueryCount, TotalCPUTimeNorm,
       CASE WHEN StatementType (NOT CS) LIKE 'SELECT%' THEN 'SELECT'
            WHEN StatementType (NOT CS) LIKE 'DML%'    THEN 'DML'
            WHEN StatementType (NOT CS) LIKE 'DDL%'    THEN 'DDL'
            ELSE 'OTHERS'
       END AS StatementType,
       AMPCPUTime
FROM (
    SELECT StartTime, FirstRespTime, NumSteps, NumResultRows, TotalIOCount,
           (AMPCPUTime + ParserCPUTime) AS TotalCPUTime, UserName, CollectTimeStamp,
           QueryID, UserID, AcctString, AppID, LSN, ClientID, ClientAddr, QueryBand,
           ProfileID, ProcID, WDID, NumOfActiveAMPs, OpEnvID, SpoolUsage,
           MaxAMPCPUTime AS HotAmp1CPU, MaxAmpIO AS HotAmp1IO, EstProcTime,
           (AMPCPUTimeNorm + ParserCPUTimeNorm) AS TotalCPUTimeNorm,
           StatementGroup AS StatementType, AMPCPUTime
    FROM DBC.DBQLogTbl, sys_calendar.calendar
    WHERE ( StartTime BETWEEN TIMESTAMP '2013-04-01 00:00:00.00'
                          AND TIMESTAMP '2013-04-30 23:59:59.99' )
      AND (calendar_date = CAST(StartTime AS DATE))
      AND day_of_week IN (1,2,3,4,5,6,7)
) d
UNION ALL
SELECT CollectTimeStamp,
       CASE WHEN ResTime < 60
                 THEN CollectTimeStamp + CAST(CAST(ResTime AS INT) AS INTERVAL SECOND)
            WHEN ResTime > 59 AND ResTime < 3600
                 THEN CollectTimeStamp + CAST(CAST(ResTime AS INT) / 60 AS INTERVAL MINUTE) + CAST(CAST(ResTime AS INT) MOD 60 AS INTERVAL SECOND)
            WHEN ResTime > 3559 AND ResTime < 86400
                 THEN CollectTimeStamp + CAST(CAST(ResTime AS INT) / 3600 AS INTERVAL HOUR) + CAST((CAST(ResTime AS INT) MOD 3600) / 60 AS INTERVAL MINUTE) + CAST((CAST(ResTime AS INT) MOD 3600) MOD 60 AS INTERVAL SECOND)
            WHEN ResTime > 86399 AND ResTime < 2678400
                 THEN CollectTimeStamp + CAST(CAST(ResTime AS INT) / 86400 AS INTERVAL DAY) + CAST((CAST(ResTime AS INT) MOD 86400) / 3600 AS INTERVAL HOUR) + CAST((((CAST(ResTime AS INT) MOD 86400) MOD 3600) / 60) AS INTERVAL MINUTE) + CAST(((CAST(ResTime AS INT) MOD 86400) MOD 3600) MOD 60 AS INTERVAL SECOND)
       END,
       COALESCE(QuerySeconds / NULLIFZERO(QueryCount), 0) AS ResTime,
       NULL, NULL, TotalIOCount, (AMPCPUTime + ParserCPUTime) AS TotalCPUTime,
       dbase.DatabaseName, CollectTimeStamp, NULL, UserID, AcctString,
       AppID, NULL, ClientID, ClientAddr, CAST(NULL AS VARCHAR(6160)) AS QueryBand,
       ProfileID, ProcID, NULL, NULL, NULL, NULL, NULL, NULL, 0, QueryCount,
       (AMPCPUTimeNorm + ParserCPUTimeNorm) AS TotalCPUTimeNorm,
       CAST(NULL AS VARCHAR(128)) AS StatementType,
       AMPCPUTime
FROM DBC.DBQLSummaryTbl, DBC.dbase, sys_calendar.calendar
WHERE DBC.DBQLSummaryTbl.UserID = dbase.DatabaseId
  AND CollectTimeStamp BETWEEN TIMESTAMP '2013-04-01 00:00:00.00'
                           AND TIMESTAMP '2013-04-30 23:59:59.99'
  AND (calendar_date = CAST(CollectTimeStamp AS DATE))
  AND day_of_week IN (1,2,3,4,5,6,7);

Enthusiast

Re: Advanced Concept for Workload Analyzer (TWA) - Workload Drill-Down

Hello,

As requested, I executed the two SQL statements mentioned, and neither returned any result set.

FYI: I am using the Teradata Express 14 edition via VMware Player. Does that have something to do with the empty result set? Or, how can I ensure that all the required DIP scripts were properly executed?

Thanks In Advance.