A Closer Look at How to Set up a Parsing-Only Workload

Blog
The best minds from Teradata, our partners, and customers blog about whatever takes their fancy.
Teradata Employee

Note:  This content only applies to releases earlier than Teradata Database 14.10.05.  If you are on Teradata Database 14.10.05 or later releases, please look at the more recent posting titled:  A New Simplified Approach to Parsing Workload Selection.

When you think about priorities, you probably focus your attention on AMP work.  Whether you are using Teradata Active System Management (TASM) or Teradata Integrated Workload Management (TIWM), AMP work performed by requests are typically spread across different workloads, with their priorities based on the importance of the work to the business and how long you expect the work to run.

Parsing engine work on behalf of a query runs within a workload as well.  Work performed on the parsing engine is typically very quick and requires very few resources.  So why not run some, or all, of your parsing activity in a very high priority workload where it may be able to complete sooner when your system is busy,  just as you do with your short, critical AMP work?  This posting will explain how parsing priority is determined, and will illustrate the steps involved in increasing parsing priority.

Background:  Session Classification

A query's parsing engine (PE) work takes advantage of what is called "session classification". Session classification happens at session logon time and determines which TASM or TIWM workload will be used by tasks on the PE that do work (such as user validation) to get the session established before a query is issued.   

The workload that a session classifies to can be identified in DBQLogTbl in field SessionWDID.  The workload used for session handling is the same workload that will support all parsing and optimizing activity for the queries that are executed within that session.

How a Session Classifies to a Workload

A session classifies to a single workload, based strictly on the WHO classification criteria (referred to as Request Source criteria in Viewpoint) of the workload.  Information about a session (and subsequently the queries within that session) that allows mapping to a workload is available at session logon time and includes WHO classification criteria such as Account string, Application, Username and Profile.  All other non-WHO classification criteria that the workload might have (such as estimated processing time) are ignored at session classification time. That secondary criteria will be used only to determine which workload is chosen when a request executes on the AMPs.

TASM and TIWM determine the workload that will support session management (and secondarily request parsing) by doing the following:

  1. Find the workloads that satisfy the WHO criteria associated with the session
  2. Among the workloads with WHO criteria that satisfy the session detail, identify the workloads with the most number of individual WHO criteria  
  3. Among the workloads with the highest number of WHO criteria that match the request’s session detail, select the workload associated with the highest priority  

Consider these three workloads:

Case 1

MSI-Short:       Account = ‘$M1$MSI&I’, Est Time <= 20 Sec.                 ==> High priority 

MSI-Medium:    Account = ‘$M1$MSI&I’, Est Time <= 500 Sec.               ==> Medium priority 

MSI-Long:       Account = ‘$M1$MSI&I’, Est Time <= 2000 Sec.              ==> Low priority 

All requests whose Account is $M1$MSI&l will map (based solely on WHO criteria) to all three of these workloads.  In this case parsing for all queries with that account will execute in MSI-Short because it has the highest priority among the three workloads that match the WHO criteria.

Case 2

MSI-Short:       Account = ‘$M1$MSI&I’, Est Time <= 20 Sec.                             ==> High priority 

MSI-Medium:    Account = ‘$M1$MSI&I’, Est Time <= 500 Sec.                           ==> Medium priority 

MSI-Long:       Account = ‘$M1$MSI&I’, User = DRXU,  Est Time <= 2000 Sec.    ==> Low priority 

In this case, parsing for all queries with Account $M1$MSI&I for User DRXU will execute in MSI-Long because it has the highest priority among workloads with the greatest number of WHO criteria matches (Account and User).   However, all other queries with that Account (from users other than DRXU) will parse at MSI-Short, as it is the highest priority workload with one WHO criterion matching the session detail.

Case 3

MSI-Short:       Account = ‘$M1$MSI&I’, Est Time <= 20 Sec.                             ==> High priority 

MSI-Medium:    Account = ‘$M1$MSI&I’, User = DRXU,  Est Time <= 500 Sec.    ==> Medium priority 

MSI-Long:       Account = ‘$M1$MSI&I’, User = DRXU,  Est Time <= 2000 Sec.   ==> Low priority 

In this case, parsing for all queries with Account $M1$MSI&I for User DRXU will execute in MSI-Medium because among the workloads that have the greatest number of matching WHO criteria, MSI-Medium has the highest priority.  Only requests from a user other than DRXU for Account $M1$MSI&I will parse in MSI-Short.

In summary, among the workloads with the highest number of WHO criteria that match a session’s characteristics, the workload with the highest priority will be used for session management as well as for the parsing of all queries in the session. 

Setting up Special Parsing-Only Workloads

With this background,  you can set up one, several or even many workloads specifically for parsing if you wish.  Or you can just use the existing workloads and observe the information in the SessionWDID field in DBQLogTbl output using this new understanding to make better sense of the priority at which parsing is taking place.

If you decide to set up a single higher-priority workload for all parsing activities for all requests, then you will want to create a new workload and give it a very high priority.

To make sure this workload is chosen for the SessionWDID of all requests, consider including multiple WHO criteria (Username, Account, Application, for example) using the broadest possible matching scope (Username = *, Account = *, and Application = *).   Make sure to use a greater number of WHO criteria than any other active workload uses.   By making use of wild cards for three WHO criteria, you are setting up the condition where all requests are likely to match to that workload for their parsing work. 

If such a parsing-only workload existed, then in Case 2 and Case 3 above, all MSI users would match to this single high-priority workload for parsing.  That it because this parsing-only workload described above contains three matching WHO criteria (Username = *, Account = *, and Application = *), a greater number of WHO matches than the other defined workloads.  

It is recommended that the Profile classification criterion not be used in defining a parsing-only workload, unless all users accessing the platform have an actual Profile assigned.   When there is wild card classification on Profile (Profile = *), any session that logs on without providing a Profile will be disqualified from the workload.

A Second Important Step

Because you only want parsing work to run in the high-priority parsing-only workload, and you are using wild card WHO criteria, an important second step needs to be taken so you can prevent AMP work from running there.  You must put in dummy secondary classification for that parsing-only workload so that no query will ever successfully map to the parsing-only workload for its AMP work.  

For example, you could create a special dummy table just for this purpose that you are confident no one actually accesses. Then specify that table in the parsing workload’s secondary classification criteria.  Or you could setup multiple criteria that are contradictory and therefore highly unlikely to ever be represented together in a query.  For example, you could add query characteristic criteria that mandates that only single/few AMP queries will run in the workload, and at the same time include very large estimates, as shown in the following example.

You could consider placing this parsing workload on the SLES11 Tactical Tier, but be very careful in doing that. If some of the parsing activity consumes more than 2 CPU seconds per node, the parsing task risks being demoted to a lower priority workload due to the automatic tactical exception. If you wish parsing to take place at a tactical priority, consider increasing the CPU threshold for demotion in the parsing-only workload’s tactical exception definition, so that it accomodates max expected parser CPU times.

Conclusion

Establishing a parsing-only workload is not going to reduce the resources required to do parsing. If the plan is complex or many decisions have to be made by the optimizer in producing the plan, the same number of CPU seconds will be required to accomplish parsing. However, a high priority parsing-only workload may speed up the time to do that parsing when the system is busy, because parsing is running at a higher priority.

Of course, as is true with all workload management decisions, you need to examine the tradeoffs involved.   When you increase the priority of one type of work, by definition you reduce the priority of some other work.  So make sure you are keeping an eye on the overall balance of priorities on your platform.  As a result, you may want to only apply this parsing-only workload technique to just a critical set of queries, rather than all active queries.

20 Comments
Teradata Employee

Hi Carrie,

I used the Parsing Workload you describe at one of my customers who is running TD14.10/SLES11 with VP15.

This workload is working perfectly. But, if I do not use this WD all the parsing work is mapped to the H-WD default workload even if the corresponding WDs use a Request Source criteria  (the Account String in my case); is that a 'default' rule for parsing work?

My best regards,

Roberto Rundo

TD Solution Architect - Italy

Teradata Employee

Roberto,

There is no "default"  workload for just parsing.   Parsing will happen at the same workload that the session's logon uses.   Depending on the number of workloads defined and their characteristics, it is likely that there will be different workloads selected for parsing across different applictions.  But it is possible parsing could be centralized on on or just a few workloads.  Look at the SessionWDID in DBQLogTbl to validate where parsing took place.

Parsing will only happen in H-WD for a query if the session had an account string that started with $H and there were no other workloads at  a higher priority that the session information could have matched to.

Thanks, -Carrie

Teradata Employee

Hi Carrie,

Thanks for this blog. I have one query:  

As per your blog " Expediting Express Request" if we enable EnableExpediteExp parameter and reserve some AWTs then it will also help in Speed-up the Parsing (where express request get AWTs during Parsing through WorkType08/09/10).  If we enable this parameter and reserve some AWTs. Do We still need to create a Parsing-WD ?

Thanks in advance.

Teradata Employee

The purpose of the Parsing-only workload is to elevate the priority of CPU usage on the parsing engine when a session logs on or a query undergoes parsing/optimization.   This CPU usage on the parsing engine has nothing to do with AMP worker tasks.  The PE is a separate module from the AMP.   PE-level CPU is consumed by special tasks on the parsing engine, unrelated to AMP worker tasks.

See my blog posting that describes parsing engine tasks:  http://developer.teradata.com/blog/carrie/2012/11/database-tasks-that-run-on-the-parsing-engine

Expediting express requests only has an impact to express requests that are sent from the parsing engine to the AMPs when data dictionary access is necessary during parsing-level activities.  It could speed of AMP activities that originate from parsing/optimizing, etc.  if the system is out of AMP worker tasks on the AMP.

These are two separate capabilties.   One or the other may be helpful, or both may be helpful, it depends on your environment and what problem you are trying to solve.

Thanks, -Carrie

Hi Carrie

Would this parsing only workload work in a similar fashion for SLES10 V14.00/14.10 ?

i.e. could we force query parsing  into a high prioirty TASM group ..

Thanks

Nick

Teradata Employee

Nick,

Parsing prioritization is based upon the concept of session classification.  Session classification happens with TASM if you are on either SLES 10 or SLES 11.   It also happens on SLES 11 with TIWM (workload management for the appliance and EDW non-TASM platforms). 

As long as you are using workloads, you will see WDIDs and SessionWDIDs show up in DBQL, and SessionWDID will tell you what workload parsing took place in.   So if you have workloads, there is no reason you cannot manipulate session classification to point to a high priority workload for your parsing needs, if you wish.   

Parsing prioritization is not determined by the operating system (SLES 10 vs. SLES 11), but rather if you are using workloads and if session classification is taking place.

Thanks, -Carrie

Enthusiast
Hi Carrie,

We recently introduced the PE-Only Workload (POW) on a special system (14.10+SLES10) to have all user PE-work  to be classified into that POW. 

We have followed the criteria shown below, but still we are not seeing a single user getting the SESSIONWDID as POW ID.

Not sure if I am missing anything here.

POW 

Classfcn Criteria: USER: = *ACCTSTR: = *APPLIC: = *PROFIL: = *ALL_AMP: excludeMIN_EPT: 0:01:00.000

Enforcement priority: Tactical

EvalOrder: 1 (Top on the list)

No Exceptions, No throttles, No SLGs

Teradata Employee

Geeta,

Try redefining the POW workload classification without the Profile=* part.  Unless you are 100% sure that all your users have profiles, it could be problematic in this case.

Just to re-iterate one of the points mentioned above related to Profiles:

"It is recommended that the Profile classification criterion not be used in defining a parsing-only workload, unless all users accessing the platform have an actual Profile assigned. When there is wild card classification on Profile (Profile = *), any session that logs on without providing a Profile will be disqualified from the workload."

Any session where a user does not have a profile assigned will not classify to the POW workload the way it is defined.

Also,  I'd suggest you reconsider how you setup the criteria to exclude queries from executing in this Workload.  What you had was the followoing:

                Classfcn Criteria: USER: = *ACCTSTR: = *APPLIC: = *PROFIL: = *ALL_AMP:

                excludeMIN_EPT: 0:01:00.000

The exclude above looks odd to me.  I don't believe you can exclude on a minimum.  What you want is to add query characteristic criteria like EPT > or = some very large number of seconds AND that does not use all AMPs (as shown in the posting above), or some query characteristic criteria that it will be impossible for any query to classify to, like a dummy table or view name that no query will access.

Thanks, -Carrie

Enthusiast

Actually, we tried this criteria (where we used exactly same dummy characteristics as shown in your screenshot) on only single UserID which was reporting higher Parser Time. But it was not making use of the POW under SessionWDID for its parsing work.

So we thought to try the POW criteria on every user to see whether this new feature works on our box or not.

But still we faced the same issue (where no one user is SessionWDID is making use of POW).

I agree with your point not to use the Profile* for entire user community which may lead to bad consequences. But we did that just to isolate whether the POW criteria make the PE work to complete under designated workload or not.

As you suggest, we will try again with below by tweaking the EPT a little and keep you posted.

Classfcn Criteria:

USER: = *  ACCTSTR: = *     APPLIC: = *    PROFIL: = *  ALL_AMP:

  EPT >=10000000000000

Teradata Employee

Geeta,

I am sorry you are still having problems with this.

I'm still somewhat concerned about your dummy classification on query characteristics.  You want to make the query characteristics classification criteria to be something impossible for any query to ever achieve.  So it would be better to say "NOT all-AMP" and  EPT >=10000000000000, instead of "all-AMP" as you have it.    And remove the criteria around Profile.

See if that helps.

Thanks, -Carrie

Enthusiast

Carrie,

Finally the criteria below making the POW to be recorded in SessionWDID. And I am trying to understand the extreme impact of including PROFILE *. We have every user on the system assigned to a specific profile. So in that case there should not be a problem i believe. But please let me know if there is any other downside that internally can cause. 

 QUALIFY IF:

         Step Time >= 11:59:59.000

     AND Does not have all-AMP steps

     AND AcctString = *

     AND Application = *

     AND Profile = *

     AND User = * 

thanks for folloing

Teradata Employee

Geeta,

If you are sure that every user has a profile, then there is no down side of including Profile =  * in your classification.   It is only a problem if a user has no profile.  In that case, any query coming from that user will never classify to a workload that says Profile =  *.

Thanks, -Carrie

Enthusiast

Thank you.

With that classification on all the users (via Profile*), we are seeing multiple issues (like same session id using POW and other Workload for parsing phase under SESSIONWDID), working with GSC for further help.

Enthusiast

Hi Carrie,

Today i noticed an interesting case where SessionWDID=WDID=FinalWDID=Same Workload id number.

We have parser only workload, let us say the WDID of PARSER-only workload=1.

For one of the sessions procedure CALL statement, i am seeing the SessionWDID=1, WDID=1, and FINALWDID=1.

I am seeing this behavior with statementype=Call, and also statementtype=Check Workload (specifically Check workload for begin mload).

I am kind of confused why these 2 statetementtypes are showing 3 types of WDIDs as PARSER-only workload.

Teradata Employee

Geeta,

There is an explanation for what you are seeing.

Both Stored procedure CALL statements and Check Workload For statements are handled differently from other statements.  They will always run in the SessionWDID workload.   

The stored procedure CALL statement is just the very, very small amount of work between the SQL statements within the stored procedure.  It often reports zero CPU usage.  It used to run in WD-Default, but there were complaints from customers that they were seeing things running WD-Default they had not anticipated, when they had carefully set up classifiction criteria to prevent that from happening.

As of 14.10 that logic was changed to run all CALL statements in the SessionWDID for that session.   This may be the same workload as the parsing-only workload.  This is not a problem if that should happen.

If you don't want that to happen, you can set up classification critieria for a workload where you want the CALL statements to execute that uses the Target Criteria "Stored Procedure" option, and specify all stored procedures to run in that workload.  In that case the CALL statements will run there, but all the SQL statements within the stored procedure will run indendent from the CALL in whatever workload they classify to.      

The Check Workload For statement works the same way.    That statement will always run in SessionWDID.  But all that statement does is get the  object names of the objects involved in the load utility from the parser and passes them to the dispatcher.  Similar to the CALL stored procedure, there is no AMP activity involved in the Check Workload For statement. 

Thanks, -Carrie

Enthusiast

Good to know this.Thank you.

Are there any other statements applicable for falling always in SessionWDID? I think there are few other statement types where they consume 0 CPU, just wanted to know if all those statements also fall in SessionWDID. 

Teradata Employee

Geeta,

There are a couple of other statements types that will classify to the parsing-only workload.  Commands that using only the parsing engine, such as SHOW and/or HELP commands may run at SessionWDID if they are simple.

I have some additional info on that, and will try to add a new blog posting out in the next few days that goes into detail about this.  

Thanks, -Carrie

Enthusiast

Great, thank you.

Fan

Carrie,

Couple of questions - 

Does setting up parsing only workload help the tactical query in any way?

Also, when you setup criteria for parsing only workload as 

Username = *, Account = *, and Application = * and setup secondary criteria finalrowcount>=100000000000 and others , how is the parsing request will satisty this and run in this workload?

Teradata Employee

For question #1:  Requests that classify to a tactical workload will not be helped by setting up a parsing-only workload.  That is because tactical queries will do their parsing work in the highest priority workload that their Request Source criteria matches, and that will be the tactical workload that their AMP work runs in.  

For question #2:  The secondy criteria that you add for a parsing-only workload will not be used by TASM when it selects a workload for just the parsing activity.  This is explained the blog posting above in the section the follows the heading:  HOW A SESSION CLASSIFIES TO A WORKLOAD.

The purpose of placing non-Request Source criteria on the parsing-only workload is to prevent AMP work from any request from classifying to a workload that is intended only for parsing.  Since parsing workload selection will never use non-Request-Source criteria for classification (parsing work will only consider Request-Source criteria), secondary critieria will have no impact on parsing being able to match to that parsing-only workload.  

Parsing uses Request-Source criteria only.

AMP work uses Request-Source and secondary criteria.

Regards, -Carrie