Interpreting DBQL DelayTime in Teradata 13.10

Teradata Employee

Starting in Teradata 13.10, there is a single delay queue for all throttles.  This means that queries delayed by system throttles will reside in the same queue as queries delayed by workload throttles.  In earlier releases, delay queues were set up independently by type of throttle, and each workload throttle had its own dedicated queue.   

Bringing together all delayed objects into a single queue streamlines the entire throttling experience and makes it easier and more accurate to manage internally.  However, as a side-effect, the DelayTime field in DBQL needs a second look.  DelayTime requires a slightly different interpretation in 13.10 than it did in earlier releases.

DBQL WDDelayTime has not changed

First, let’s consider what has NOT changed in DBQL delay reporting for 13.10.  WDDelayTime has the same meaning as it always has had.   It represents the time in seconds that the query was delayed in the delay queue due to a workload throttle.   

You will not see this column in the DBC.QryLog view, as that view references only pure DBQL fields.   Use the view titled QryLogTDWM to see the delay columns and other TDWM fields, such as WDID.
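For illustration, a minimal query along these lines (column availability can vary by release, so treat it as a sketch rather than a definitive report) pulls the delay columns together with the workload ID:

    SELECT ProcID,
           QueryID,
           WDID,          /* workload the request classified into */
           DelayTime,     /* total seconds spent in the delay queue */
           WDDelayTime    /* seconds delayed by a workload throttle */
    FROM DBC.QryLogTDWM
    WHERE DelayTime IS NOT NULL;   /* only requests that were actually delayed */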

DBQL DelayTime   

In 13.10, DelayTime is the total time spent in the delay queue, whether due to one or more system throttles, a WD throttle alone, or a combination of both.   It doesn't matter what the reason for the delay was or how many throttles contributed to it.  This is the total wall-clock time the query was delayed.

However, in 13.10 you cannot subtract WDDelayTime from DelayTime and assume the difference is time delayed by a system throttle.  You could do that in earlier releases because WD throttles had their own separate delay queue, distinct from system throttles.  With separate delay queues, as before 13.10, you could safely assume that a query first went into the WD delay queue, and only when released from that queue could the query be held in the system throttle queue.  There was no overlap in delay times.  Now there is.  In 13.10 there is one single delay queue, so queries can be in that queue due to being delayed by both types of throttles at the same time.

When delayed by both types of throttles

If you are only delayed by a WD throttle, then both DelayTime and WDDelayTime metrics will be the same.   If you are only delayed by a system throttle, WDDelayTime will be null, and you know for sure that the time reported in DelayTime was caused by one or more system throttles.

For example, if DelayTime showed an hour and WDDelayTime showed 20 minutes, we don't know whether the system throttle mandated a delay of 40 minutes or of a full hour for that query.   That is because the system throttle and the WD throttle could both be keeping the query in the delay queue during the same minutes.  We can conclude, however, that the query was delayed for a total of 1 hour, and that approximately 20 minutes of that was due to the WD throttle.  But the most we can say about the system throttle is that it may have caused a delay of as long as an hour.
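If it helps to see that as arithmetic, here is an illustrative sketch (ignoring utility throttles): DelayTime minus WDDelayTime is the minimum the system throttle could have contributed, and DelayTime itself is the maximum.

    SELECT QueryID,
           DelayTime,
           WDDelayTime,
           DelayTime - COALESCE(WDDelayTime, 0) AS MinSystemThrottleDelay,  /* 40 minutes in the example above */
           DelayTime                            AS MaxSystemThrottleDelay   /* a full hour in the example above */
    FROM DBC.QryLogTDWM
    WHERE DelayTime > 0;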

Conclusion

DelayTime is the total time spent in the delay queue:

  • Due to one or more system throttles
  • Due to a workload throttle
  • Due to a combination of the above
  • Due to a utility throttle (delayed on the CHECK WORKLOAD END statement)

 WDDelayTime is the total time delayed by a workload throttle.

If your query is only delayed by a WD throttle: 

  • WDDelayTime = Time actually delayed
  • DelayTime = The same time as in WDDelayTime

 If your query is only delayed by a system throttle:

  • WDDelayTime will be NULL
  • DelayTime = Time actually in the delay queue

If your query is delayed by both:

  • WDDelayTime = WD throttle delay time
  • DelayTime = Wallclock time delayed by both, does not identify system throttle contribution
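Putting those cases together, here is a hedged sketch of how delayed queries might be labeled in DBQL. It is illustrative only; note that starting in 14.0 DelayTime becomes a FLOAT while WDDelayTime remains an integer, so the exact-equality test may no longer hold there.

    SELECT QueryID,
           DelayTime,
           WDDelayTime,
           CASE
             WHEN WDDelayTime IS NULL     THEN 'System throttle(s) only'
             WHEN WDDelayTime = DelayTime THEN 'WD throttle (a system throttle may overlap)'
             ELSE                              'WD and system throttles combined'
           END AS DelayCause
    FROM DBC.QryLogTDWM
    WHERE DelayTime IS NOT NULL;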
Comments
Hello Carrie,

We are using TD 13.10, and TDWM via Viewpoint.

Presently we are using Workload Throttles, based on workload classification criteria that includes Request Source and Estimated Processing Time. We try and put queries with a quick estimated processing time into the initial high-priority workload (let's say WD-01). Queries with a longer estimated runtime get classified into the lower priority workload (WD-02 or perhaps even WD-03).

We have 2 levels of exception to demote each workload (within WD-01 and WD-02) to a lower priority workload if CPU time exceeds certain threshold. This is to handle situations where the estimation has not been accurate.

In essence, we are trying to design a situation where every top-level workload classification has a non-shared demotion path (WD-01 -> WD-02 -> WD-03).

Unfortunately, the exception rule has the side-effect of continuously moving queries from the higher-priority workloads to lower-priority workloads, which frees up throttle slots in the higher-priority workloads. This creates a situation where the number of active requests starts to significantly exceed the number of concurrent active sessions we would have liked to see on the system.

Do you have any suggestions on how to solve this dilemma?
Teradata Employee
Etienne,

Counters are always decremented at the time of a query's demotion. The reason that is done is to ensure that the concurrent number of queries active in that workload is accurately reflected by the counters. This is a good thing, and there is no way to suppress that.

As you are on 13.10 (and have the new "common classification" capabilities), you can create a system throttle with classification criteria identical to those of the higher-level workload, and give the system throttle the same query limit as the workload throttle has. The system throttle counter will not be decremented until each query actually completes, no matter how many times the query might be demoted.

New queries coming into the system will have to satisfy both the workload and the system throttle rules in order to run, which will result in greater control in the number of concurrent queries when workload demotions are taking place.

This won't help you now, but in 14.10 we are planning on introducing a "Group Throttle" at the workload level. Several different workloads that have throttles can be grouped and among them share a single combination throttle limit, in addition to their individual throttle limits. That could help in situations such as the one you describe.

Thanks, -Carrie
Many thanks Carrie,
We will try your suggestion.

Kind regards
Etienne
Teradata Employee
Hi Carrie,
Why do the DelayTime and WDDelayTime columns in DBQLogTbl allow nulls? If a query is not delayed the value should be zero instead of null. I tripped over this when looking at average delay times and found average delay time was greater than average total elapsed time for some workloads due to the null values being excluded from the divisor.
Teradata Employee
Ian,

Because DBQL tables contain a large number of fields, and efficiency in logging data has been a priority, the philosophy has always been not to make the effort of logging a field if it is not necessary.

Most fields in the DBQLogTbl are initialized to zero at the beginning of a request. The code that logs values checks to see if there are any values greater than zero in a field before it logs the data. So zero fields basically do not get logged, and end up reporting NULLs.

DelayTime and WDDelayTime fall into this category.
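For the average-delay calculation you describe, that means the NULLs need to be coerced back to zero so that non-delayed queries stay in the divisor. A minimal sketch, assuming the standard delay columns:

    SELECT WDID,
           AVG(CAST(COALESCE(DelayTime, 0)   AS FLOAT)) AS AvgDelaySecs,   /* count non-delayed queries as zero delay */
           AVG(CAST(COALESCE(WDDelayTime, 0) AS FLOAT)) AS AvgWDDelaySecs  /* cast avoids integer truncation of the average */
    FROM DBC.QryLogTDWM
    GROUP BY WDID;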

Thanks, -Carrie
Teradata Employee
Carrie,

Does the DelayTime field in TD 13 function the same as the field in TD 13.10?

Thanks
Teradata Employee
In releases earlier than 13.10, the DelayTime column in DBQL reflects time delayed by either a system throttle or a workload throttle, but not the combination.

In pre-13.10, DBQL only accounts for whatever delay queue the query was on last before it was released. Meaning, if the query was on the system (or object) delay queue followed by the WD delay queue, the delay time will only have the WD delay time. If the query was on the system (or object) delay queue only, the delay time will reflect that time.
HY
Teradata Employee
In release 13.10, is it possible that ProcID and QueryID exist in QryLogObjects but not in QryLog?

At my client site, I found this issue happening. As a result, I'm not able to get column access by user for some mismatched ProcID and QueryID values.

Thank you.
Teradata Employee
Both ProcID and QueryID do indeed appear in the two views that you mention. I checked this on one of my 13.10 systems. If you are finding that the information from those two views does not correlate the way that you expect it to, so that you are not able to get column access for some of your users, please contact the Support Center and get their assistance in resolving this situation.

Thanks, -Carrie
Hi,
I want to know more about DBQL. Can you please explain DBQL to me?
Teradata Employee
DBQL provides a series of predefined tables and views that can store historical records of queries and their duration, performance, and target activity based on rules you specify. You control whether or not to populate the DBQL tables. If you choose not to use the feature, the tables remain empty. To use it, you begin and end collection for an application name, a user or group of users, or an account or list of accounts.
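For example, collection is switched on and off with BEGIN/END QUERY LOGGING statements. A minimal sketch (the user name here is made up):

    /* Log all users' requests, keeping up to 10,000 characters of SQL text */
    BEGIN QUERY LOGGING LIMIT SQLTEXT=10000 ON ALL;

    /* Or log object and full SQL detail for one specific user */
    BEGIN QUERY LOGGING WITH OBJECTS, SQL ON etl_user;

    /* Stop collection for that user */
    END QUERY LOGGING ON etl_user;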

You can use DBQL to:
• Capture query/statement counts and response times.
• Validate that nonunique secondary indexes (NUSIs) are actually used.
• Analyze SQL text and process steps.
• Log optimizer query plans as XML documents.
• Make further refinements to workload groups and scheduling.
• Discover potential application improvements.
• Understand basic workload trends, such as growth patterns and workload mix, and
identify outliers.

You can read more detail about DBQL and how to use it in the Administration manual at the Information Products web site. There is an entire chapter called "Tracking Processing Behavior with DBQL." Go to:

www.info.teradata.com

Thanks, -Carrie
Enthusiast
Carrie,
I am seeing many instances of null entries for WDID within our DBQL log data and was wondering if you could explain why. I had over 10 million for each of the past two months. Even though they all show no AMPCPUTime, they do show elapsed times!
Enthusiast
Mark,
Depending on an ODBC setting, queries get submitted twice: first for parsing and second for actual execution. Queries that show WDID as NULL and AMPCPUTime of zero will have a corresponding ParserCPUTime. The query that comes in for actual execution will have the corresponding WDID and AMPCPUTime.

You can disable parsing in the ODBC settings to avoid it. Please check whether you see the same behavior. So far I have seen it for ODBC and not for CLIv2 connections.
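A hedged way to spot those parse-only rows in the log (purely illustrative):

    SELECT UserName,
           COUNT(*) AS ParseOnlyRows
    FROM DBC.DBQLogTbl
    WHERE WDID IS NULL
      AND COALESCE(AMPCPUTime, 0) = 0
      AND COALESCE(ParserCPUTime, 0) > 0
    GROUP BY UserName;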
Teradata Employee
Thanks Shrinivas! Excellent point.

I'd check out Shrinivas's suggestion first, it sounds like a fit for what you describe. Another thing to be aware of is that there were some issues with DBQL reporting null WDID values for certain DBQL rows in older releases (such as 12.0, I think). That has been fixed in the current releases, definitely fixed in 13.10. You would need to call the support center to get more information on whether the release you are on is affected.

This assumes you are using TASM and have workloads. If you are on an appliance platform or are not a TASM user you will see null in WDID for all DBQLogTbl rows.

Thanks, -Carrie
Enthusiast
Hi Carrie,

What is the difference between a zero DelayTime and a NULL DelayTime? I can understand that if a request is not delayed then DelayTime will be NULL. What is a zero DelayTime?
Enthusiast
Thank you to both Shrinivas and Carrie. By default parsing is enabled for ODBC. We are running on version 13.10. I will attempt to test with my ODBC connections and see if this changes for my DBQL rows.
Is there any benefit or impact of disabling parsing within ODBC by default?
Teradata Employee
Shrinivas,

DelayTime will be reported as zero for any delay that is greater than zero but less than one second. DelayTime used to be reported in hundredths of a second in 12.0, but has been changed in current releases to report at one-second granularity, and it has an integer data type. The code will calculate a DelayTime value if it finds hundredths of a second are > 0, and initial values are stored internally in hundredths of a second. So with integer division, anything from 0.01 to 0.99 will show up as zero.

Thanks, -Carrie
Teradata Employee
Mark,

Unfortunately, I am not familiar with client conventions and am not able to provide an answer to your question above concerning disabling ODBC parsing. If Shrinivas is not able to explain tradeoffs to you, consider asking this question on one of the other Teradata forums or opening an incident with the support center.

Best regards, -Carrie
Enthusiast
Mark,

We have disabled "Parsing" for most of our applications, whose SQL conforms to Teradata-specific SQL. This improved application performance. At the same time it reduced the number of records logged to DBQL. That helped queries that hit the DBQL tables report correct numbers of queries per day (and/or concurrency).

Windows: Disable Parsing
odbc.ini: NoScan=[Yes | No]

Per the documentation's caution:
Be aware that once the above option is set, the ODBC driver will not convert any ODBC SQL syntax, and Teradata may not handle that syntax. If the SQL statements contain ODBC-specific syntax, do not enable this option. Setting this option while using ODBC-specific syntax in the SQL statement results in the Teradata Database reporting errors.
Enthusiast
Hello Carrie,

Thank you for explaining zero DelayTime.
Teradata Employee
Shrinivas, I appreciate your input to Mark on his ODBC questions. Thanks, -Carrie
Enthusiast

Hello Carrie,

My question is: if a query is in the delay queue, that means it is waiting for, say, the query count to fall below the defined throttle limit.

The DBA can release such a delayed query via Viewpoint/PMON. What happens in this state, given that the throttle limit has been reached? If the DBA-released query is executing, does that mean one of the currently active queries has to stop execution and enter the delay state to maintain the throttle limit?

Teradata Employee

When a query is manually released from the delay queue, or if a query is demoted into a workload that has a throttle whose limit has already been reached, the query count for that throttle will temporarily be allowed to be above the limit.  This means it will take longer for other queries in the delay queue to be released and run.  

Once a query begins executing, it is never interrupted and returned to the delay queue.

Thanks, -Carrie

Hi Carrie!

Just checking (because we seem to have different calculations at our site):

Is the total delay time in 13.10 included within the ElapsedTime, which is calculated as:  FirstRespTime - StartTime?   It's not included within FirstStepTime - StartTime, which is the parsing time, is that right?  

Thanks,

Joanne

Teradata Employee

Hi JoAnne,

Yes, you can subtract StartTime from FirstStepTime and that difference will include delay time.  We used to do that before V2R6 in order to capture the impact of throttle delays on response times.  However, it includes other things as well, like parsing time.  That is why it is preferable to use the DelayTime field to assess throttle delays.

StartTime is when the query first arrives on the PE, before it has undergone parsing or optimization.  DBQLogTbl FirstStepTime is filled in  prior to sending the first step, which would be the time after a query had been released from the throttle delay queue, if it had been delayed by a throttle.  The query delay happens in the dispatcher, after the query plan has been built but before the first step is dispatched.
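As a rough sketch of those time slices (standard DBQLogTbl columns; the interval arithmetic is illustrative):

    SELECT QueryID,
           (FirstStepTime - StartTime) DAY(4) TO SECOND(6) AS ParseAndDelayTime,      /* includes any throttle delay */
           (FirstRespTime - StartTime) DAY(4) TO SECOND(6) AS ElapsedToFirstResponse,
           DelayTime                                       AS ThrottleDelaySecs       /* prefer this for throttle impact */
    FROM DBC.DBQLogTbl;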

Thanks, -Carrie 

Enthusiast

Hi Carrie,

In the past, whenever I've had to refresh a table (delete all of the rows and insert all new rows), I have used two tables and have used a view to switch between the tables. So, if the view is currently pointing to table #1, then I would populate table #2 and then do a "REPLACE VIEW" command to now point to table #2. So, the users that are accessing the data through the view will be switched from one table to the next without the risk of accessing an empty table.

After replacing the view to point to table #2, I would typically lock the old table (table #1 in this case) in exclusive mode and delete all of the rows from it. Locking it in exclusive mode prevents the delete from happening if there are any remaining queries still running against the table.
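For reference, here is a minimal sketch of the swap I'm describing (database, table, and view names are made up):

    /* Repoint the user-facing view at the freshly loaded copy */
    REPLACE VIEW mydb.sales_v AS
    SELECT * FROM mydb.sales_2;

    /* Then empty the old copy; the exclusive lock cannot be granted
       while other requests still hold locks on the table */
    LOCKING TABLE mydb.sales_1 FOR EXCLUSIVE
    DELETE FROM mydb.sales_1;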

Now with the possibility of a query getting delayed in the delay queue and since the plan is determined when the query is submitted, but the locks are not taken on the tables at that time, it's possible that a query could generate a plan that accesses table #1 (in the example above), but doesn't execute until table #1 has been deleted (since no lock is acquired).

So, my question is, is there anything that can prevent this from happening? For instance, is there any way to force a query to be re-parsed after the delay when it is executed so that the query would now point to table #2?

What other methods can you think of that might prevent this negative situation?

I could wait and do the delete immediately before the insert in the next refresh cycle, but then I'd have two populated tables at all times which would take up more space. This approach wouldn't eliminate the risk, but it would greatly reduce it especially if the refresh cycle is daily or weekly.

Thanks,

Barry

Teradata Employee

Hi Barry,

Even if you were not delaying these queries, there is still a small window of time between query optimization and query start time where the replace view command could get submitted and where the same thing could happen.   But I can see how it is much more visible when you are delaying queries, and likely more frequent as well. 

About the best you can do to avoid either a delay on the exclusive lock or a bad plan because the query is accessing an empty table is to cause the query to error out when this out of sync condition exists so it can be resubmitted (against the correct table).  You probably know better than I do ways that could be accomplished, but here are some thoughts:

  • Drop the table instead of deleting it, so the query will not find the table and will error out.
  • Issue an alter of the table and change something inconsequential about it (probably too intrusive, and you'd need to reverse it). ALTERing the table changes the table's version number in the database so that it no longer matches the version number in the plan, which causes the query not to run.
  • Add an abort statement to the query, and if the table is empty, abort it and resubmit it.

Might be other ways to do this more cleanly.  Dropping the table is probably the easiest among the things I could think of.  Might want to post this question on Teradata Forum as well and get a broader audience.

Thanks,  -Carrie

Enthusiast

Thanks Carrie. I've never had a problem with the "small window" that occurs without the delay queue. I think that the best thing to do is to either drop or rename the old table rather than delete all of the rows so that the user would get an error in this case. I think that would be preferable to getting an incorrect result...

Thanks again,

Barry

Teradata Employee

I agree!

Good hearing from you Barry.  Keep in touch.

Thanks, -Carrie

Teradata Employee

Hi Carrie,

As always I value your postings, articles, and views. Based on the conclusion above, I wrote a simple script and found incorrect results. I did a deep dive into the query and found that inside DBC.DBQLogTbl the column data types do not match, and as a result my report was a bit off from the expected results. Your thoughts on this please. There is no major impact because of this, but I would like to know the reason behind it.

DelayTime FLOAT FORMAT '----,---,---,---,--9.999',   

WDDelayTime INTEGER FORMAT '--,---,---,--9',

Ref:

If your query is only delayed by a WD throttle: 

  • WDDelayTime = Time actually delayed
  • DelayTime = The same time as in WDDelayTime    (/* When I compare, the results do not match because of the decimal values */)

 If your query is only delayed by a system throttle:

  • WDDelayTime will be NULL
  • DelayTime = Time actually in the delay queue

If your query is delayed by both:

  • WDDelayTime = WD throttle delay time
  • DelayTime = Wallclock time delayed by both, does not identify system throttle contribution
Teradata Employee

My earlier posting (data type differences in DBC.DBQLogTbl) reported findings on TD 14.10... Thank you.

-- Murali

Teradata Employee

Murali,

You are correct that DelayTime and WDDelayTime are formatted differently.  Starting in 14.0, DelayTime was made a FLOAT so a user can see sub-second delay times.

WDDelayTime is still an integer, but in 14.0 it really doesn’t matter anymore.

That is because starting in 14.0 there is a single delay queue for all throttles.  The time a request is held in the delay queue due to different objects (system throttles vs. workload throttles, etc.) can no longer be differentiated.  So all you see as a user is a single delay time that accounts for all types of throttles.  If TASM workloads are being used, WDDelayTime will essentially be the same as DelayTime, except that the two are reported using different formats.  So you can just look at DelayTime and not be concerned with comparing the two fields.

In 15.0, WDDelayTime will be removed from the DBQLogTbl, so then it will be simpler as there will only be a single field to examine.

Thanks, -Carrie

Enthusiast

Hi Carrie,

One of our applications is missing its SLG very often, so I am just wondering which is the best place to analyze how often the SLG is being missed: is it the query log or ResUsageSPS?

Thanks

Teradata Employee

You would have to use DBQLogTbl to examine query-level detail.  The ResUsageSPS table reports information about workloads, but not individual queries within a workload.

DBQLogTbl has a field named ResponseTimeMet in 14.0 and 14.10.   It was called SLGMet in earlier releases.  You could use that field to assess how frequently queries from a given application are missing their SLG, and what times of day this is happening.  However, for this DBQLogTbl field to be filled in you have to have set an SLG in TASM for that workload.  
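A hedged sketch of one way to trend that, assuming the 14.x column name and that an SLG has been defined for the workload:

    SELECT WDID,
           EXTRACT(HOUR FROM StartTime) AS HourOfDay,
           ResponseTimeMet,             /* called SLGMet in earlier releases */
           COUNT(*) AS RequestCount
    FROM DBC.DBQLogTbl
    GROUP BY 1, 2, 3
    ORDER BY 1, 2, 3;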

Thanks, -Carrie

Hi Carrie,

Thank you for explaining the DelayTime. I have a question. If I see NULL in the DelayTime field (no WD or system delay) and I see a difference between StartTime and FirstStepTime in DBQL, can I assume the difference is the parser time? Or are there other times included in this difference? In other words, I am trying to measure the parsing time of an SQL statement. Thank you in advance!

Teradata Employee

Haritha,

Basically, you can take the difference between StartTime and FirstStepTime and assume that it represents parsing time.  But if there is delay time due to a throttle, then that difference will also include delay time.  You have said that DelayTime is NULL, so you are fine with this approach.  But sometimes DelayTime is reported as zero instead of NULL, and that may look as if no delaying has taken place; whether or not that is the case depends on what software release you are on.

In 13.10, DelayTime in DBQLogTbl uses an integer data type, so any DelayTime less than 1 second is reported as zero (not NULL).  Starting in 14.0, DelayTime has a FLOAT data type with format 9.999, so it will report very small delay times, to 3 decimal places.

 So if you are on 13.10 or earlier software and you see zero instead of NULL, there could be some small level of delay time that you are interpreting as parsing engine time.

Thanks, -Carrie

Hi Carrie,

Could you please help me understand how Teradata Viewpoint calculates "Responding" state time?

Can we derive it from DBQL or some other source after the query/request has completed?

Teradata Employee

I am not familiar with "Responding State" as it appears in Viewpoint.  You would have to ask that question of someone who has more extensive knowledge of Viewpoint portlets.

In DBQLogTbl, starting in 15.0, you will have both a FirstRespTime and a LastRespTime field.  You can subtract those two timestamps to determine how long the request has been returning response packets.

However, you must be using DBQL Algorithm #3 to get a value in the LastRespTime field; otherwise that field will be NULL.
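A minimal sketch of that subtraction (15.0 columns, with Algorithm #3 enabled):

    SELECT QueryID,
           (LastRespTime - FirstRespTime) DAY(4) TO SECOND(6) AS ApproxRespondingTime
    FROM DBC.DBQLogTbl
    WHERE LastRespTime IS NOT NULL;   /* populated only when DBQL Algorithm #3 is in use */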

Starting in 14.10 you can enable DBQL Algorithm #3 by going to the General category, parameter #64.  Algorithm #3 provides more accurate accounting, especially for queries with parallel steps, iterative queries, and aborted queries.  But it is not the default, because output from Algorithm #3 is not comparable with DBQL output from earlier algorithms.  You have to turn it on if you want to begin to use it.

Thanks, -Carrie