To Peek or Not to Peek, That is the Question

Database
The Teradata Database channel includes discussions around advanced Teradata features such as high-performance parallel database technology, the optimizer, mixed workload management solutions, and other related technologies.
Teradata Employee

To Peek or Not to Peek, That is the Question

Cached plans are good. For repetitive queries, cached plans remove the overhead of parsing and optimizing, and this helps short work perform more consistently. But sometimes cached plans, being general-purpose and one-size-fits-all, are not able to take advantage of specific optimizations that could help selective executions of the query if only the optimizer could see those parameterized SQL values. New in Teradata 12, the optimizer has the opportunity, under specific conditions, to peek at values in a parameterized SQL request. This can lead to better plans for some queries.  Here’s when it happens and how it works.

Sometimes, by incorporating specific values into the production of a parameterized query plan, better plan decisions can be made, such as which of several tables to join first, or what type of access to use for a table. By the way, this peeking approach only applies to parameterized requests of some complexity. Simple requests, with equality on a primary index, a unique primary index, or a partitioning column, will never be subject to peeking and will have the same cache flag values as they always have after the plan has been cached, ‘T’.

How Peeking Works

So, assuming your parameterized request is not such a simple equality request, here’s how peeking works. The first time a parameterized request is seen, the parser peeks at the values in the USING clause, and a plan specific to those values is produced.

This will cause the cache flag for that query’s execution to be set to ‘S’ for ‘specific plan’. At this point in time, no plan is actually cached for re-use, because a value-specific plan was produced. A value-specific plan may be different for each different execution of the same parameterized request, and will only be cached when the exact request, including values, is seen for the second time.

After this first ‘specific plan’ execution is complete, metrics from the execution, as well as the estimates that were produced, are saved in the request cache, in preparation for additional decisions that will be made should this request be seen again. If the parsing engine (PE) time that resulted from the specific plan was a very small percent of the total query execution time, then the optimizer may set a flag to always use a specific plan for this query. If that decision is made, then the cache flag going forward will contain an ‘A’ for ‘always specific’ and no generic plans will be generated for this request as long as its metrics remain in the request cache.

On the other hand, if the PE time that resulted from the specific plan is non-trivial, the second time that parameterized request is seen a generic plan is produced resulting in a cache flag ‘G’. Once both a specific and a generic plan have been generated, the estimates and run-time metrics they each produced can be compared side by side, and further decisions can be made whether to always produce the specific or always produce the generic plan.

A Common Pattern

A common pattern I often see, assuming the same parameterized request is repetitively executed on the same PE, is a cache flag sequence of ‘S’, ‘G’, ‘T’, ‘T’, ‘T’… The first time the query is seen, a specific plan is produced (‘S’), the second time a generic plan (‘G’), and from that point on, the same generic plan is executed (‘T’) until that plan is flushed from the cache.

Examples

Two different examples of how cache flag values can change in Teradata Database 12.0 are shown below.

Parameterized SQL:

This first example of selected DBQLogTbl table output illustrates rows executing the same parameterized request, sorted by start date and start time. Notice the different values for the cache flag, and the different start date and times that are reported.

The 1st row represents a parameterized query that is seen for the first time by this PE. Therefore the values in the USING clause are peeked at, a specific plan is built, and an ‘S’ appears in the cache flag column.

The 2nd row represents the second time the same parameterized request is seen, and because the parsing time used to generate the specific plan was not considered low enough to discount producing a generic plan, the generic plan is produced. Note that this results in a similar parsing time (0.06 seconds) as shown in the 1st row. Now the cache flag reads ‘G’ for generic.

The 3rd row represents the query seen for the 3rd time within a short period of time on the same PE. In this case the ‘T’ in cache flag indicates that this query used the generic cached plan that had been put in the cache by the 2nd query execution. Note that the parsing time is now zero, which is what you would expect when the plan that was used came from the request cache.

The 4th row in the DBQL output is identical to the 3rd row with a cache flag of ‘T”, which indicates that the plan is already in the cache.

With the 5th row, the same query could have continued to use the cached plan, except that so much time has passed (16 days) that the cached plan on the PE has been surely flushed. The process of specific plan, followed by generic plan will begin all over. The request cache on each AMP is flushed every 4 hours.

The 6th row is the same as the 5th, with the passage of time causing the metrics concerning the previous specific plan to be flushed from the cache.

Non-Parameterized SQL:

In the following example, a non-parameterized request is repeated 5 times with only 3-4 seconds between each execution. For this query a blank or a ‘T’ shows up in the cache flag column:

Because the request does not contain parameterized SQL and there is no USING clause, as the query above had, no peeking can take place and the ‘S’ or a ‘G’ in the cache flag column will never appear. A decision to use a specific or a generic plan is never made under these conditions. However, what this example is illustrating is that a non-parameterized query will have its plan cached when it is seen the 2nd time. This means that the 3rd such query will be able to take advantage of a cached plan, even though the query does not contain parameterized SQL.

In the first two entries above, the query is parsed, and the remaining 3 executions of that query use the cached plan, as indicated by the ‘T’ in the cache plan column.

Tags (2)
14 REPLIES
Enthusiast

Re: To Peek or Not to Peek, That is the Question

Hi Carrie,

In case of Non-Parameterized SQL, when a plan has been generated and is being used for its subsequent runs, what happens when the values supplied for WHERE clause have skewness between different runs?

1. Would it use the same plan or would it generate a new one after executing the query and comparing the elapsed timings between 2 runs?

2. Is there a DB setting where in all Non-Parameterized SQL is forced to convert literals into bind variables?

Thanks
Teradata Employee

Re: To Peek or Not to Peek, That is the Question

Non-parameterized SQL will not generate a generic plan, re-usable and will not undergo peeking. That is because there are no values to peek at unless parameterized SQL has been used.

Non-parameterized SQL will always build a specific plan. It is possible that the plan could be re-used (even though it is not strictly "re-usable", resulting in a cache flag value of 'T'. That would only happen if the exact same SQL with all the same values were submitted. Bu that is not the same thing as re-using a generic plan, which can also give you a cache flag value of 'T'. In the case of parameterized SQL you will have a cache flag of 'G' first, indicating that there was a generic plan built. With non-parameterized SQL you will never get a cache flag of 'G', only spaces or 'T'.

There is no setting that I am aware of that will cause literals to be anything other than literals.

Thanks, -Carrie

Enthusiast

Re: To Peek or Not to Peek, That is the Question

If one has parameterized SQL and 'S' CacheFlag is associated with better executions, what options exist to not get the lesser 'T' and 'G' plans for a given userID? To not prepare the SQL would require some level of rewrite (and I'm told it is standard practice to prepare the SQL in Java).
Thank you for the explanations above; they are very clear and helpful.
Richard
Teradata Employee

Re: To Peek or Not to Peek, That is the Question

Richard,

You will see A (Always specific) as the cache flag for a query if the optimizer determines through its previously saved metrics that the specific plan (S the first time) produces a better overall execution than the generic plan (G). Once you get an A cache flag for the query, as as long as the metrics behind that query plan remain in the request cache you will continue to get an A for the query and continue to get specific plans generated for that query.

That sequence of events will change once the query plan and any associated query plan metrics have been flushed from the request cache. At that point you will start the sequence all over, S, G, followed by (A or T) repetitively. Most queries will be flushed from the request cache at a minimum every 4 hours, of whenever a DDL is issued that effects one of the tables with the query, or the table header for one of the tables in the query is modified.

You will usually always get one S and one G for a properly parameterized query, after which time the optimizer makes a decision about which to go with. So there is no mechanism available to suppress at least one generic plan from being produced. However, if you think the optimizer has made the wrong decision by continuing on with generic plans (T) when the specific plan (A) would have been the better choice, please open an incident with the support center. There may be some areas in the peeking algorithms that need to be tightened up, and we certainly want to work with you on that if there are opportunities to improve the feature.

Thanks, -Carrie

Re: To Peek or Not to Peek, That is the Question

Hello Carrie:

We have a parameterized query that has a much higher performance with Specific Plan (S) than using Generic Plan (G) Is there any way of tell the Plan Engine of Teradata to use always Specific Plans?

Thanks for your attention!

Juan Antonio

Teradata Employee

Re: To Peek or Not to Peek, That is the Question

If you don't want generic plans to be used, the easiest thing to do is to remove the parameterization from the request.  That will ensure you always get a specific plan.  

There are some cases where you ALWAYS want specific plans, such as when you are passing parameters for a range constraint against a PPI table, or if you want to use a sparse join index and you have parameterized the sparse-defining column value.  In those cases it is recommended that you not use parameterized SQL.

If you believe this feature is not working appropriately, please open an incident with the support center so it can be improved based on your experience.

Thanks, -Carrie

Enthusiast

Re: To Peek or Not to Peek, That is the Question

Hi Carrie,

In the first two entries, cache flag is empty. Whether this means that these queries needs to be  reviewed to determine additional optimization? Please advise.

Thanks in advance.

Teradata Employee

Re: To Peek or Not to Peek, That is the Question

The first two entries are not cached because the SQL does not contain parameterized SQL.

As it says in the text below the table you are referring to:  "What this example is illustrating is that a non-parameterized query will have its plan cached when it is seen the 2nd time. This means that the 3rd such query will be able to take advantage of a cached plan, even though the query does not contain parameterized SQL."

That is why there are two executions that do not use cached plan, and only the third does.  The query is not parameterized.

Thanks, -Carrie

Enthusiast

Re: To Peek or Not to Peek, That is the Question

Thanks for your response Carrie,

Whether the Cache flag has anything to do with query optimization? Say, if it is empty then the query has to be optimized?

Please advise.