Rejection CAN be Rewarding

Teradata Employee

It’s taken me a while to warm up to the idea that rejecting queries can be good. I’ve always subscribed to the Hippocratic oath of workload management: “Above all, do no harm to a query.” Yet lately I’m seeing more and more Teradata DBAs grimly, yet firmly, embracing filter rules as a new weapon in their arsenal. And they’re starting to like it.

Where is this lethal trend coming from?

Not to leave any reader behind: a filter is a workload management rule that evaluates, and potentially rejects, a query before it begins execution, based on specific characteristics of the query. The filter check is made after the optimizer builds the query plan but before the first query step is dispatched to the AMPs, so all the detail within the query plan is available to the code that enforces the filter. The tables to be accessed, the join geography to be used, and the time and row count estimates for the query are all known to the filter rule.

Filters are easy to define, and they can be used with or without TASM. As one example, you could create a filter to reject all queries that attempt a full table scan of your CDR table between 8 AM and 5 PM, Monday through Friday.
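To make that concrete, here is a sketch of the kinds of queries such a filter would sort out. The table and column names (cdr_table, call_date) are hypothetical, and the sketch assumes the table is partitioned by call date:

-- Rejected during business hours: no constraint on the partitioning
-- column, so the optimizer's plan is a full table scan.
SELECT COUNT(*) FROM cdr_table;

-- Allowed: the predicate on call_date lets the optimizer use partition
-- elimination rather than scanning the entire table.
SELECT COUNT(*) FROM cdr_table WHERE call_date = DATE - 1;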

What I think has brought out this new ruthlessness on the part of some Teradata DBAs is the fairly recent ability to create a filter rule that will reject an unconstrained product join, but only when the join involves a very high number of rows (1 billion, for example). Defining a filter that combines the type of join with estimated row counts gives DBAs the ability to stop badly-written queries before they start eating up database resources, instead of after. It’s like pulling weeds out of your garden before they take root.
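As an invented illustration of what such a rule catches, consider a query whose join condition was accidentally left out (the orders and customers tables and their row counts are hypothetical):

-- With the join condition missing, the optimizer plans an unconstrained
-- product join: every qualifying orders row paired with every customers row.
SELECT o.order_id, c.cust_name
FROM   orders o, customers c   -- missing: AND o.cust_id = c.cust_id
WHERE  o.order_date = DATE - 1;

-- At 10 million qualifying orders and 200,000 customers, the estimated
-- spool is 2 trillion rows, far past a 1-billion-row threshold, so the
-- filter rejects the request before the first step is dispatched.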

Plus, my sources tell me that non-trivial levels of CPU processing power are being conserved on their platforms since they activated such filter rules, and end-user queries are experiencing less contention for resources as a result. This means more conventional queries may be able to run and complete sooner, and a greater number of new applications might be supportable on the same hardware.

So, if like me, you squirm at the thought of having to make life or death decisions about a query before it gets a chance to run, get over it. The rewards may be larger than you think.

6 Comments
Enthusiast
What would be your suggestion for getting to this point of enabling rejection?
1. Ensure the estimated time from the explain plan is accurate.
2. Use the estimated/actual difference as a starting point for stats collection.
Teradata Employee
I think your point is a good one. You have to look at the estimates that the optimizer is producing for the category of queries that you would like to reject, if you plan to base rejection on estimated processing time. This can be done with DBQLogTbl output.
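As a rough sketch of that kind of review (assuming standard DBQL logging is enabled; EstProcTime holds the optimizer's estimate and AMPCPUTime the CPU actually consumed):

SELECT username,
       queryid,
       estproctime,   -- optimizer's estimated processing time
       ampcputime     -- actual AMP CPU seconds consumed
FROM   dbc.dbqlogtbl
WHERE  CAST(starttime AS DATE) >= CURRENT_DATE - 7
AND    estproctime >= 600   -- candidate rejection threshold
ORDER  BY estproctime DESC;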

I've also seen cases where rejection was based on other things, like estimated row count combined with a product join. The explain text can be helpful with determining how to set those thresholds.

Comparing estimated to actual seconds could be useful as a starting point for stats collection, but those two will always be different. In fact, estimated time is not really the same thing as elapsed seconds or CPU seconds, so the comparison would not be apples to apples. Usually, looking at query selection columns and join columns as candidates for statistics collection is a better way to go.
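If object logging is enabled, one way to surface those candidate columns is DBQLObjTbl, which records how often each column was referenced; the table name below is just a placeholder:

SELECT objectcolumnname,
       SUM(freqofuse) AS total_uses   -- how often the column was referenced
FROM   dbc.dbqlobjtbl
WHERE  objecttablename = 'Sales'      -- placeholder table name
AND    objecttype = 'Col'
GROUP  BY 1
ORDER  BY 2 DESC;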
Enthusiast

Hi Carrie,

I wanted to ask the same question AbeK asked, but for a different reason. We have a stats process owned by the application DBAs and other teams. As a system DBA, I am noticing these days that a couple of important tables, with an SI and PI on them, have not had their stats refreshed for the past 180 days (I have a SQL query to catch tables whose stats have not been refreshed for N days).
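On Teradata 14.0 and later, where DBC.StatsV exposes LastCollectTimestamp, a minimal sketch of that kind of check could look like the following (on 13.10 the older dictionary structures would have to be queried instead; the database name is a placeholder):

SELECT databasename,
       tablename,
       columnname,
       lastcollecttimestamp   -- when stats were last gathered
FROM   dbc.statsv
WHERE  databasename = 'Prod_DB'   -- placeholder database name
AND    lastcollecttimestamp < CURRENT_TIMESTAMP - INTERVAL '180' DAY(3)
ORDER  BY lastcollecttimestamp;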

We are on TASM 13.10, on a SLES 10 system. I am thinking of defining an event in all states that can send a notification to the DBA, based on the following criteria, to catch the most critical tables in the system with missing stats or stats that are not refreshed on a regular basis:

1) Create a Period Event for 7 days a week.

2) In the Notification section, under Run Program, have my SQL run and send an email to the DBA.

Do you think it is a good idea to make use of period events for this stats monitoring?

Or, my second thought is to create a filter in 'Warning Only' mode, log the occurrences where there is a huge difference between ESTIMATED and ACTUAL seconds, then analyze and send a report on the missing stats to the concerned users.

Appreciate your thoughts.

Teradata Employee

I believe you can use a period event type to do this.

I have not done this myself, but you should be able to set up a TASM event that triggers a program to run at specific times, by doing the following:

1.  In the State Matrix, click on the pencil (editing) inside the Planned Environment box

2.  When you see "Available Events", hit the plus sign

3.  Select the Period Event Type and give the event a name

4.  Do not check "Create new Corresponding Planned Environment" unless you really want to create a new one; you don't need to

5.  Select a day of the week (like every Sunday) you want this program to run

6.  Select as an action the program that you want to run

For this to work you must have already entered the name of the program you want to run in a particular location in Viewpoint.  That's a little tricky to do.   I have been told that the steps are documented in Chapter 2 of the Teradata Alerts Configuration Guide, so check that out.   If you get stuck, post a question on Teradata Forum, the Viewpoint section, and see if someone out there can help you.

In terms of using a filter in warning mode as you described, that will not work.  Filters only support the standard TASM classification criteria in a straightforward way, and do not allow calculations across classification criteria, as would be needed to determine the difference between estimated and actual time.  In addition, at the time a filter rule is applied, the actual number of seconds is not known because the request has not yet executed.

You could use DBQLogTbl output after the fact to determine this, but not before the query executes.
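A sketch of that after-the-fact check, with the caveat above that estimated time and CPU seconds are not the same units, might flag requests whose actual CPU lands far above the estimate (the 10x ratio is an arbitrary illustration, not a recommendation):

SELECT username,
       queryid,
       estproctime,    -- optimizer's estimate
       ampcputime      -- actual CPU consumed
FROM   dbc.dbqlogtbl
WHERE  CAST(starttime AS DATE) >= CURRENT_DATE - 7
AND    estproctime > 0
AND    ampcputime > 10 * estproctime   -- actual far above estimate
ORDER  BY ampcputime DESC;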

Thanks, -Carrie

Enthusiast

Hi Carrie,

Hope you are doing well. I would appreciate your help here.

I am trying to avoid full table scans on one of our big tables. I created a filter, and it works fine for queries that reference only this table. But if we join to other lookup/dimension tables, it throws a filter error because it sees a full table scan on those small tables.

Example: Table name: Sales, partitioned on 'txn_date'

The filter behaves as expected for the queries below:

sel * from Sales; --> this throws a filter error, as expected.

sel * from Sales where txn_date = date-1; --> as expected, the filter doesn't stop this query; it returns rows.

But I am expecting the query below to be successful as well:

sel d.dpt_id, s.total_sales
from Sales s
    ,dept d
where s.dpt_id = d.dpt_id
and txn_date = date-1
;

The filter is stopping this query too, because the plan does a full table scan on table 'dept'.

As a workaround I can put table 'dept' in the 'exclude' objects list, but that list would become very large.

In the future, if new tables come along and join with this big table, it will throw the error again.

Please let me know if there is any way to achieve this.

Briefly: I don't want to allow any query that accesses table 'Sales' without a condition on 'txn_date'; I don't care about full table scans on any other tables.

Appreciate your help.

Our Viewpoint version: 14.10

Database Version: 14.00.07.15

Thanks,

Amar.

Teradata Employee

Amar, 

You need to add the table as the target selection criterion first, then specify no table scans as a subcriterion for that table:

If you want the filter rule to apply only to full table scans on Table X, then add Table X as a target object.  After you add the table you will see a tiny pencil to the right of the table name under the "Included" label, next to the trash can.  Hit that pencil and you will get a screen where you can add subcriteria, which will apply only to that table.  If you click on the "Full Table Scan" option, and then under it click on "Include", you will ensure that only full table scans against Table X are rejected. Table scans against other tables will not be rejected.

Thanks, -Carrie