I have a similar question, though maybe more detailed.
Given that I can collect CPU, IO, and Spool utilization for 2 queries, how do I combine these numbers to come up with a meaningful comparison?
If two queries result in essentially the same elapsed time on a quiescent system but have different values for these metrics, how do I determine which is more efficient overall?
In my case, our system seems to be IO bound, so are IO points worth more in determining some sort of overall score?
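One way to make the comparison concrete is to fold the metrics into a single weighted score. The weights below are purely illustrative, there is no standard formula (as the replies note); on an IO-bound system you would weight IO heavier than CPU. A minimal Python sketch:

```python
# Hypothetical weighted scoring of two queries' DBQL-style metrics.
# The weights and scaling factors are assumptions for illustration,
# not a Teradata formula; tune them to your system's bottleneck.

def query_score(cpu_seconds, io_count, spool_bytes,
                weights=(1.0, 3.0, 0.5)):
    """Return a single comparable cost score (lower is better)."""
    w_cpu, w_io, w_spool = weights
    # Scale IO and spool down to magnitudes comparable with CPU seconds.
    return (w_cpu * cpu_seconds
            + w_io * io_count / 1000.0
            + w_spool * spool_bytes / 1e9)

# Two made-up queries with similar elapsed time but different profiles:
q1 = query_score(cpu_seconds=120.0, io_count=500_000, spool_bytes=2e9)
q2 = query_score(cpu_seconds=80.0,  io_count=900_000, spool_bytes=1e9)
# With this IO-heavy weighting, q1 (more CPU, less IO) scores as cheaper.
```

The point is only that once you commit to weights that reflect your bottleneck, "which query is cheaper" becomes a consistent, repeatable comparison.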
It is as you stated - the query cost depends on your workload and system. In an IO bound system you would rate IO higher than in a CPU bound system. But I think there is no golden rule for all systems.
I usually check these metrics and elapsed time - but elapsed time will depend heavily on concurrent workload and needs further consideration.
Also make sure you run the queries multiple times and in different orders, as you might otherwise see differences due to caching.
The skew can be very important - sometimes a query which is using more CPU but has a much lower CPU skew can be better for the whole system.
Instead of CPU skew you can check impact CPU = max_vproc_CPU * number of vprocs.
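The effect of skew is easy to see numerically. A sketch of the impact CPU formula above, plus one common way to express skew as a percentage (the per-vproc CPU figures are invented for illustration):

```python
# Impact CPU: total CPU as if every vproc (AMP) consumed as much as
# the busiest one - the formula from the post above.

def impact_cpu(vproc_cpu_seconds):
    """max vproc CPU * number of vprocs."""
    return max(vproc_cpu_seconds) * len(vproc_cpu_seconds)

def cpu_skew_pct(vproc_cpu_seconds):
    """One common skew definition: 100 * (1 - avg / max)."""
    avg = sum(vproc_cpu_seconds) / len(vproc_cpu_seconds)
    return 100.0 * (1.0 - avg / max(vproc_cpu_seconds))

well_distributed = [10.0, 11.0, 9.5, 10.5]   # 41.0 s total, low skew
skewed           = [2.0, 2.5, 3.0, 33.0]     # 40.5 s total, one hot AMP

# Nearly identical total CPU, but the skewed query's impact CPU is
# 132 s versus 44 s - it effectively occupies far more of the system.
```

This is why a query that burns somewhat more total CPU but spreads it evenly can be better for the system as a whole.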
Well, everyone has their own way of looking at the parameters that give meaningful information for their system. But one thing I always suggest at customer sites is not to judge the performance of a query by its elapsed time. It is misleading, as it depends on various conditions on the system such as concurrency, the workloads running at the time, blocking, runaway queries, and the overall system load. What is consistent, no matter when you run the query, is the query execution path, the resources used, and their breakdown.
Depending on which way your system is bound, look at either the CPU-related or the IO-related consumption: SpoolUsage, impact CPU, UII. These are the parameters you want to look at, with the assumption that your query is good enough, i.e. follows the right coding standards.
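UII and PJI are usually given as simple CPU/IO ratios in the Teradata community. The definitions below are the commonly quoted ones, not official formulas, and the example figures are invented:

```python
# Community-quoted ratios built from DBQL-style figures (assumptions,
# definitions vary between sites):
#   PJI ("Product Join Index"):   CPU-heavy work per IO - high values
#                                 can hint at product joins.
#   UII ("Unnecessary IO Index"): IO per unit of CPU work - high values
#                                 can hint at large scans / missing indexes.

def pji(amp_cpu_seconds, total_io_count):
    return amp_cpu_seconds * 1000.0 / total_io_count

def uii(amp_cpu_seconds, total_io_count):
    return total_io_count / (amp_cpu_seconds * 1000.0)

# Example: 50 CPU seconds against 200,000 IOs gives UII = 4.0, PJI = 0.25.
```

Tracked over time per query, these ratios stay far more stable than elapsed time, which is what makes them usable for scoring.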
I won't argue against your statement on elapsed time in general.
Where it becomes an issue is when elapsed time is what the SLAs are defined on. Tactical queries can be of this sort. And here sometimes (rare, rare, rare) more CPU can be faster. Here the TASM settings / assignments will be the key, besides the SQL.
The other thing is also that sometimes CPU figures are not showing the truth. See http://forums.teradata.com/forum/teradata-masters/java-udf-cpu-not-tracked-in-dbql
So keeping an eye on it and checking whether elapsed time is in line with expectations shouldn't be a mistake. But 99% of the time, the other KPIs will give you better and more accurate insight into the differences between queries.
Vasu, what is UII? I am not familiar with that.
I will start considering skew as well.
To Ulrich's point, we do have SLAs defined on elapsed time, which is problematic due to the reasons given. I can have many very efficient queries and still miss my SLAs. I am trying to get to something more empirically based, thus my desire to be able to "score" queries in a consistent and easily compared manner.
Any insight on how to use these other metrics is appreciated.
I've been avoiding spool only because it doesn't seem to offer more in determining performance than IO does. As spool increases, so does IO. Maybe I am missing something here. Feel free to enlighten me.
And thanks again for the input.
As you have SLAs on elapsed time, you need to sit down with the workload management people, as this is the main area where you can influence elapsed time - assuming that you have already developed the best possible SQL / process.
Could anyone give me an example of how the cost of a query is calculated, using a simple query?