Alias performance

Database
Enthusiast

Alias performance

Hi, I am new to SQL and to Teradata so I am sorry if this is a stupid question. I studied real-time programming in school using C. Recently I saw a SQL statement in a BTEQ that didn't sit right with me. Unfortunately I don't have access to Teradata at work. I will have to wait until I get home to play with it but I can't find any fault in my logic. So here goes.

The purpose of the BTEQ is to extract data and it uses a select statement. It extracts a lot of columns from many tables in a large database, so I think performance is an issue that should be considered. Just a short example would be:

select col1, col2, case
when substr(col3,1,2) in ('AA','BB','CC') then 1
when substr(col3,1,2) in ('AB','BC','CD') then 2
when substr(col3,1,2) in ('AC','BD','CE') then 3
from table1
where ...

The thing that I don't like is that it appears to call substr 3 times in my example for the millions of rows in the table. In the real script it is about 10 times and a similar proccess is repeated for a few columns.

So, finally my question, what is the performance of an alias? How is it handled? Here is my idea:

select col1, col2, case substr(col3,1,2) as temp
when temp in ('AA','BB','CC') then 1
when temp in ('AB','BC','CD') then 2
when temp in ('AC','BD','CE') then 3
from table 1
where ...

I did some research about case statement. Apparently it can act like a case in C that I am used to where you have a variable and check if it is equal to things; or in SQL it can evaluate a boolean expression every time.

So, is it possible to only call substr once and store it in a place holder? Or am I thinking to much in C and an SQL alias is just to pretty up the code but only replaces the alias when it is parsed? Or maybe it is smart enough that it only executes the substr once in the original statement, like how you can't reference column aliases in group by.

All I can find online about aliases is that they help clean up the code by not making you type everything out repeatedly, or rename a column when it is output.

I think I have said enough. Thanks for the help.
Tags (3)
3 REPLIES
Enthusiast

Re: Alias performance

Sorry, I meant HAVING, not GROUP BY.

if I did

select col1, count(col2) as total
from table1
GROUP BY col1
HAVING count(col2) > 1

I have to use count(col2) instead of the alias total; and I am pretty sure it doesn't do the count again in the HAVING clause.
Junior Contributor

Re: Alias performance

The optimizer should be smart enough not to re-calculate the same substring repeatedly.
If you look at explain you will see (on TD12) that the optimizer just replaces the alias with the base calculation, so an alias is just conveniant, nothing else than a search&replace done in a text editot.

In Teradata an alias can be used in any place, so "total" can be used in HAVING, but this is an SQL extension which is AFAIK only available in Access, too.

Your "idea" query is not valid SQL but using an old Teradata extension you could write:
case
when substr(col3,1,2) (named temp) in ('AA','BB','CC') then 1
when temp in ('AB','BC','CD') then 2
when temp in ('AC','BD','CE') then 3

According to ANSI SQL you need a Derived Table:
select
case
when temp in ('AA','BB','CC') then 1
when temp in ('AB','BC','CD') then 2
when temp in ('AC','BD','CE') then 3
...
from (select substr(col3,1,2) as temp from ...) as dt

The optimizer will probably resolve that nested statement and procude the same plan.

Dieter
Enthusiast

Re: Alias performance

Hi Dieter,

Could you please explain the how optimiser resolves the alias names and whats the use of it?

We have experienced product joins coz of not using proper alias names..Can we use the alias name as a table name?Please explain in detail