Tuning LIKE Operator

Database
Enthusiast

Tuning LIKE Operator

Hi 

I have a table of 100 GB with 80 columns in it,, it has around 300 million rows. I'm using like operator to fetch the rows (string entered from front end tool). Currently its taking around 10-20 secs to fetch the rows, I need it to bring down to 2-5 secs.

Stats are collected on those columns

I've tried creating Join index, Hash Index , secondary index  but nothing seems to be used when i look into the EXPLAIN Plan. It still goes for full table scan.

Following is the table structure

Create SET  table t1

(

id integer not null,

start_dt date not null,

end_dt date not null,

col1 VARCHAR(100),

col2 VARCHAR(100),

col3 VARCHAR(100),

.

.

) unique primary index(id,end_dt)

index(id);

I use the following select queries

Select * from t1 where col1 like '%test%'

Select * from t1 where col2 like '%test%'

Please suggest me in creating the proper hash index or join index which will improve my select performance. 

6 REPLIES
Senior Apprentice

Re: Tuning LIKE Operator

Did you check if the optimizer misses a statistic?

You can try a join index with the same PI as the base table and the column searched by LIKE:

create join index ... as
select id,end_dt, col1
from tab
primary index (id,end_dt)

The optimizer might do the LIKE on the JI and do a PI join to the base table.

Enthusiast

Re: Tuning LIKE Operator

I tried it,, the optimizer choses JI only when '=' is used on where columns, else it goes to full table scan.

Teradata Employee

Re: Tuning LIKE Operator

For any LIKE match that is not a left substring match (eg LIKE 'test%') the optimizer has no way to match the requested value against any statistics. Therefore regardless of any indexes that might exist the optimizer is going to choose a pessimistic selectivity for the LIKE clause which will cause it in turn to see the table scan as the lower cost alternative. You will see this in the explain in the estimate of the number of rows returned. For an arbitrary LIKE expression with no other qualifiers against other fields you will not be able to cause the optimizer to do other than the scan.

If the string search is not completely generalized then there may be something you can do to create some form of word index or string index like a search engine would do that would allow you to look up the string there and link back to the records you want to retrieve.

Senior Apprentice

Re: Tuning LIKE Operator

What's the selectivity of your LIKE-condition?

I just tried it on a TD15 VM and it's using a NUSI (or JI) if the number of distinct values is high (in my case 150,000 within 9,000,000 rows).

As Todd said, it still has to do a FTS on the NUSI/JI, but if this is small compared to the base table and  the estimated number of row is low it's still less work than a FTS on the base table.

Teradata Employee

Re: Tuning LIKE Operator

Try,

create another table (and mantain it, when loading t1)

create table aux_t1

(id integer not null,

end_dt date not null,

col1 VARCHAR(100),

col2 VARCHAR(100))

unique primary index(id,end_dt);

insert into aux_t1 select id, end_dt, col1, col2 from t1;

collect stats on aux_t1.index(id,end_dt);

(this new table is smaller)

and then transform SQL in

select t1.*

from t1 , aux_t1

where t1.id = aux_t1.id

and t1.end_dt =aux_t1.end_dt

and aux_t1.col1 like '%test%';

If it works, TD will do a full table scan of the smaller table  (much faster than do it on the big table) and then do a merge join with the big table.

This is what we do to avoid a full table scan of call data record, when we want to obtain a few records thru like condition,

eg. select * from cdr where IMEI like '%xxxxxxx%'.

Let me know if it works.

Best Regards

Gerardo Martinez

TD Spain

Enthusiast

Re: Tuning LIKE Operator

Thanks Everyone,

I've gone with Gerardo Martinez suggestion and it works,,