Avoid all rows scan on partial varchar access

Database
TK
Teradata Employee

Avoid all rows scan on partial varchar access

Hi to the performance experts among you.

We have a pretty large key table, with at least one VARCHAR(30) column storing the real world source key, here SOURCE_KEY_COLUMN1, and one bigint column storing the surrogate key (PI), here EVENT_ID.

CREATE MULTISET TABLE KEY_EVENT [...]

     (

      EVENT_ID BIGINT,

      SOURCE_KEY_COLUMN1 VARCHAR(30) CHARACTER SET UNICODE NOT CASESPECIFIC DEFAULT '',

      [...])

UNIQUE PRIMARY INDEX ( EVENT_ID );

Surrogate key is PI for best distribution of course and for good join access to the table.

But we have a lot of queries accessing the table with condition: POSITION('XYZ' in SOURCE_KEY_COLUMN1) > 0 or SOURCE_KEY_COLUMN1 LIKE 'XYZ%'.

Is there any way to index to avoid a full table scan on these partial string match types of access.

Thanks in advance

TK

2 REPLIES
N/A

Re: Avoid all rows scan on partial varchar access

Is there any rule how to extract info from that column like "check the first 3 chars"? Some kind of intelligent key compsed from subparts? Them splitting the key into those subparts and adding NUSIs for each column might help.

You probably tried a NUSI or Join Index already, but POSITION or LIKE is always tough for the optimizer, even if there are stats it's hard to get accurate estimations.

Or you might create a new table with columns event_id and source_key only (same UPI and much smaller than the original table). Then you can use LIKE on it and join back to the base table (fast AMP-local). In fact this could be done with a Calumnar table, too.

Re: Avoid all rows scan on partial varchar access

I feel that creating new table with event_id and extracts of event_id and source_key_column1 one time if it suits business requirements may help -with UPI event_id for joining back. But testing is required - collect stats, explain.