Hi to the performance experts among you.
We have a pretty large key table, with at least one VARCHAR(30) column storing the real world source key, here SOURCE_KEY_COLUMN1, and one bigint column storing the surrogate key (PI), here EVENT_ID.
CREATE MULTISET TABLE KEY_EVENT [...]
SOURCE_KEY_COLUMN1 VARCHAR(30) CHARACTER SET UNICODE NOT CASESPECIFIC DEFAULT '',
UNIQUE PRIMARY INDEX ( EVENT_ID );
Surrogate key is PI for best distribution of course and for good join access to the table.
But we have a lot of queries accessing the table with condition: POSITION('XYZ' in SOURCE_KEY_COLUMN1) > 0 or SOURCE_KEY_COLUMN1 LIKE 'XYZ%'.
Is there any way to index to avoid a full table scan on these partial string match types of access.
Thanks in advance
Is there any rule how to extract info from that column like "check the first 3 chars"? Some kind of intelligent key compsed from subparts? Them splitting the key into those subparts and adding NUSIs for each column might help.
You probably tried a NUSI or Join Index already, but POSITION or LIKE is always tough for the optimizer, even if there are stats it's hard to get accurate estimations.
Or you might create a new table with columns event_id and source_key only (same UPI and much smaller than the original table). Then you can use LIKE on it and join back to the base table (fast AMP-local). In fact this could be done with a Calumnar table, too.
I feel that creating new table with event_id and extracts of event_id and source_key_column1 one time if it suits business requirements may help -with UPI event_id for joining back. But testing is required - collect stats, explain.