varchar length performance implications

Database
Enthusiast

varchar length performance implications

Hello,
I have 2 tables populated with the same data (2.7m rows), the only difference being the max length of the varchar columns and decimal columns.
given the data is the same, seems to me that aggregation performance should be the similar,
but I am seeing wildly different runtimes 5m vs 21min.
tablesize is 658m vs 636m, skew is same for both tables.

I've collected/refreshed stats of the same columns from the tables.

table differences
Table A Table B
ColumnName Type Length ColumnName Type Length
trip_type VARCHAR 4 trip_type VARCHAR 128
srch_hotel_text VARCHAR 200 srch_hotel_text VARCHAR 1,800
source_of_traffic DECIMAL 1 source_of_traffic DECIMAL 8
agency_code VARCHAR 40 agency_code VARCHAR 128
row length 821 2640

Any thoughts on why?
I want to keep the longer lengths, but not at 4x performance decrease.

query
EXPLAIN SELECT
x.SearchOptionsEventTimeID,
x.SearchOptionsEventTime,
x.VisitID,
x.VisitorID,
x.source_of_traffic,
x.lasthit,
x.srch_crusie_name,
x.trip_type,
x.srch_hotel_text,
x.AgencyCode,

MAX(
CASE WHEN x.Ranking = 1 THEN x.srch_hotel_chain
ELSE NULL
END ) SearchOption1,

MAX(
CASE WHEN x.Ranking = 2 THEN x.srch_hotel_chain
ELSE NULL
END ) SearchOption2,

MAX(
CASE WHEN x.Ranking = 3 THEN x.srch_hotel_chain
ELSE NULL
END ) SearchOption3

FROM

(

SELECT
SearchOptionsEventID,
SearchOptionsEventTimeID,
SearchOptionsEventTime,
SearchOptionsEventTimeGMT,
VisitID,
VisitorID,
source_of_traffic,
lasthit,
agency_code,
srch_crusie_name,
srch_hotel_chain,
trip_type,
srch_hotel_text,
AgencyCode,
RANK() OVER(PARTITION BY SearchOptionsEventTime,VisitID,visitorid,
source_of_traffic,lasthit,srch_crusie_name,trip_type,srch_hotel_text,
AgencyCode
ORDER BY SearchOptionsEventID) Ranking

FROM da_test.STG_WTCE17_SearchOptEvt_test

) x
GROUP BY 1,2,3,4,5,6,7,8,9,10
1 REPLY
Enthusiast

Re: varchar length performance implications

If the field 'srch_hotel_text' in TableB contains expanded values (lengths exceeding 200 characters) from TableA the fact that it is included in the Partition By of the RANK() function is the likely culprit for your performance difference.