Initial outcome on Extrapolated Stats at TD12

Our Operations and Development teams finished the TD 12 upgrades of our two production systems several weeks back.  We had been waiting on TD 12 largely for extrapolated stats, so we could stop "munging" stats by hand.  Our experience is that, aside from the problems on wide systems (below), Extrapolated Stats work. Customers not in the "large" category should take advantage of them and dial back the frequency of their stats collection, particularly for PPI or indexed date fields.
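
To make the idea concrete, here is a minimal sketch of what date-column extrapolation does conceptually: project the stale histogram's maximum date forward based on table growth since the last collection, rather than assuming no rows exist past it. This is not Teradata's actual algorithm; every name and number below is invented for illustration.

```python
# Conceptual sketch of date-column extrapolation -- NOT Teradata's actual
# algorithm, just an illustration of why a stale date histogram stops
# needing constant recollection.  All names and numbers are hypothetical.
from datetime import date, timedelta


def extrapolate_max_date(stats_max_date: date,
                         rows_at_collection: int,
                         current_row_estimate: int,
                         rows_per_day: float) -> date:
    """Project the histogram's max date forward based on growth since the
    last COLLECT STATISTICS, instead of treating the stale max as final."""
    new_rows = max(current_row_estimate - rows_at_collection, 0)
    days_of_growth = int(new_rows / rows_per_day) if rows_per_day else 0
    return stats_max_date + timedelta(days=days_of_growth)


# Example: stats collected 30 days ago on a daily-loaded PPI table.
print(extrapolate_max_date(date(2008, 11, 1),
                           rows_at_collection=900_000_000,
                           current_row_estimate=930_000_000,
                           rows_per_day=1_000_000))   # -> 2008-12-01
```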

Essentially, our experience is that it works really well, except for a couple of edge cases.  Part of the extrapolation algorithm uses AMP sampling, which, given our twin ~2000-vproc systems, has the potential to produce inconsistent results.  The one metric we could measure was that product joins increased 10%; not a complete picture, but certainly a partial one.  There IS a minor workaround (increasing the number of AMPs sampled), but given the extreme workload and dictionary activity on our platforms, doing so would introduce PE locking, which is very difficult to measure.  We had to return to the old ways with stats...
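
To illustrate why a small AMP sample on a ~2000-vproc system can wobble, here is a toy simulation. It is not Teradata-specific at all; the row distribution and AMP counts are invented. It just shows that estimates built from a handful of AMPs spread widely on a skewed table, and that sampling more AMPs tightens them, which is the trade-off against the extra dictionary work described above.

```python
# Toy simulation of random-AMP sampling variance -- a sketch of the effect,
# not Teradata's implementation.  Estimating total rows from a few AMPs on a
# skewed table can swing widely run to run; sampling more AMPs narrows it.
import random
import statistics

random.seed(42)
NUM_AMPS = 2000
# Hypothetical skew: most AMPs hold ~50k rows, roughly 1 in 97 holds far more.
rows_per_amp = [50_000 + (500_000 if i % 97 == 0 else 0) for i in range(NUM_AMPS)]
true_total = sum(rows_per_amp)

def estimate_total(amps_sampled: int) -> int:
    """Scale the mean of a random AMP sample up to a full-system row count."""
    sample = random.sample(rows_per_amp, amps_sampled)
    return int(statistics.mean(sample) * NUM_AMPS)

for n in (1, 2, 5, 20):
    estimates = [estimate_total(n) for _ in range(100)]
    spread = (max(estimates) - min(estimates)) / true_total
    print(f"{n:>3} AMPs sampled: estimate spread ~{spread:.0%} of true total")
```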

The plug for ExtrapStats aside, I still think we're not addressing the basic problem of stats collection management.  We spend 8% of our system collecting stats (the equivalent of 6 nodes of our 78 active nodes!!!), which is about 3x more than it should be; I would assert we should not spend more than 2%... While there is a Stats Wizard and an Index Wizard, I do not feel these are complementary tools for a 5000-entity production system with several million queries a day, 80% of them distinct.

The basic problem is that today stats collection is entirely reactive, not predictive.  If we wanted stats to be efficient, we would collect them when we could predict with reasonable confidence - or maybe even actual metrics - that the plan for a given query would change.  Impossible, you say - 80% of our 1M queries/day are distinct, you could not track all the states... True, but I also think it is reasonable that we could cluster the joins and search conditions into a manageable set of 300 to 500 clusters - and predict when those will change plan.
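
As a rough illustration of the clustering idea (purely a sketch, with naive regexes standing in for a real SQL parser), keying each query on its joined tables and predicate columns, while ignoring literal values, collapses distinct query text into a much smaller set of plan shapes:

```python
# Sketch of clustering distinct queries into a manageable number of plan
# "shapes": key each query on its joined tables and predicate columns,
# ignoring literal values.  The regexes are deliberately naive -- a real
# implementation would use a proper SQL parser.
import re
from collections import defaultdict

def query_signature(sql: str) -> tuple:
    sql = sql.upper()
    tables = tuple(sorted(set(re.findall(r'(?:FROM|JOIN)\s+(\w+)', sql))))
    pred_cols = tuple(sorted(set(
        re.findall(r'(\w+)\s*(?:[=<>]|\bBETWEEN\b|\bIN\b)', sql))))
    return tables, pred_cols

clusters = defaultdict(list)
for sql in [
    "SELECT * FROM sales WHERE sale_date = DATE '2008-11-01'",
    "SELECT * FROM sales WHERE sale_date = DATE '2008-11-02'",
    "SELECT s.* FROM sales s JOIN stores st ON s.store_id = st.store_id",
]:
    clusters[query_signature(sql)].append(sql)

print(len(clusters), "clusters from", sum(len(v) for v in clusters.values()), "queries")
```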

This is a big undertaking. First, we would have to be able to cluster the queries by join and search conditions, which means parsing all that SQL and then classifying it with a model. We would then have to establish a baseline query plan for each cluster, and test the ranges of search-condition combinations against the cluster to establish which columns, column combinations, and value ranges would result in a changed plan...
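
Here is a hedged sketch of what that last step might look like: keep a baseline plan hash for one representative query per cluster, re-EXPLAIN it, and flag only the clusters whose plan has actually drifted. The get_explain_text hook is hypothetical; in practice it would submit EXPLAIN through a database session and return the plan text.

```python
# Sketch of per-cluster plan-change detection: store a baseline plan hash for
# one representative query per cluster, re-EXPLAIN it, and mark the cluster
# for stats recollection only when the plan changes.  get_explain_text() is a
# hypothetical hook standing in for a real "EXPLAIN <sql>" call.
import hashlib
from typing import Callable, Dict, List

def plan_hash(explain_text: str) -> str:
    return hashlib.sha256(explain_text.encode()).hexdigest()[:16]

def clusters_needing_stats(representatives: Dict[str, str],
                           baselines: Dict[str, str],
                           get_explain_text: Callable[[str], str]) -> List[str]:
    """Return cluster ids whose current plan no longer matches the baseline."""
    changed = []
    for cluster_id, sql in representatives.items():
        current = plan_hash(get_explain_text(sql))
        if baselines.get(cluster_id) != current:
            changed.append(cluster_id)
    return changed

# Toy usage with canned EXPLAIN text standing in for the database.
fake_plans = {"q1": "product join step", "q2": "merge join step"}
reps = {"c1": "q1", "c2": "q2"}
baselines = {"c1": plan_hash("merge join step"),   # plan has drifted
             "c2": plan_hash("merge join step")}   # unchanged
print(clusters_needing_stats(reps, baselines, lambda sql: fake_plans[sql]))
# -> ['c1']
```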

Yeah - that's a "big money" dream all right... I'm starting to think I've been mulling over this problem far, far too long...