Skew-Sensitivity in the Optimizer: New in Teradata 12

Have you ever had what looks like a great query plan turn into life in the slow lane at run time?

You may have blamed it on your table having very skewed values in the join column. And you were probably right. But the optimizer is getting better at recognizing skew, starting in Teradata 12. This skew detection happens WHILE it’s building up the query plan, not after the query begins to execute. One initial step towards this but-what-will-it-be-like-at-run-time smarts happens when the optimizer is pondering a row redistribution step for a large table in preparation for a join. I thought you might like to see an illustration of how this works.

By identifying skew before the row redistribution is already nailed into the plan, the optimizer has a chance to consider other join options. When skew is detected in the join column of a table-to-be-redistributed, the estimated cost of row redistribution will be adjusted upwards, causing the optimizer to see it as a more expensive option (and let’s face it, with skew present, it may be).

I’ve played around with this on my Teradata 12 system, and like how it works. So if you want to peek over my shoulder for a moment, I’ll share my findings with you.

Yes, You’ll Need Statistics Collected on that Skewed Join Column

Each detailed interval in the statistics histogram has a field called Mode Frequency. Mode frequency is a count of the most popular value in that interval. In interval zero (which contains mostly summary information), the mode frequency field represents a count of the most popular value across all the detail intervals. It is the most highly-skewed value in the table.
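
For example, to build (and then inspect) such a histogram on a join column, statements of this form can be used; the table and column names here are just placeholders:

COLLECT STATISTICS ON MyTable COLUMN (JoinCol);
HELP STATISTICS MyTable COLUMN JoinCol;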

Here’s a subset of columns and a few intervals from a histogram for a column. This histogram was built after I updated the table in a way that resulted in about 5 million rows sharing the same value in this column. In other words, I consciously introduced exaggerated skew into the mix.

[Histogram excerpt:

Interval    Mode Value    Mode Frequency
0           5             5,001,159
1           5             5,001,159
...]

Interval 1 (the 2nd row) contains the most frequent value for this column, 5. There are 5,001,159 rows in this table that carry the value 5 in this column, out of about 45 million rows in the table. The non-skewed values all have 19 or 20 rows per value.

Because 5 is the most frequent value for this column in the table, it ends up in the Mode Value and its row count in the Mode Frequency column of Interval 0 (the 1st row). Interval 0’s mode frequency is referred to as the “high mode frequency” because it represents the most frequent value across the entire table.

The histogram information is used to calculate the highest number of rows that are likely to go to one AMP for the redistribution and join activity. The optimizer uses the number of AMPs (40 in this case) and this high mode frequency to make a worst-case assessment of the cost of row redistribution.

The Optimizer Rethinks its Estimates in the Face of Skew

Without this new skew awareness, the optimizer in this case would have assumed that each AMP would do the work of joining about 1.1 million rows (44,991,725 rows / 40 AMPs). However, with this new intelligence, the optimizer understands that all the rows that carry the high mode frequency value will be redistributed to one AMP, and that one AMP will be doing a disproportionate amount of the work.

Once it assesses what the worst-case AMP is likely to be doing, then the optimizer further assumes that the remaining rows will be split evenly across all AMPs. Let’s break that math apart.

Considering skew, the worst-case AMP will get 5,001,159 rows (the high mode frequency count). That same AMP will also get an additional 999,764 rows (the number of rows that carry the non-skewed values divided by the number of AMPs – 39,990,566 / 40 = 999,764). This results in an estimate of slightly over 6 million rows targeting the worst-case AMP, more than 5 times larger than the optimizer's estimate would have been in the blind-to-skew case.
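
Putting those pieces together, the worst-case arithmetic looks like this:

Worst-case AMP rows = high mode frequency + (remaining rows / number of AMPs)
                    = 5,001,159 + (44,991,725 - 5,001,159) / 40
                    = 5,001,159 + 999,764
                    = 6,000,923 rows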

Instead of assuming each AMP will receive 1.1 million rows during row redistribution, the Teradata 12 optimizer assumes a worst-case redistribution of 6 million rows for at least one AMP. It uses that worst-case number of rows to adjust the estimated cost of the row redistribution choice.

My Experiment

So here’s what I did. I started with a baseline join between the Lineitem and the Parttbl tables where there was no skew on L_Partkey (the larger table’s join column). This particular join required that both Lineitem and Parttbl be redistributed, as in my database neither table’s join column was its primary index.

Next, I changed the underlying values contained in the Lineitem table join column, L_Partkey, to gradually introduce skew. After each such skew-increasing effort, I made sure to recollect statistics on L_Partkey, so the optimizer would have the information needed to detect the creeping skew. Then I ran an explain to see what plan I got after increasing skew and collecting stats, hoping to see a change in join geography that would avoid redistributing the Lineitem table.
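
The recollect-and-explain portion of each cycle looked roughly like this (P_Partkey here is just a stand-in for the Parttbl join column; substitute your own names):

COLLECT STATISTICS ON Lineitem COLUMN (L_Partkey);
EXPLAIN SELECT COUNT(*)
FROM Lineitem L, Parttbl P
WHERE L.L_Partkey = P.P_Partkey;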

The following table shows my results from this experiment, where I went from no skew (high mode frequency of 118 rows) up to moderately strong skew (high mode frequency of over 100,000 rows). In this table I capture the high mode frequency values taken from the histogram for each test (remember, I recollected stats after each change in the skew). I also capture the estimated processing time that showed up for the row redistribution step in each query plan.

[Results table: high mode frequency per test, from 118 rows up to over 100,000 rows, alongside the estimated processing time of the row redistribution step in each plan]
As can be seen in the table above, the join geography changed when the high mode frequency reached over 100,000 rows for L_Partkey. Seeing those changed demographics, the optimizer chose to leave the large table local, and duplicate the smaller table.

Notice that the estimated processing time taken from the explain text increases a small amount as skew is gradually introduced, even though the join geography in the plan does not change for the small skew and the smallish skew cases. That increase in the estimated processing time is evidence that skew detection is working. The estimated cost of performing the row redistribution increases gradually, until such time as the duplication of the smaller table becomes a more cost-effective choice.

Conclusion

Of course, optimizer decisions such as changing join geography depend on many factors, including processing power of the platform, number of rows in the tables on each side of the join, and other demographic information available to the optimizer. And the costing performed by the optimizer is always taken down to the AMP level, so the number of AMPs in the configuration can make a difference in the plan you get. This is a simple example to step you through the process.

Just a reminder, if there are multiple join predicates, multi-column statistics may be required.
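
For example, if a query joined on two columns together, a multi-column collection such as this one (illustrative names) gives the optimizer combined demographics for the column pair:

COLLECT STATISTICS ON MyTable COLUMN (JoinCol1, JoinCol2);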

What I have demonstrated to you is that skew-awareness is alive and well in the Teradata optimizer, and it’s here to stay. So your life as a Teradata DBA just got a little more boring.


Re: Skew-Sensitivity in the Optimizer: New in Teradata 12

In such cases, as you mentioned, whether the smaller table is duplicated depends on many factors. Sometimes the optimizer will not duplicate the smaller table even when doing so would be the much lower-cost option.

Any "tricky" way to force duplication of the smaller table?

Teradata Employee

Re: Skew-Sensitivity in the Optimizer: New in Teradata 12

I would not suggest that you attempt to force table duplication. It is better to leave join geography and join sequence decisions to the optimizer. But what you can do is make sure that you have provided adequate statistics information, so the best join plans can be produced.

With small tables in particular, it is critical that statistics be thoroughly collected and kept current. Random AMP sampling does not provide as good default statistics for very small tables as it does for moderate or larger tables. If you are not getting table duplication when you think you should, then examine what statistics you are collecting on the table, and make sure you have covered all columns on the table that might be contributing to a better table estimate. The optimizer's costing algorithms rely on those statistics when making the decision to duplicate or not duplicate a table.
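
As a sketch of what thorough coverage might look like on a small reference table (all names here are placeholders):

COLLECT STATISTICS ON SmallRefTbl COLUMN (PICol);
COLLECT STATISTICS ON SmallRefTbl COLUMN (JoinCol);
COLLECT STATISTICS ON SmallRefTbl COLUMN (SelectionCol);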

Re: Skew-Sensitivity in the Optimizer: New in Teradata 12

I am having a very similar issue: the bigger table has more than 250 million rows and the tiny table about 2K rows, and the optimizer wants to redistribute both tables :( The problem is I do not have control over the big table and cannot collect stats on the join column. I was wondering if you could suggest something that would force duplication of the smaller table? I know it would be better to leave it to the optimizer, but in this case it generates a really bad explain.

Re: Skew-Sensitivity in the Optimizer: New in Teradata 12

How can I see the statistical column histogram data that you have shown above? I'm guessing it's somewhere in DBC -- but I have no idea where.

Also, how has skew awareness advanced in Teradata 13? (& 13.10?)

What are the best ways to identify what's causing a query to spool out? I'm having a challenge now with a large query that ran fine last month, but now only spools out. If you give it enough spool (2 terabytes), it will run, but it takes an extremely long time. I'm struggling to figure out which specific piece (or pieces) of the data have changed and are causing the issue. Additionally, I don't have a previous copy (or history) of the data, so it's difficult to see what's changed. Do you have any ideas or pointers that you can offer?

Many thanks in advance.
Teradata Employee

Re: Skew-Sensitivity in the Optimizer: New in Teradata 12

In response to Quonos...

I don't know of any method of forcing a table duplication plan that is different from what the optimizer produces. The key demographics used by the optimizer come from collected statistics. Not having statistics collected on the join column is most likely the source of the bad plan, so it's pretty important to get that addressed at your site. If you don't have the privileges, then find out who does and have them collect the stats. Or they can put the collect statistics statement for the join column in a macro, and then give you exec privileges on the macro.
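
A sketch of that macro approach, with placeholder object and user names:

CREATE MACRO CollectJoinColStats AS
(COLLECT STATISTICS ON BigTbl COLUMN (JoinCol););

GRANT EXEC ON CollectJoinColStats TO some_userid;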

In some cases, redistributing the large table makes the join operation run quicker. If the small table is duplicated without touching the large table, only product/dynamic hash joins are possible, which may not be the optimal join for this query. All the operations involved (not just the redistribution) are costed to make the final decision. The accuracy of the decision depends on the estimates the optimizer makes, and the optimizer relies on appropriate statistics to make good estimates. In addition, as the posting explains, skew detection relies on statistics as well.

Thanks, -Carrie
Teradata Employee

Re: Skew-Sensitivity in the Optimizer: New in Teradata 12

In response to Chusa01...

To see the detail inside the histogram you issue this command:

HELP STATISTICS table-name COLUMN column-name;

You must have some access right (any access right) on the table in order to view the histogram detail.
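
For instance, to display a histogram like the one shown earlier in this article (assuming it was built on the experiment's join column), the statement would look like:

HELP STATISTICS Lineitem COLUMN L_Partkey;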

There are no major changes to skew awareness in T13 or in T13.10.

In terms of what is causing the spool aborts, it could be that the data volume has increased, causing more rows to be selected from the tables that feed into the large spool file, or more rows to be joined as the spool file is built. It is also possible that selection criteria on some of the tables feeding the spool file have changed; for example, a user is selecting a larger range of dates, or maybe is doing analysis on the entire year of 2010 and, since it's near the end of the year, there's just more data involved in performing that analysis.

It is also possible that the selection criteria are selecting data values that produce more skew. The spool allocation for a user is divided by the number of AMPs in the configuration. If any one AMP exceeds its portion of the spool allocation, the query will get an error. Skew is one of the biggest reasons that queries run out of spool.
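
A quick worked example of that division: with a 2-terabyte spool allocation (the figure you mentioned) on a 40-AMP system like the one in this article, each AMP's share is 2 TB / 40 = 50 GB. One AMP receiving badly skewed rows can exceed its 50 GB share and abort the query even while the other 39 AMPs sit far below theirs.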

If you aren’t doing it now it won’t help you with this query, but going forward, collecting DBQL or EXPLAIN information on non-tactical queries would give you the opportunity to go look at previous query plans and compare them to the current query plan. This may provide a clue to identifying the problem.
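
If you want to begin capturing that history, a statement along these lines (logging options vary somewhat by release) enables query logging with explain text for all users:

BEGIN QUERY LOGGING WITH EXPLAIN ON ALL;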

Thanks, -Carrie

Re: Skew-Sensitivity in the Optimizer: New in Teradata 12

Hi Carrie,
I have a query with a skew of 500. The step that is taking all the CPU time is a join between two tables (a PI-column join). The tables are very well distributed (almost 0 skew). Stats are not updated, and yes, there is a big gap between the estimated row count and the actual row count in the DBQLStep table for this step.
But the data is not redistributed or duplicated. How would collecting stats help with the query skew in this situation?
Teradata Employee

Re: Skew-Sensitivity in the Optimizer: New in Teradata 12

Without stats collected on the join columns, the optimizer assumes that the table with a small number of distinct values on the join columns will have a complete match. If there is not a complete match to that table, then the estimated rows could be quite different from the actual rows in that step.

Another reason there could be a gap between actual and estimated has to do with the WHERE clause selection that the query might specify for each table. The optimizer will have to make a decision whether to apply the selection criteria before or after the join. Without statistics collected on the join column, and also on the selection columns, this costing may be inaccurate, and a less expensive join order might be missed.

For those reasons, it is probably a good idea to collect statistics on the join columns in this case. The more information the optimizer has, the better it can help to find a better plan.
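
As a sketch, with placeholder names, that means covering both the join column and the WHERE clause columns:

COLLECT STATISTICS ON MyTable COLUMN (JoinCol);
COLLECT STATISTICS ON MyTable COLUMN (SelectionCol);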

Re: Skew-Sensitivity in the Optimizer: New in Teradata 12

Thanks Carrie... but how do I justify that the skew is caused by missing stats? I get that it's good to have stats on join columns (even for a PI join), but how missing stats contribute to query skew is still not very clear to me. Could you please help?