By any chance is there any possibility of getting into spool space issue while collecting the stats?

Enthusiast


I was querying a big table with 220 million rows as follows, and the queries were failing due to a spool space error:

SELECT DISTINCT COL_1 FROM TABLE_1;

OR

SELECT COL_1 FROM TABLE_1 GROUP BY COL_1;

Some points about this table:

  1. COL_1 is not the PI.
  2. COL_1 is CHAR(5) NOT NULL and is a non-indexed column.
  3. Stats were collected on the PI column only, and long ago (at least 2 years).
  4. COL_1 has the same value in every row.
  5. Rows are evenly distributed for this table.

 

After I collected stats on COL_1, the queries went through and I am getting the result. From this scenario I have some questions:

  1. While collecting stats, the PE will have to read all the rows for this column (all-rows scan).
  2. There will be sorting of the data, as the PE can also go for aggregation.

For my queries the PE was running into the spool space issue, but it didn't while collecting stats.

So the question is: through stats collection I am asking for the same operations on the data, so why am I not running into the spool space issue?

8 REPLIES
Teradata Employee

Re: By any chance is there any possibility of getting into spool space issue while collecting the stats?

Please provide the Teradata database version and the EXPLAINs for the queries.

There are different ways to do a GROUP BY, and the optimizer chooses among them. Some choices group locally first; some group globally. Without stats, it is likely to make different decisions than when it knows there is only one value.

Collect stats always groups locally, which is good for very non-unique cases like this one. But collect requires spool to execute, so it is certainly possible to run out of spool for collect as well. It operates differently than COUNT DISTINCT, though, because it is building a histogram in addition to identifying the most frequent values.

Note that none of this work is done in the PE. All scan, spool, aggregation, and collect stats operations are done entirely in the AMPs.
Enthusiast

Re: By any chance is there any possibility of getting into spool space issue while collecting the stats?

Thanks ToddAWalter for the reply and for correcting me. :)

Teradata database version:
VERSION 15.10.03.02
RELEASE 15.10.03.02

Explains for the queries:
Explain 1:
Explain SEL DISTINCT(COL_1) FROM TABLE_1 ;

1) First, we lock TABLE_1 for read on a
reserved RowHash to prevent global deadlock.
2) Next, we lock TABLE_1 for read.
3) We do an all-AMPs RETRIEVE step from
TABLE_1 by way of an all-rows scan
with no residual conditions into Spool 1 (group_amps), which is
redistributed by the hash code of (
TABLE_1.COL_1) to all AMPs.
Then we do a SORT to order Spool 1 by the sort key in spool field1
eliminating duplicate rows. The size of Spool 1 is estimated with
no confidence to be 163,068,864 rows (7,011,961,152 bytes). The
estimated time for this step is 27.72 seconds.
4) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 27.72 seconds.

Explain 2:
Explain SELECT COL_1 FROM TABLE_1 GROUP BY COL_1;

1) First, we lock TABLE_1 for read on a
reserved RowHash to prevent global deadlock.
2) Next, we lock TABLE_1 for read.
3) We do an all-AMPs RETRIEVE step from
TABLE_1 by way of an all-rows scan
with no residual conditions into Spool 1 (group_amps), which is
redistributed by the hash code of (
TABLE_1.COL_1) to all AMPs.
Then we do a SORT to order Spool 1 by the sort key in spool field1
eliminating duplicate rows. The size of Spool 1 is estimated with
no confidence to be 163,068,864 rows (7,011,961,152 bytes). The
estimated time for this step is 27.72 seconds.
4) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 27.72 seconds.

Explain 3:
Explain COLLECT STATISTICS
COLUMN ( COL_1 )
ON TABLE_1;

1) First, we lock TABLE_1 for access.
2) Next, we do an all-AMPs SUM step to aggregate from
TABLE_1 by way of an all-rows scan
with no residual conditions , grouping by field1 (
TABLE_1.COL_1). Aggregate
Intermediate Results are computed globally, then placed in Spool 3.
The size of Spool 3 is estimated with no confidence to be
163,068,864 rows (4,728,997,056 bytes). The estimated time for
this step is 41.80 seconds.
3) Then we save the UPDATED STATISTICS for ('COL_1 ') from
Spool 3 (Last Use) into Spool 5, which is built locally on a
single AMP derived from the hash of the table id.
4) We compute the table-level summary statistics from spool 5 and
save them into Spool 6, which is built locally on a single AMP
derived from the hash of the table id.
5) We lock StatsTbl for write on a RowHash.
6) We do a single-AMP ABORT test from StatsTbl by way of the
primary index "{LeftTable}.Field_2 = 'A906AB1D0000'XB" with a
residual condition of ("(StatsTbl.ExpressionList =
'COL_1 ') OR (StatsTbl.StatsId = 2)").
7) We do a single-AMP MERGE into StatsTbl from Spool 5 (Last Use).
8) We do a Single AMP MERGE Update to StatsTbl from Spool 6 (Last
Use) by way of a RowHash match scan.
9) We spoil the statistics cache for the table, view or query.
10) We spoil the parser's dictionary cache for the table.
11) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> No rows are returned to the user as the result of statement 1.

 

 

Explain 1 and Explain 2 are almost identical.
All the explains show "no confidence" estimates and an all-rows (full-table) scan.
All the explains involve sorting (Explains 1 and 2 mention it explicitly; Explain 3 is about aggregation, so sorting is implicit).


Teradata Employee

Re: By any chance is there any possibility of getting into spool space issue while collecting the stats?

I assume these are without stats. Now please show the ones with stats. I believe that different plans will result from knowing about the data distribution in the grouping column.

 

In 1 and 2, the step that says scan and redistribute will definitely cause a spool problem. Since every row has the same value, all of those values will be sent to one AMP, creating a really skewed spool. Only the value from the group column will be spooled, not the whole row, but still all 220M values will be sent to a single AMP before the sort step that creates the distinct result.
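The skew described above can be sketched with a toy hash-partition model in Python (the AMP count and the use of Python's built-in hash are illustrative stand-ins, not Teradata's actual row hash):

```python
from collections import Counter

def redistribute(values, n_amps):
    """Toy model of the 'redistributed by the hash code of (COL_1)' step:
    each value is shipped to the AMP that owns its hash bucket."""
    spool = Counter()
    for v in values:
        spool[hash(v) % n_amps] += 1  # rows received per AMP
    return spool

n_amps = 10

# Every row carries the same CHAR(5) value, so every row hashes to the
# same bucket: one AMP's spool receives the entire table.
skewed = redistribute(["AAAAA"] * 1_000_000, n_amps)
print(len(skewed), max(skewed.values()))  # one AMP gets all 1,000,000 rows

# A column with many distinct values spreads the spool evenly instead.
even = redistribute([f"{i:05d}" for i in range(1_000_000)], n_amps)
print(max(even.values()))                 # roughly 1,000,000 / n_amps
```

With one distinct value, per-AMP spool is limited by that single AMP's spool allocation, which is why the query fails even though the table itself is evenly distributed.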

Enthusiast

Re: By any chance is there any possibility of getting into spool space issue while collecting the stats?

After collecting stats on COL_1, everything works fine.

What I am looking at is this: without stats, the following queries run into the spool space issue.

Case 1:

SELECT DISTINCT COL_1 FROM TABLE_1;

OR

SELECT COL_1 FROM TABLE_1 GROUP BY COL_1;

 

But, without stats, the following query works fine.

Case 2:

COLLECT STATISTICS COLUMN ( COL_1 ) ON TABLE_1;

In both cases there is a full-table scan, no confidence, and sorting, but only the stats collection succeeds. Why are the queries failing while stats collection is not?

The following might be the reason:

1. Data distribution and hashing come into the picture for Case 1 but not for Case 2.

But some more questions, just for the sake of curiosity. :)

Why is the same approach not followed in Case 1? Why do distribution and hashing get into the plan for Case 1?

Teradata Employee

Re: By any chance is there any possibility of getting into spool space issue while collecting the stats?

With the explains from the ones that work, we can compare the way the queries are executed with and without information about the data.

The collect statistics statement uses a quite different mechanism to create the stats histogram - it is not just doing a distinct operation. The algorithm always builds an AMP local summary of the data then forwards to an AMP to create and store the final histogram. Because the summarization work is done local to each AMP, the spooling required is much more evenly distributed and the amount of data that needs to be sent between AMPs is much smaller.

To perform a DISTINCT operation all the rows with equal values must be on the same AMP to do the comparisons and remove the duplicates. A mechanism to do this is to redistribute the data on the hash of the field that is to be DISTINCT, then run a sort with duplicate elimination on the redistributed spool. In this case, that turns out to be a less than optimal choice since all rows have the same value so all of them get distributed to the same AMP resulting in a very skewed spool.

A look at the explain of the query with stats collected will show a quite different mechanism used which results in the duplicates being eliminated twice - once locally in the AMPs and then a redistribution of that result followed by duplicate elimination on the redistributed spool. For this case the first duplicate elimination pass reduces the data to one row per AMP and then only has to redistribute as many rows as there are AMPs and the second duplicate elimination pass is very small.

The first algorithm is best when the set is more unique because less duplicate elimination is done in the first pass wasting the resources on that pass. The second algorithm is better when the data has very few distinct values because the first pass significantly reduces the set. The optimizer decides which one to use based on the information it has available, with no information it has a default which turns out to not be optimal for this case. With information it knows which one to choose and gets the optimal solution.
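The two plans contrasted above can be sketched as a toy Python model (hypothetical AMP count and hash function; not Teradata internals), showing how the first duplicate-elimination pass changes the amount of data each AMP receives:

```python
def redistribute_then_dedupe(rows_per_amp, n_amps):
    """Plan chosen without stats: ship every row to the AMP owning its
    hash, then sort/dedupe there. Returns rows received per AMP."""
    received = [0] * n_amps
    for rows in rows_per_amp:
        for v in rows:
            received[hash(v) % n_amps] += 1
    return received

def local_dedupe_then_redistribute(rows_per_amp, n_amps):
    """Plan chosen with stats: dedupe AMP-locally first, then ship only
    the per-AMP distinct values. Returns rows received per AMP."""
    received = [0] * n_amps
    for rows in rows_per_amp:
        for v in set(rows):          # first duplicate-elimination pass
            received[hash(v) % n_amps] += 1
    return received

n_amps = 8
# Rows evenly spread across AMPs, but COL_1 has a single value everywhere.
table = [["AAAAA"] * 100_000 for _ in range(n_amps)]

print(max(redistribute_then_dedupe(table, n_amps)))       # 800,000 rows on one AMP
print(max(local_dedupe_then_redistribute(table, n_amps))) # at most n_amps rows
```

In the second plan the first pass reduces each AMP's contribution to one row, so the receiving AMP gets only as many rows as there are AMPs, which matches the description above.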
Enthusiast

Re: By any chance is there any possibility of getting into spool space issue while collecting the stats?

The explains are as follows. Both are using the aggregation algorithm.

 

EXPLAIN 
SELECT DISTINCT COL_1 FROM TABLE_1;

1) First, we lock TABLE_1 for read on a
reserved RowHash (65535: 65535) to prevent global deadlock.
2) Next, we lock TABLE_1 for read.
3) We do an all-AMPs SUM step to aggregate from
TABLE_1 by way of an all-rows scan
with no residual conditions, and the grouping identifier in field
1. Aggregate Intermediate Results are computed globally, then
placed in Spool 1. The size of Spool 1 is estimated with high
confidence to be 1 row (29 bytes). The estimated time for this
step is 2.68 seconds.
4) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 2.68 seconds.

EXPLAIN 
SELECT COL_1 FROM TABLE_1 GROUP BY COL_1;

1) First, we lock TABLE_1 for read on a
reserved RowHash (65535: 65535) to prevent global deadlock.
2) Next, we lock TABLE_1 for read.
3) We do an all-AMPs SUM step to aggregate from
TABLE_1 by way of an all-rows scan
with no residual conditions, and the grouping identifier in field
1. Aggregate Intermediate Results are computed globally, then
placed in Spool 1. The size of Spool 1 is estimated with high
confidence to be 1 row (29 bytes). The estimated time for this
step is 2.68 seconds.
4) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 2.68 seconds.

 

So one more question:

Somewhere I read that aggregation follows ARSA. Does ARSA work as follows?

A - Aggregate locally

R - Redistribute

S - Sort globally

A - Aggregate globally

 

Sorry for making this thread long. Your comments are good and helpful.

Teradata Employee

Re: By any chance is there any possibility of getting into spool space issue while collecting the stats?

Teradata avoids sorting whenever possible. In the most extreme case, where there are an enormous number of groups, the method will resemble ARSA, but most cases will not involve any sort. Along the way, many optimizations are made based on knowledge of the data (as discussed above); for instance, whether the data will be aggregated locally depends on the grouping field distribution and data demographics.

 

The grouping will first be performed locally in the AMPs to create a single row per group. This phase may be optimized out if the query optimizer can tell that very little (or no) reduction will be made in the number of result rows (eg if grouping is done on a group key that has a cardinality similar to the cardinality of the table).

 

Then each of the local aggregation rows will be hashed on the grouping fields and the local aggregation result will be redistributed to the AMP for that hash. Thus all of the rows for a particular group key will end up on the same AMP. The group keys as a whole will be spread across all the AMPs by the hashing on the group key. Each AMP will have to receive at most (number of AMPs) rows for each group key. This phase will be optimized out if the optimizer can see that the local aggregation can produce only 1 result on 1 AMP per group key (eg if the group key is or contains the PI).

 

Final aggregation is performed on the set of rows with a common group key by using appropriate operations (eg count is used for the local aggregation but SUM of the counts is used in the global aggregation).
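The local-count / global-SUM split can be sketched as a toy Python model of the two phases (illustrative only, not Teradata's implementation):

```python
from collections import Counter

def local_aggregate(rows):
    """Phase 1 on each AMP: COUNT(*) per group key in the local cache."""
    return Counter(rows)

def global_aggregate(partials, n_amps):
    """Phase 2: each (key, count) partial is redistributed by the hash of
    its key; the owning AMP then SUMs the counts for that key."""
    amps = [Counter() for _ in range(n_amps)]
    for partial in partials:
        for key, cnt in partial.items():
            amps[hash(key) % n_amps][key] += cnt  # SUM of the local counts
    return amps

# SELECT group_key, COUNT(*) FROM t GROUP BY group_key, over 4 AMPs:
per_amp_rows = [["x", "y", "x"], ["y", "y"], ["x"], ["z"]]
partials = [local_aggregate(rows) for rows in per_amp_rows]
result = Counter()
for amp in global_aggregate(partials, 4):
    result.update(amp)
print(dict(result))   # {'x': 3, 'y': 3, 'z': 1} (key order may vary)
```

Note how the global phase sums the local counts rather than recounting rows, which is why only the small per-AMP partials ever cross the BYNET.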

 

The EXPLAIN gives clues to which phases are optimized out (eg global aggregation if the redistribution is performed and local aggregation if the redistribution is optimized out).

 

Both local and global aggregations are optimized with in-memory algorithms which cache the group key and aggregate computations. Local aggregation reads from a source and does the local aggregation in the local cache. The redistribution is done directly from the cache onto the bynet. As the rows arrive at the receiving AMP, the global aggregation is done in the global cache. This eliminates sorting large sets to do aggregation, writing spools and reading spools to get the process done. The only time a sort will be done is if the number of group keys will not fit in the global aggregate cache and has to be spilled from cache to spool. Then a final sort and aggregation pass will have to be done on the spill spool.