Spool space error concern

Database
The Teradata Database channel includes discussions around advanced Teradata features such as high-performance parallel database technology, the optimizer, mixed workload management solutions, and other related technologies.
Enthusiast

Spool space error concern

If I have 9.2 gb of spool space and I have 3 tables which tables data are well distributed and skewness is almost nill. Size of the
tables are like table_A, table_b and table_c all are - 3gb. Total 9 gb

scenario - 1
------------

If all these 3 tables are getting queried with sel * from db.table
then will it raise a spool space issue ??


Accepted Solutions
Senior Apprentice

Re: Spool space error concern

Hi,

 

In my experience you'll rarely find what I would call a pure star schema being used. The customer may say they are using a star schema, but there are often one or two 'snow flakes' (if you get my meaning) in the data model.

 

Cheers,

Dave

 

P.S. Can I suggest that if you want to ask a question on a different subject that you start a different topic.

 

Ward Analytics Ltd - information in motion
www: http://www.ward-analytics.com
1 ACCEPTED SOLUTION
8 REPLIES 8
Highlighted
Senior Apprentice

Re: Spool space error concern

Hi,

It really depends on the details of your query(s).

 

In theory, if you run the following as an MSR:

sel * from table_a
;sel * from table_b
;sel * from table_c;

...then you should be ok.

 

If another query is running under the same username at the same time, then maybe not.

 

If you run a query which joins two or more of those tables then "it depends". If one gets 'duplicated on all amps' then I'd guess (and it is only a guess) probably not.

 

If that doesn't help, can you provide some more detail of the likely queries?

 

HTH

Dave

Ward Analytics Ltd - information in motion
www: http://www.ward-analytics.com
Enthusiast

Re: Spool space error concern

Hi Dave,

 

Thank you very much for your comments. but I would like to push two scenario. Please help me to understand these two.

 

scenario - 1
------------
table_1 (col_a, col_b) where col_a primary index
table_2 (col_c, col_d) where col_c primary index
table_3 (col_e, col_f) where col_e primary index

 

now if I am joining on top of primary index and selecting all the records

 

like table_1 inner join table_2 on (col_a and col_c)
table_2 inner join table_3 on (col_c and col_e)

 

scenario - 2
------------
table_1 (col_a, col_b) where col_a primary index
table_2 (col_c, col_d) where col_c primary index
table_3 (col_e, col_f) where col_e primary index


now if I am joining on top of non primary index column and selecting all the records

 

like table_1 cross join table_2 where (col_b and col_d)
table_2 cross join table_3 where (col_b and col_f)

 

please consider the above spool space scenario. like I have 9.2 gb spool space and all these three tables are like 3 gb, 3 gb, 3gb total 9gb. then for these two scenario whether spool space issue will come ??

 

NOTE : Please consider for the user and for the specific session there is no other queries are running

Senior Apprentice

Re: Spool space error concern

Hi,

It's difficult to say but here are some thoughts.

 

scenario#1 - PI:PI joins

- you need to look at the join cardinality of the PI values

- if a row in table_1 joins to 5 rows in table_2 then the one table_1 row will be in the output spool file 5 times (you'll quite possibly end up running out of spool)

- if the PI columns are not compatible (in terms hashing) and there is redistribution and/or duplication then again you're more likely to run out of spool

 

scenario#2 - cross join

- this is still an inner join. When joining on non-PI columns you will have data movement (redistribution and/or duplication) which is more likely to cause spool space issues, especially if the join column has skewed values and no stats.

 

A question for you. What is the issue that you are facing? Are you having spool space issues running such queries?

 

Cheers,

Dave

 

Ward Analytics Ltd - information in motion
www: http://www.ward-analytics.com
Enthusiast

Re: Spool space error concern

Hi Dave,

Thank you very much for your explanation.

Well I am just building my concept clear about spool space issues. I have not faced any such issues.

Can you please help me in another query.
I have clear theoretical knowledge about star and snowflakes schema. But in the projects I am working in my organisation, there I am unable to understand which schema my client is following. So can you please suggest me how to get it co-related.
Senior Apprentice

Re: Spool space error concern

Hi,

 

In my experience you'll rarely find what I would call a pure star schema being used. The customer may say they are using a star schema, but there are often one or two 'snow flakes' (if you get my meaning) in the data model.

 

Cheers,

Dave

 

P.S. Can I suggest that if you want to ask a question on a different subject that you start a different topic.

 

Ward Analytics Ltd - information in motion
www: http://www.ward-analytics.com
Enthusiast

Re: Spool space error concern

Thanks Dave. Yes sure I will start a new topic. It was my first time so posted in that query.


@DaveWellman wrote:

Hi,

 

In my experience you'll rarely find what I would call a pure star schema being used. The customer may say they are using a star schema, but there are often one or two 'snow flakes' (if you get my meaning) in the data model.

 

Cheers,

Dave

 

P.S. Can I suggest that if you want to ask a question on a different subject that you start a different topic.

 


 

Teradata Employee

Re: Spool space error concern

A few additional comments re the spool questions:

- EXPLAIN is a friend. Look at the explain for the query and see what is spooled when. That can give a good idea of how much spool will be required. Remember that spools used as input have to be around at the same time as the output spool. When (last use) is on a spool in the explain, then the spool will be deleted at the end of that step, the spool needs to be counted until the end of that step.

- Result spools are generally larger than internal spools. The result spool is in the target data types for returning to the user. That means that if the space is very close based on the table size, it may still be out of spool when it creates the result.

- Some operations require special spools. Especially set operators (UNION, INTERSECT, MINUS), and sort (ORDER BY). These require a special sort spool that makes the data types larger, sometimes significantly larger.

 

IM(NS)HO: It is counter productive to manage a system that tightly on spool. Having a limit that will help stop runaway queries is a good thing. But micro managing spool to very tight tolerances causes wasted resources by forcing users to run more quieries over parts of the data and by spending resources to execute queries which abort because of a shortage of spool. Especially at small numbers of GB, it is much better to give users enough spool space to run reasonable queries.

Enthusiast

Re: Spool space error concern

Thanks todd. It was a really nice explanation. Hope to have more of your inputs in the future discussions.