Spool space usage

Enthusiast

Apologies if this post is duplicative, but I haven't found a comprehensive answer yet.

 

I want to determine the spool space used across all AMPs for any ad hoc query, from info in DBQL (assume it's recent and hasn't been flushed yet). Looking at DiskSpace:

 

UPDATE  dbc.databasespace
SET PeakPermSpace = 0, PeakSpoolSpace = 0, PeakTempSpace = 0
where databaseid in (sel distinct ds.databaseid 
                     from   dbc.databasespace ds 
                     inner join dbc.dbase d on (ds.databaseid =d.databaseid )
                     where d.databasename = 'ETL_LOAD');

<run my query as ETL_LOAD>

Select   sum(MaxSpool)/(1024*1024*1024) (Number(14,2)) as SumMaxspoolGB
        ,sum(CurrentSpool)/(1024*1024*1024) (Number(14,2)) as SumCurrentSpoolGB
        ,sum(PeakSpool)/(1024*1024*1024) (Number(14,2)) as SumPeakSpoolGB
from dbc.diskspace
where databasename='ETL_LOAD';

SumMaxspoolGB, SumCurrentSpoolGB, SumPeakSpoolGB
51200,24.05,1829.55

This gives SumPeakSpoolGB = 1829, which is reasonable for this query.

 

I want to be able to determine this without resetting the peak spool before each query.

Looking at a bunch of DBQL tables I get a few different results; I think I'm querying these incorrectly:

select cast(sum(spoolusage) as decimal(18,0)) a
from dbc.DBQLStepTbl
where queryid = 307053425692255873;
--7649000247296

select cast(spoolusage as decimal(18,0))
from dbc.DBQLogTbl
where sessionid = 869896
and queryid = cast('307053425692255873' as decimal(18,0));
--1180135591936

select cast(sum(qls.spoolusage) as decimal(18,0))
      ,cast(max(qls.maxampspool) as decimal(18,0))
      ,min(qls.minampspool)
from dbc.QryLogStepsV qls
where queryid = cast('307053425692255873' as decimal(18,0));
--7649000247296,5294239744,0.0

What am I doing wrong to get three different results: 1.2TB, 1.9TB, and 7.6TB?

(Max spool for this user is 50TB; we're trying to identify the worst culprits to reduce spool usage.)

 

Thanks for any help

 

 

 


Teradata Employee

Re: Spool space usage

Spool space is released when the last step needing each spool completes, so take the Max from StepTbl rather than the Sum.
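A minimal sketch of that approach, reusing the QueryID from the original post (assumes the row is still in DBQLStepTbl and hasn't been flushed):

```sql
-- Peak spool as logged by DBQL: take the largest end-of-step value,
-- not the sum across steps (spool is released as steps complete).
SELECT CAST(MAX(SpoolUsage) AS DECIMAL(18,0)) AS MaxStepSpool
FROM dbc.DBQLStepTbl
WHERE QueryID = 307053425692255873;
```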

 

As far as 1.2 vs 1.9, are you certain that only one query was run by this username between the reset and querying the peak value?

 

Enthusiast

Re: Spool space usage

Thanks for the response, Fred.

 

That makes sense, getting the max instead of the sum; I just need the max used at any point. When I select Max I do get consistent results of 1.2TB from all of these:

DBQLogTbl.spoolusage
DBQLStepTbl.spoolusage
QryLogStepsV.spoolusage
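A cross-check along these lines (a sketch, again using the QueryID from the original post; your IDs will differ):

```sql
-- Compare the query-level figure with the largest per-step figure.
-- For a single query these two columns should agree.
SELECT CAST(l.SpoolUsage AS DECIMAL(18,0)) AS LogTblSpool
     , CAST(MAX(s.SpoolUsage) AS DECIMAL(18,0)) AS MaxStepSpool
FROM dbc.DBQLogTbl l
JOIN dbc.DBQLStepTbl s
  ON l.QueryID = s.QueryID
WHERE l.QueryID = 307053425692255873
GROUP BY l.SpoolUsage;
```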

 

However, I still get a different result of 1.8TB after resetting and querying the peak value (definitely the only query running as ETL_LOAD):

Select   sum(MaxSpool)/(1024*1024*1024) (Number(14,2)) as SumMaxspoolGB
        ,sum(CurrentSpool)/(1024*1024*1024) (Number(14,2)) as SumCurrentSpoolGB
        ,sum(PeakSpool)/(1024*1024*1024) (Number(14,2)) as SumPeakSpoolGB
from dbc.diskspace
where databasename='ETL_LOAD';
--51200,24.05,1829.55

Would you expect these to be equal? I just want to be certain I'm not underestimating usage if I go off the DBQL views.

 

Thanks a lot

Teradata Employee

Re: Spool space usage

Not necessarily. If you are using DBQL Collection Algorithm 3 then SpoolUsage should hold the peak value during the step, and for a query run in isolation ought to match space accounting PeakSpool. Otherwise, SpoolUsage in StepTbl is the value at the end of the step which might be significantly less - for example, intermediate "spill to disk" files used for sorting would not be included. LogTbl should match the max of the StepTbl values (regardless of collection algorithm).

 

So using DBQL SpoolUsage may underestimate true PeakSpool (unless you change Collection Algorithm, either the default in dbscontrol or override in all relevant DBQL rules). But even with the old collection algorithm, SpoolUsage values may be good enough as a basis for relative ranking to identify candidates for improvement.

Teradata Employee

Re: Spool space usage

Strike "may underestimate" from my previous post and make that "almost certainly understates" unless you use Collection Algorithm 3.

 

The "end of step" value is also after any "last use" spools from prior steps have been released. This results in the SpoolUsage often being significantly less than PeakSpool.

For example, if the first step is a Retrieve that creates a 3GB spool, the second step is a Retrieve that creates a 2GB spool, the third joins the first two spools to produce a 4GB result (and this is the last use for both input spools), DBQL would show 3GB after the first step, 5GB after the second, and 4GB after the third. But the actual peak during step 3 would be 9GB while both input and output spools were present.

Enthusiast

Re: Spool space usage

Thanks Fred, that's really useful.

 

I'm pretty sure we're set to the default #1; I'll ask the DBAs to change it and retry.