Spool Space Issues

Database

Spool Space Issues

I am getting Spool space error when this query run in PROD. Any optimization tips welcome. I have put the records in Millions and Indexes on comments.

INSERT INTO extrt_ent_drcty_data_3_gt(28 Cols)

SELECT d.cols,pe.cols,bn.cols

FROM extrt_ent_drcty_data_2_gt d /* 15Million rows - acct_id,srv_accs_id(Primary) */

Left join ( SELECT a1.acct_id, a1.acct_nm_addr_eff_dt, a1.acct_nm_addr_end_dt,ea.elec_addr_txt
FROM acct_nm_addr_hist a1 /* 540M - acct_id,addr_id,prty_nm_id(Primary) */

join elec_addr ea /* 33M - addr_id(Unique PI) */
on a1.addr_id = ea.addr_id
AND a1.nm_addr_rol_cd = 'PE'
AND a1.acct_nm_addr_end_dt = '9999-12-31') pe

ON d.acct_id = pe.acct_id

Left join ( SELECT a2.cols,la.cols,nm.cols

FROM acct_nm_addr_hist a2 /* 540M - acct_id,addr_id,prty_nm_id(Primary) */
join loc_addr la /* 317M - addr_id (UPI) */
on a2.addr_id = la.addr_id

Left join prty_nm nm /* 218M - prty_nm_id (UPI) */

on a2.prty_nm_id = nm.prty_nm_id
WHERE a2.nm_addr_rol_cd = 'BN'
AND a2.acct_nm_addr_end_dt = '9999-12-31'
AND a2.acct_nm_addr_eff_dt = (
SELECT max(anah_mx.acct_nm_addr_eff_dt)
FROM acct_nm_addr_hist anah_mx /* 540M - acct_id,addr_id,prty_nm_id (Primary) */
WHERE a2.acct_id = anah_mx.acct_id
AND anah_mx.nm_addr_rol_cd = 'BN'
AND anah_mx.acct_nm_addr_end_dt = '9999-12-31')) bn
ON d.acct_id = bn.acct_id
AND bn.acct_nm_addr_end_dt = '9999-12-31';
6 REPLIES

Re: Spool Space Issues

do an explain on this query with
diagnostic helpstats on for session;
This will show if there are any statistics that should be collected.
collect the statistics and then run your query.
that will help.

Re: Spool Space Issues

I just ran into a similar problem the other day. It had to do with the fact that Teradata's plan was left outer joining to my dimension table on the step where I still had millions of rows, rather than at the last step where I only had 200,000 rows.

You can force it to left outer join to the dimension tables in the correct order with inline views.

select out1.col1, out1.col2, d.col3
from (select col1, col2 from inner_query) out1
left outer join dim_table d
on out1.col1 = d.col1

If you do this, it fully renders the inner query first before attempting to hash or merge join millions of records in your spool area to a dimesion table, where it really only needs to look up dimension values to go with your final results.

Hopefully I explained the concept well enough. Basically, take all of the heavy processing and make it an inline view, and the outer join to your dimension tables in a second step. The only way this becomes impossible is if you need one of the left outer join values to group by in the inner view.
N/A

Re: Spool Space Issues

As Nytewynd already proposed, try to filter on the 540M table in bn before the join to reduce the number of rows to be joined.
Replacing the MAX-subquery with a RANK will probably help also.

Of course untested:

INSERT INTO extrt_ent_drcty_data_3_gt(28 Cols)
SELECT d.cols,pe.cols,bn.cols
FROM extrt_ent_drcty_data_2_gt d /* 15Million rows - acct_id,srv_accs_id(Primary) */
Left join
(
SELECT a1.acct_id, a1.acct_nm_addr_eff_dt, a1.acct_nm_addr_end_dt,ea.elec_addr_txt
FROM acct_nm_addr_hist a1 /* 540M - acct_id,addr_id,prty_nm_id(Primary) */
join elec_addr ea /* 33M - addr_id(Unique PI) */
on a1.addr_id = ea.addr_id
WHERE a1.nm_addr_rol_cd = 'PE'
AND a1.acct_nm_addr_end_dt = '9999-12-31'
) pe
ON d.acct_id = pe.acct_id
Left join
(
SELECT a2.cols,la.cols,nm.cols
FROM
(
SELECT ...
FROM acct_nm_addr_hist
WHERE a2.nm_addr_rol_cd = 'BN'
AND a2.acct_nm_addr_end_dt = '9999-12-31'
QUALIFY
RANK() OVER (PARTITION BY acct_id ORDER BY acct_nm_addr_eff_dt DESC) = 1
) a2 /* 540M - acct_id,addr_id,prty_nm_id(Primary) */
join loc_addr la /* 317M - addr_id (UPI) */
on a2.addr_id = la.addr_id
Left join prty_nm nm /* 218M - prty_nm_id (UPI) */
on a2.prty_nm_id = nm.prty_nm_id
) bn
ON d.acct_id = bn.acct_id

And be shure there are no missing stats...

And a better choice of (matching) PIs would definitly help, but you probably can't change them :-)

Dieter

Re: Spool Space Issues

I often used to got spool space issues, I realized that I was not joining the columns that are of same data types (most of the cases are like joining a decimal data type to integer data type etc., so I casted the joining column to match the other column's data type, it used to resolve the issue), which used to pull me in to Spool space issues. I suggest check those kind of issues. Some one told me that when we filter the subqueries just to have the rows that joins, the subquery table's Index will be lost for query computing??? I am not sure).
N/A

Re: Spool Space Issues

WHAT IS THE DEFFERENCE B/W ETL DEVELOPER AND TERADATA DEVELOPER? TELL ME PLZ KNOWS ANYBODY?

N/A

Re: Spool Space Issues

WHAT DIFFERENCE B/W ETL DEVELOPER AND TERADATA DEVELOPER?