Improve performance on an analytical query , mulitple columns using same set of partitions

General
Enthusiast

Improve performance on an analytical query , mulitple columns using same set of partitions

Hi,

I have a large query where there are two window sets, one over year and another over quarter. I think that since teh parttion by and order by clause is same for 1 set of column and another set of columns, this query can be re-written to avoid the partioning on rows being done again and again. Below is the query:

 

Sel yrmthId,

acctId,

CustId,

CurrencyId,

PartyId,

MthId,

---------

---------

---------,

sum(abc1) over (partition by yrId, AcctId, CustId, CurType, PtyTpe order by mthId) as C1,

sum(abc2) over (partition by yrId, AcctId, CustId, CurType, PtyTpe order by mthId) as C2,

sum(abc3) over (partition by yrId, AcctId, CustId, CurType, PtyTpe order by mthId) as C3,

sum(abc4) over (partition by yrId, AcctId, CustId, CurType, PtyTpe order by mthId) as C4,

-----------

sum(abc10) over (partition by yrId, AcctId, CustId, CurType, PtyTpe order by mthId) as C10,

------------------

sum(abc1) over (partition by yrId, QtrId, AcctId, CustId, CurType, PtyTpe order by mthId) as C11,

sum(abc2) over (partition by yrId, QtrId, AcctId, CustId, CurType, PtyTpe order by mthId) as C21,

sum(abc3) over (partition by yrId, QtrId, AcctId, CustId, CurType, PtyTpe order by mthId) as C31,

sum(abc4) over (partition by yrId, QtrId, AcctId, CustId, CurType, PtyTpe order by mthId) as C41,

-----------

sum(abc10) over (partition by yrId,qtrId, AcctId, CustId, CurType, PtyTpe order by mthId) as C10

from <volatile table >

join calendar table

How can we rwrite this query to avoid the partition being done again. As I udnertsand that the optimizer must have to repeat this step extensively.any suggestions?

Thanks in advance!

6 REPLIES
Teradata Employee

Re: Improve performance on an analytical query , mulitple columns using same set of partitions

Can you provide explain please?
Enthusiast

Re: Improve performance on an analytical query , mulitple columns using same set of partitions

And what would a simple sum/group by do? I sometimes think that an analytical fuction is executed seperately. It is cool stuff, sometimes performs like crazy. But in some case it doesn't.

Enthusiast

Re: Improve performance on an analytical query , mulitple columns using same set of partitions

We do an all-AMPs STAT FUNCTION step from Spool 2 (Last Use) by
     way of an all-rows scan into Spool 6 (Last Use), which is assumed
     to be redistributed by value to all AMPs.  The result rows are put
     into Spool 4 (all_amps) (compressed columns allowed), which is
     built locally on the AMPs.  The size is estimated with no
     confidence to be 2 rows (562 bytes).
  5) We do an all-AMPs STAT FUNCTION step from Spool 4 (Last Use) by
     way of an all-rows scan into Spool 9 (Last Use), which is assumed
     to be redistributed by value to all AMPs.  The result rows are put
     into Spool 1 (group_amps), which is built locally on the AMPs.
     The size is estimated with no confidence to be 2 rows (690 bytes).

Because of security reasons and lack of access I cannot pull the actual explain but in the DEv env with no data, it seems like the optimizer is using STAT function twice with an all AMP operation. I know in production this step takes close to 1 hour along with insert into target.

Teradata Employee

Re: Improve performance on an analytical query , mulitple columns using same set of partitions

As you can see, optimizer combines multiple ordered analytic functions with the same partition and order into a single execution.   Two different ones are required because there are two different partition lists.

I can't comment on the time without data on size of the source data and more details on what steps take the time. If the source data is very large, then this query will be expensive because it will create a result as large as the source which has to be sorted and computed upon.

Enthusiast

Re: Improve performance on an analytical query , mulitple columns using same set of partitions

Whats the volume of the data in the table and what is the join type

Junior Contributor

Re: Improve performance on an analytical query , mulitple columns using same set of partitions

This is the best plan you can get, the optimizer combines all OLAP steps with the same partition/order.

Standard SQL syntax supports a WINDOW clause (not supported in Teradata) which avoids repeating the window definition, but it's just syntactic sugar.