Help on a query for Aggregation

Database

Help on a query for Aggregation

Hi,
I have a table which has about 25 million rows. I am trying to materialize the table by below query:
select
Act_dt
,product_id
,subscription_id
,sum(download_volume/1024)
,sum(upload_volume/1024)
,sum((download_volume+upload_volume)/1024)
,avg(upload_volume/1024)
,avg(download_volume/1024)
from XX_DB.Usage
group by
1,2,3
The explain plan for this query is as below:
Explanation
1) First, we lock XX_DB.Usage for access.
2) Next, we do an all-AMPs SUM step to aggregate from
XX_DB.Usage by way of an all-rows scan with no
residual conditions, and the grouping identifier in field 1.
Aggregate Intermediate Results are computed globally, then placed
in Spool 3. The input table will not be cached in memory, but it
is eligible for synchronized scanning. The aggregate spool file
will not be cached in memory. The size of Spool 3 is estimated
with no confidence to be 23,140,868 rows. The estimated time for
this step is 1 minute and 33 seconds.
3) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of
an all-rows scan into Spool 1 (group_amps), which is built locally
on the AMPs. The result spool file will not be cached in memory.
The size of Spool 1 is estimated with no confidence to be
23,140,868 rows. The estimated time for this step is 9.40 seconds.
4) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 1 minute and 43 seconds.

But when I try to create a table from the above query like

Create table local_user.Table_Usg
as
(Query from above) with data
the query doesn't complete even after 20 minutes.

Can anyone help me on this.The version of Teradata I am using is V2R6.
2 REPLIES
Senior Apprentice

Re: Help on a query for Aggregation

Hi Manjeeth,
you probably didn't specify the PI, thus it defaults to a NUPI on the first column Act_dt.
And you didn't specify MULTISET, so it defaults to SET.

Lots of rows with the same value + SET table -> a huge amount of duplicate row checks.

Choose a better PI and it will finish within the expected time frame.

Dieter

Re: Help on a query for Aggregation

Hi Dieter
Thanks for the i/ps. I tried changing the primary key to a different field Subscription_Id and the table is created in less than 1 minute.