Spool and performance

Teradata Employee

Re: Spool and performance

Could you try to build the following volatile table, just to see if it aborts before the end ?

create multiset volatile table mvt_agg_out, no log AS
(
  select 201811              as Month_ID
       , Basic_Call_Type_ID
       , Service_Type_Id
       , Total_Type_ID
       , MSISDN
       , Cell_ID
       , Location_Area_ID
       , CallIndicator
       , sum(TOTAL_DURATION) as TOTAL_DURATION
       , sum(TOTAL_CALLS)    as TOTAL_CALLS
       , sum(TOTAL_GROSS)    as TOTAL_GROSS
       , count(MSISDN)       as NO_OF_TRANSACTIONS
    from AGG_OUT_TR_MONTHLY_N
   where Month_ID       between To_Number(To_Char(Add_Months(To_Date('201811', 'yyyymm'), - 2), 'yyyymm')) and To_Number('201811')
     and Service_Type_Id not in ('51        ','61        ','21        ')
     and CallIndicator   not in ('SHD','SHP','SLF')
     and Basic_Call_Type_ID  in (1,3,4,5,6,7)
group by Basic_Call_Type_ID
       , Service_Type_Id
       , Total_Type_ID
       , MSISDN
       , Cell_ID
       , Location_Area_ID
       , CallIndicator
)
with data
unique primary index (Month_ID, Basic_Call_Type_ID, Service_Type_Id, Total_Type_ID, MSISDN, Cell_ID, Location_Area_ID, CallIndicator)
on commit preserve rows;
Teradata Employee

Re: Spool and performance


 Explain still shows no confidence and merge time more than 4 hours.

Don't trust the explain's time estimates, especially for merge steps; I've noticed on multiple occasions that the time projection is way higher than the actual time.

You can trust dbc.DBQLogTbl and dbc.QryLogStepsV; the latter needs to be enabled before execution.
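As a minimal sketch of how to pull actual runtimes from those views (assuming query logging is already enabled; column names are taken from the standard DBC views, so verify them on your release, and the QueryID below is a hypothetical placeholder):

```sql
-- Actual runtime and spool for your recent queries
select QueryID
     , StartTime
     , TotalElapsedTime
     , SpoolUsage / 1024**3 as Spool_GB
     , QueryText
  from dbc.DBQLogTbl
 where UserName = user
   and StartTime > current_timestamp - interval '1' day
 order by StartTime desc;

-- Per-step detail; requires step logging to be enabled beforehand,
-- e.g. BEGIN QUERY LOGGING WITH STEPINFO ON <user>;
select QueryID
     , StepLev1Num
     , StepName
     , EstProcTime
     , RowCount
  from dbc.QryLogStepsV
 where QueryID = 123456789   -- hypothetical: take it from the first query
 order by StepLev1Num;
```

Comparing EstProcTime against the step's actual start/stop times is what shows how far off the optimizer's merge-time projection was.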

 

Teradata Employee

Re: Spool and performance

Ok, please give me some time. I will update tomorrow hopefully.
Junior Contributor

Re: Spool and performance

There's another reason why spool usage is high: some of your Group By columns are oversized. For example, an MSISDN has a maximum length of 15, but is defined as VarChar(100). Any column used in a Group/Partition/Order By is expanded to its defined size (I know it's stupid), thus adding 85 bytes per row. Of course, modifying the data model is not easy, but when you know the actual max size you can use cast(msisdn as varchar(15)) to decrease spool.
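Applied to the aggregation above, that suggestion would look roughly like this (a sketch only; MSISDN's max length of 15 comes from the post, and any other length you trim should be checked against your data first):

```sql
-- Check the real maximum length before trimming:
-- select max(char_length(MSISDN)) from AGG_OUT_TR_MONTHLY_N;

select cast(MSISDN as varchar(15)) as MSISDN   -- 15 bytes per spool row instead of 100
     , sum(TOTAL_DURATION)         as TOTAL_DURATION
     , count(MSISDN)               as NO_OF_TRANSACTIONS
  from AGG_OUT_TR_MONTHLY_N
 group by 1;
```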

Teradata Employee

Re: Spool and performance

It is really surprising, if this is the case, that Group By uses the maximum defined size. What is the benefit of VarChar's size flexibility then... :( But many thanks for sharing it.

Teradata Employee

Re: Spool and performance

Total time of this is still more than 4 hours. Running it now; will update in a while.

Teradata Employee

Re: Spool and performance

Hi 

This insert completed in 1 hour 10 minutes and used 550 GB of spool.

Big difference from the explain estimate.

regards

Hina

Teradata Employee

Re: Spool and performance

What would be the benefit of the PI you mentioned for the VT? Plus, I need a PI different from this one, as mentioned earlier, to handle the next join.

Teradata Employee

Re: Spool and performance

I wanted this to focus on the group by only, avoiding any further redistribution.

I could have omitted Month_ID, as it's a constant; I'm not sure whether Teradata will redistribute on it or not in the VT.


1h10 is quite long. What are the row volumes before and after aggregation?

You have to try with the shorter columns as suggested by @dnoeth to see the difference.
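To answer the volume question, a simple sketch is to count the rows feeding the aggregation versus the rows it produces (the 201809 lower bound just mirrors the Add_Months(..., -2) computation from the earlier query; substitute your own range):

```sql
-- Rows entering the aggregation (same filters as the volatile table)
select count(*)
  from AGG_OUT_TR_MONTHLY_N
 where Month_ID between 201809 and 201811
   and Service_Type_Id not in ('51        ','61        ','21        ')
   and CallIndicator   not in ('SHD','SHP','SLF')
   and Basic_Call_Type_ID in (1,3,4,5,6,7);

-- Rows produced by the aggregation
select count(*)
  from mvt_agg_out;
```

A small reduction ratio would mean the Group By is doing little work for its cost, which changes where the tuning effort should go.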