Could you try to build the following volatile table, just to see if it aborts before the end ?
create multiset volatile table mvt_agg_out, no log AS ( select 201811 as Month_ID , Basic_Call_Type_ID , Service_Type_Id , Total_Type_ID , MSISDN , Cell_ID , Location_Area_ID , CallIndicator , sum(TOTAL_DURATION) as TOTAL_DURATION , sum(TOTAL_CALLS) as TOTAL_CALLS , sum(TOTAL_GROSS) as TOTAL_GROSS , count(MSISDN) as NO_OF_TRANSACTIONS from AGG_OUT_TR_MONTHLY_N where Month_ID between To_Number(To_Char(Add_Months(To_Date('201811', 'yyyymm'), - 2), 'yyyymm')) and To_Number('201811') and Service_Type_Id not in ('51 ','61 ','21 ') and CallIndicator not in ('SHD','SHP','SLF') and Basic_Call_Type_ID in (1,3,4,5,6,7) group by Basic_Call_Type_ID , Service_Type_Id , Total_Type_ID , MSISDN , Cell_ID , Location_Area_ID , CallIndicator ) with data unique primary index (Month_ID, Basic_Call_Type_ID, Service_Type_Id, Total_Type_ID, MSISDN, Cell_ID, Location_Area_ID, CallIndicator) on commit preserve rows;
Explain still shows no confidence and Merge time more then 4 hours.
Don't trust explain especially on merge time, I've noticed in multiple occasion that the time projection is way higher than actual time.
You can trust dbc.DBQLogTbl and dbc.QryLogStepsV - the later one needs to be enabled before execution.
There's another reason why spool usage is high, some of you Group By columns are oversized, e.g. an MSISDN has a maximum length of 15, but is defined as VarChar(100). Any column used in Group/Partition/Order By is expanded to the defined size (I know it's stupid), thus adding 85 bytes per row. Of course, modifying the data model is not easy, but when you know about the actual max size you can use cast(msisdn as varchar(15)) to decrease spool.
It is really surprsing if this is the case with group by that it use max. what will be the benefit if having varhcar or size flexibility then ... :( but many thanks for sharing it.
what would be the benefit of having PI you mentioned for VT. Plus i need to have PI different from this as mentioned earlier to handle next join.
I wanted this to focus on the group by only, avoiding any further redistribution.
I may have omited MonthID, as it's a constant not sure if Teradata will redistribute or not in the VT.
1h10 is quite long, what are the before after aggregation volumes ?
You have to try with the shorter columns as suggested by @dnoeth to see the difference.