I have an Aggregate Join Index defined on a table. Only Sums.
The table is empty. The AJI has the same Index as the basetable. The Index is almost unique considering the records I want to laod. I also defined the records as not null, nevertheless the explain shows that the optimizer is not aware of this fact.
I want to load 40 000 000 records well distributed. The load of the base table takes 5 second.
The Update of the AJI last forever. The grouping of the AJI is rather granular, so it would be populated with also 40 000 000 records.
Is it like it is or is there a possiblity to change this? Is AJI Updating record by record? This would maybe explain this long running?
I realise that you already have the AJI defined, but could you share an Explain of the 'create AJI'?
You said that the AJI has the same index as the base table, I assume you're talking about the Primary Index.
Probably more importantly, is the Primary Index of the AJI (and table) included within the GROUP BY columns? And what are the data types of the GROUP BY columns? Big VARCHARs will typically slow down aggregation processing.
might this help?
Best Regards Martin
Format:HTML Format Version:1.0 StartHTML: 165 EndHTML: 9653 StartFragment: 314 EndFragment: 9621 StartSelection: 314 EndSelection: 314SyntaxEditor Code Snippet
CREATE MULTISET TABLE f_lew_bil.t_agg_deliv_base2 ,NO FALLBACK , NO BEFORE JOURNAL, NO AFTER JOURNAL, CHECKSUM = DEFAULT, DEFAULT MERGEBLOCKRATIO ( client_id SMALLINT not null, item_sid INTEGER not null, dat_formdate DATE FORMAT 'YY/MM/DD' not null, loc_sid_store INTEGER not null, src_wh_id INTEGER, deliverytype_id INTEGER not null, orig_store_nr INTEGER not null, cy_month INTEGER not null, sig_nr VARCHAR(20) CHARACTER SET UNICODE CASESPECIFIC not null, foodaction_fg BYTEINT not null, qty_total_del DECIMAL(18,4), val_rtl_rl_del DECIMAL(18,4), val_rtl_rl_net_del DECIMAL(19,5), qty_piece_del_agg DECIMAL(18,4), val_piece_del DECIMAL(18,4), val_piece_net_del DECIMAL(19,5), val_pr_diff_del DECIMAL(18,8), val_pr_diff_net_del DECIMAL(38,10), src_fg BYTEINT)PRIMARY INDEX ( client_id ,item_sid ,loc_sid_store,dat_formdate )
Format:HTML Format Version:1.0 StartHTML: 165 EndHTML: 7633 StartFragment: 314 EndFragment: 7601 StartSelection: 314 EndSelection: 314SyntaxEditor Code Snippet
CREATE JOIN INDEX f_lew_bil.i_aji_deliv_store_day_item2 AS SELECT a.client_id ,a.loc_sid_store ,a.item_sid ,a.dat_formdate ,a.cy_month ,a.deliverytype_id ,a.orig_store_nr ,a.foodaction_fg ,SUM(a.qty_total_del) AS qty_total_del ,SUM(a.val_rtl_rl_del) AS val_rtl_rl_del ,SUM(a.val_rtl_rl_net_del) AS val_rtl_rl_net_del ,SUM(a.qty_piece_del_agg) AS qty_piece_del ,SUM(a.val_piece_del) AS val_piece_del ,SUM(a.val_piece_net_del) AS val_piece_net_del ,SUM(a.val_pr_diff_del) AS val_pr_diff_del ,SUM(a.val_pr_diff_net_del) AS val_pr_diff_net_del FROM f_lew_bil.t_agg_deliv_base2 a GROUP BY 1,2,3,4,5,6,7,8 PRIMARY INDEX(client_id, item_sid, loc_sid_store,dat_formdate)
Explain INSERT INTO f_lew_bil.t_agg_deliv_base2
the AMPs. The size of Spool 27 is estimated with no confidence to
be 17,980,076 rows (2,049,728,664 bytes). The estimated time for
this step is 4.22 seconds.
22) We do an all-AMPs JOIN step from Spool 26 (Last Use) by way of an
all-rows scan, which is joined to Spool 27 (Last Use) by way of an
all-rows scan. Spool 26 and Spool 27 are joined using a single
partition hash join, with a join condition of ("(loc_sid =
st_loc_sid) AND (client_id = client_id)"). The result goes into
Spool 13 (all_amps) (compressed columns allowed), which is
redistributed by hash code to all AMPs to all AMPs. Then we do a
SORT to order Spool 13 by row hash. The size of Spool 13 is
estimated with no confidence to be 17,980,076 rows (2,661,051,248
bytes). The estimated time for this step is 13.00 seconds.
23) We do an all-AMPs MERGE into f_lew_bil.t_agg_deliv_base2 from
Spool 13. The size is estimated with no confidence to be
17,980,076 rows. The estimated time for this step is 2 minutes
and 9 seconds.
24) We do an all-AMPs SUM step to aggregate from Spool 13 (Last Use)
by way of an all-rows scan, and the grouping identifier in field 1.
Aggregate Intermediate Results are computed globally, then placed
in Spool 31. The size of Spool 31 is estimated with no confidence
to be 13,485,057 rows (2,589,130,944 bytes). The estimated time
for this step is 51.86 seconds.
25) We do an all-AMPs RETRIEVE step from Spool 31 (Last Use) by way of
an all-rows scan into Spool 29 (all_amps) (compressed columns
allowed), which is redistributed by hash code to all AMPs to all
AMPs. Then we do a SORT to order Spool 29 by row hash. The size
of Spool 29 is estimated with no confidence to be 13,485,057 rows
(2,373,370,032 bytes). The estimated time for this step is 11.16
26) We do an all-AMPs UPDATE from
F_LEW_BIL.I_AJI_DELIV_STORE_DAY_ITEM2 by way of Spool 29 (Last
Use) with a residual condition of (
"(((F_LEW_BIL.I_AJI_DELIV_STORE_DAY_ITEM2.Field_1033 IS NULL) AND
(F_LEW_BIL.I_AJI_DELIV_STORE_DAY_ITEM2.Field_1033 IS NULL )) OR
F_LEW_BIL.I_AJI_DELIV_STORE_DAY_ITEM2.Field_1033 )) AND
((((F_LEW_BIL.I_AJI_DELIV_STORE_DAY_ITEM2.Field_1032 IS NULL) AND
(F_LEW_BIL.I_AJI_DELIV_STORE_DAY_ITEM2.Field_1032 IS NULL )) OR
F_LEW_BIL.I_AJI_DELIV_STORE_DAY_ITEM2.Field_1032 )) AND
((((F_LEW_BIL.I_AJI_DELIV_STORE_DAY_ITEM2.Field_1031 IS NULL) AND
(F_LEW_BIL.I_AJI_DELIV_STORE_DAY_ITEM2.Field_1031 IS NULL )) OR
F_LEW_BIL.I_AJI_DELIV_STORE_DAY_ITEM2.Field_1031 )) AND
((((F_LEW_BIL.I_AJI_DELIV_STORE_DAY_ITEM2.Field_1030 IS NULL) AND
(F_LEW_BIL.I_AJI_DELIV_STORE_DAY_ITEM2.Field_1030 IS NULL )) OR
F_LEW_BIL.I_AJI_DELIV_STORE_DAY_ITEM2.Field_1030 )) AND
((((F_LEW_BIL.I_AJI_DELIV_STORE_DAY_ITEM2.Field_1029 IS NULL) AND
(F_LEW_BIL.I_AJI_DELIV_STORE_DAY_ITEM2.Field_1029 IS NULL )) OR
F_LEW_BIL.I_AJI_DELIV_STORE_DAY_ITEM2.Field_1029 )) AND
((((F_LEW_BIL.I_AJI_DELIV_STORE_DAY_ITEM2.Field_1028 IS NULL) AND
(F_LEW_BIL.I_AJI_DELIV_STORE_DAY_ITEM2.Field_1028 IS NULL )) OR
F_LEW_BIL.I_AJI_DELIV_STORE_DAY_ITEM2.Field_1028 )) AND
((((F_LEW_BIL.I_AJI_DELIV_STORE_DAY_ITEM2.Field_1027 IS NULL) AND
(F_LEW_BIL.I_AJI_DELIV_STORE_DAY_ITEM2.Field_1027 IS NULL )) OR
F_LEW_BIL.I_AJI_DELIV_STORE_DAY_ITEM2.Field_1027 )) AND
(((F_LEW_BIL.I_AJI_DELIV_STORE_DAY_ITEM2.Field_1026 IS NULL) AND
(F_LEW_BIL.I_AJI_DELIV_STORE_DAY_ITEM2.Field_1026 IS NULL )) OR
F_LEW_BIL.I_AJI_DELIV_STORE_DAY_ITEM2.Field_1026 ))))))))"). The
size is estimated with no confidence to be 13,485,057 rows. The
estimated time for this step is 1 minute and 36 seconds.
27) We spoil the parser's dictionary cache for the table.
28) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> No rows are returned to the user as the result of statement 1.
Thanks. So no obvious problems with data types or indexing.
One possibility comes to mind. Is it feasible within your processing to drop/create the AJI each time you populate the base table? I think this would avoid transient journaling on the 'AJI build' part of the processing.
In your original post you asked "Is AJI Updating record by record?" - No.In your explain plan it looks like step#26 is where the AJI is updated, i.e. a single step.
What release of TD are you on?
as you mentioned whether all records of the PI are included in the grouping columns, I took all grouping columns of the AJI as PI columns of the AJI and then it worked. :-)
I'm on 15.10.
Nevertheless it's a strange behaviour. Maybe Teradata is doing a product join if "to less columns" out of the partioning columns are part of the PI??
No good explanation. But with all columns it work. Thanks for your help!!! I will be here tomorrow (Austrian time :-))
Well, if it's working for you then that is good.
I don't understand why making the PI = GROUP BY columns should speed this up. Normally it is good enough to have the PI columns as part of the GROUP BY columns.
The only thing that immediately comes to mind is in your original definition the number of rows per PI data value. This might bring up a need for 'duplicate row checking', but (A) I don't know if that happens for AJI's (in principle it doesn't for AJI's but I don't know) and (B) from your original post the granularity of rows in the AJi is about the same as for the base table - and that builds in 5 seconds.
(Another) one of life's mysteries...
Hi Martin,Thanks for the info. Yes, to know why this difference exists would be interesting.