Let's talk about Aggregate Join Indexes (AJI)

Blog
The best minds from Teradata, our partners, and customers blog about whatever takes their fancy.

It's about time to start considering the use of Aggregate Join Indexes (AJI) instead of aggregate summary tables and/or aggregate views.

Teradata 13 contains new cool enhancements on the Teradata Optimizer to improve the utilization of the AJI's as listed below. But I have been using successfully AJI's with great results on V2R6 and Teradata 12 too.

  1. Cost-based query rewrite using Aggregate Join Index
    • Teradata Optimizer will determine, on a cost basis, whether a query will be rewritten to leverage existing AJI's in its query planning automatically.
  2. Use AJI's for internal processing of aggregates
    • Teradata Optimizer will automatically consider and leverage AJI's in the course of internally processing SQL Distinct and Extended Grouping aggregate operations.
  3. Use AJI's for queries with subquery and spooled derived tables/views
    • Teradata Optimizer will automatically consider and leverage AJI's to cover queries which contain a subquery and/or a spooled derived table.
  4. Use AJI's to partially cover an outer join query
    • Teradata Optimizer will automatically consider and leverage AJI's to partially cover outer join queries to be applied on either the outer tables in an outer join query or the inner tables in an outer join query.
  5. Partial Group By support for Join/Aggregate Join Indexes
    • Teradata Optimizer will automatically consider Partial/Early Group By to reduce the number of rows as early as possible in query execution plans which contain Aggregate Join Indexes in addition to base tables.
  6. Preserve Column Compression in a Join Index
    • The column compression information will be implicitly carried over from the base table columns to the Join Index.

Let's talk about all the steps that need to be considered during an AJI implementation.

  1. Reporting Layer
  2. Referential Integrity
  3. Single and/or Multi-Level AJI
  4. Load Process - Important Considerations
  5. Study Case - Retail Industry

Reporting Layer

  • Consider creating a report layer containing tables and views that will be associated with the new AJI's.
  • A new report layer will provide flexibility and will avoid changes on the existing base tables and associated load processes.
  • A new report layer will allow referential integrity to be added to the tables associated with the new AJI's more easily.
  • Consider creating a dimensional model containing hierarchy and fat tables for a multi-level AJI implementation.

Referential Integrity

  • Need to be defined on all tables associated with the new AJI's.
  • Best practices recommends to define all RI’s with “REFERENCES WITH NO CHECK OPTION” to avoid overhead during load process.
  • Load process will need to have an extra step to enforce the RI's on all tables associated with the new AJI’s.
  • The Teradata Optimizer uses the RI's to optimize the join plans.
  • The parent key columns must be either a unique primary index (UPI) or a unique secondary index (USI).
  • The foreign and parent keys must have the same number of columns and their data types must match.

Single and/or Multi-Level AJI

  • AJI's need to be built based on report requirements to cover a set of reports.
  • Start simple with a single table AJI on top of the FAT table associated with the report requirements.
  • The Optimizer will use the single level AJI to build a Multi-Level AJI associated with the same FAT table.
  • Start simple with no more than three the levels of multi-level AJI’s (lower, medium and high level).
  • The Optimizer will use the lower level AJI to build a next level AJI.
  • Define all necessary columns on the AJI's to cover the reports.
  • Avoid adding unnecessary columns to the AJI's.
  • The Optimizer doesn't join back to the tables defined on the AJI.
  • The Optimizer will only uses an AJI if it covers the query or if it can uses the AJI to join to other tables to cover the query.
  • Always define a multi-level AJI with the foreign key of the next level of the hierarchy.
  • A multi-level AJI can always go up (not down) on the hierarchy if it is defined properly.
  • The Optimizer always chooses an AJI at the same level or a lower level to cover a query.
  • Define a PPI/MPPI AJI if the Fat table is a PPI/MPPI table with the same PPI/MPPI expression.
  • A new PPI/MPPI AJI will need to be built to add new partitions to an existing PPI/MPPI AJI.
  • The Optimizer usually performs a “Product Join Enhanced by Dynamic Partition Elimination” when joining to a PPI/MPPI AJI.
  • Define the PI of an AJI based on the columns that provides the best distribution.
  • Consider a long maintenance window to build AJI’s especially with a large aggregation volume and/or a complex aggregation formula.
  • The aggregation step is usually 90% of the overall building time of an AJI.
  • The spool file of the aggregation step can use up to 5 times of the size of the Fat table. This means to build an AJI of a 2TB Fat table can use up to 10TB of spool space.

Load Process - Important Considerations

  • Teradata utilities can not be used on the reporting layer where the AJI’s are defined.
  • Consider creating three distinct load processes associated with the reporting layer:
    • Maintenance of reporting layer tables/columns not defined on the AJI’s.
    • Maintenance of reporting layer tables/columns defined on the AJI’s.

    • Maintenance of referential integrity.

  • Check for changes (insert, update, delete records) on the reporting layer tables/columns where the AJI’s are defined before applying it.
  • Consider creating working tables containing the insert, update delete records of the reporting layer tables/columns where the AJI’s are defined as part of the load process.
  • Consider having three distinct steps on the load process to insert, to update and to delete records on the reporting layer tables/columns where the AJI’s are defined.
  • The Optimizer performs two steps to maintain an AJI.
    • Merge Delete.
    • Merge Insert.
  • The Optimizer performs all join steps plus the aggregation step associated with a Multi-Level AJI twice to build two distinct spool files to apply the merge delete and the merge insert steps.
  • A multi-level AJI maintenance process usually takes a significant amount of time and system resources.
  • Every load process associated with any table defined on an AJI will trigger the AJI maintenance process.
  • Consider dropping AJI’s during large volume changes on the reporting layer tables/columns where the AJI’s are defined.
  • Consider running delete/drop partitions of the FAT tables where the AJI’s are defined once a week to minimize the impact on the system.
  • Consider dropping/re-creating AJI’s on a daily basis to not impact the daily load process if it is possible. But watch out for blocking on the system.

Study Case - Retail Industry

  • Global Reporting Solution on a Cognos/Teradata environment.
  • Three years of Inventory and Sales Detail data.
  • Year to date, Month to date, Week to date, Previous day, Last year/Current year, Rolling three years, and Ad Hoc reports.
  • New reporting layer (Semantic layer) including Inventory and Sales Detail historical tables and Item, Location and Line of Business hierarchies.
  • New Referential Integrity defined on all reporting tables to cover all levels of the reporting requirements.
  • New AJI’s defined on different levels to cover all reporting requirements.
  • Results

Reporting Layer – Dimensional Model – Hierarchies

Reporting Layer – Dimensional Model – Sales Detail

Reporting Layer – Dimensional Model – Inventory

Referential Integrity – Item

CREATE MULTISET TABLE RPT_ITEM 
(ITEM_ID INTEGER NOT NULL,
ITEM_SUBCLASS_ID INTEGER NOT NULL,
....
CONSTRAINT ISC1 FOREIGN KEY ( ITEM_SUBCLASS_ID ) REFERENCES WITH NO CHECK OPTION RPT_ITEM_SUBCLASS ( ITEM_SUBCLASS_ID ))
UNIQUE PRIMARY INDEX ( ITEM_ID )
INDEX ( ITEM_SUBCLASS_ID );

CREATE MULTISET TABLE RPT_ITEM_SUBCLASS 
(ITEM_SUBCLASS_ID INTEGER NOT NULL,
ITEM_CLASS_ID SMALLINT NOT NULL,
....
CONSTRAINT IC1 FOREIGN KEY ( ITEM_CLASS_ID ) REFERENCES WITH NO CHECK OPTION RPT_ITEM_CLASS ( ITEM_CLASS_ID ))
UNIQUE PRIMARY INDEX ( ITEM_SUBCLASS_ID )
INDEX RPT_ITEM_SUBCLASS_NUSI1 ( ITEM_CLASS_ID );

CREATE MULTISET TABLE RPT_ITEM_CLASS 
(ITEM_CLASS_ID SMALLINT NOT NULL,
ITEM_DEPARTMENT_ID SMALLINT NOT NULL,
....
CONSTRAINT ID1 FOREIGN KEY ( ITEM_DEPARTMENT_ID ) REFERENCES WITH NO CHECK OPTION RPT_ITEM_DEPARTMENT ( ITEM_DEPARTMENT_ID ))
UNIQUE PRIMARY INDEX ( ITEM_CLASS_ID )
INDEX RPT_ITEM_CLASS_NUSI1 ( ITEM_DEPARTMENT_ID );

CREATE MULTISET TABLE RPT_ITEM_DEPARTMENT 
(ITEM_DEPARTMENT_ID SMALLINT NOT NULL,
ITEM_GROUP_ID SMALLINT NOT NULL,
....
CONSTRAINT IG1 FOREIGN KEY ( ITEM_GROUP_ID ) REFERENCES WITH NO CHECK OPTION RPT_ITEM_GROUP ( ITEM_GROUP_ID ))
UNIQUE PRIMARY INDEX ( ITEM_DEPARTMENT_ID )
INDEX RPT_ITEM_DEPARTMENT_NUSI1 ( ITEM_GROUP_ID );

CREATE MULTISET TABLE RPT_ITEM_GROUP 
(ITEM_GROUP_ID SMALLINT NOT NULL,
ITEM_DIVISION_ID SMALLINT NOT NULL,
....
CONSTRAINT IDV1 FOREIGN KEY ( ITEM_DIVISION_ID ) REFERENCES WITH NO CHECK OPTION RPT_ITEM_DIVISION ( ITEM_DIVISION_ID ))
UNIQUE PRIMARY INDEX ( ITEM_GROUP_ID )
INDEX RPT_ITEM_GROUP_NUSI1 ( ITEM_DIVISION_ID );

CREATE MULTISET TABLE RPT_ITEM_DIVISION
(ITEM_DIVISION_ID   SMALLINT NOT NULL,
 COMPANY_ID   SMALLINT NOT NULL,
....
)
UNIQUE PRIMARY INDEX ( ITEM_DIVISION_ID )
INDEX RPT_ITEM_DIVISION_NUSI1 ( COMPANY_ID );

Referential Integrity – Location

CREATE MULTISET TABLE RPT_LOCATION 
(LOCATION_ID SMALLINT NOT NULL,
DISTRICT_ID SMALLINT NOT NULL,
....
CONSTRAINT LD1 FOREIGN KEY ( DISTRICT_ID ) REFERENCES WITH NO CHECK OPTION RPT_LOC_DISTRICT ( DISTRICT_ID ))
UNIQUE PRIMARY INDEX ( LOCATION_ID )
INDEX ( DISTRICT_ID );

CREATE MULTISET TABLE RPT_LOC_DISTRICT 
(DISTRICT_ID SMALLINT NOT NULL,
REGION_ID SMALLINT NOT NULL,
....
CONSTRAINT LR1 FOREIGN KEY ( REGION_ID ) REFERENCES WITH NO CHECK OPTION RPT_LOC_REGION ( REGION_ID ))
UNIQUE PRIMARY INDEX ( DISTRICT_ID )
INDEX RPT_LOC_DISTRICT_NUSI1 ( REGION_ID );

CREATE MULTISET TABLE RPT_LOC_REGION 
(REGION_ID SMALLINT NOT NULL,
AREA_ID SMALLINT NOT NULL,
....
CONSTRAINT LA1 FOREIGN KEY ( AREA_ID ) REFERENCES WITH NO CHECK OPTION RPT_LOC_AREA ( AREA_ID ))
UNIQUE PRIMARY INDEX ( REGION_ID )
INDEX RPT_LOC_REGION_NUSI1 ( AREA_ID );

CREATE MULTISET TABLE RPT_LOC_AREA 
(AREA_ID SMALLINT NOT NULL,
CHAIN_ID SMALLINT NOT NULL,
....
CONSTRAINT LC1 FOREIGN KEY ( CHAIN_ID ) REFERENCES WITH NO CHECK OPTION RPT_LOC_CHAIN ( CHAIN_ID ))
UNIQUE PRIMARY INDEX ( AREA_ID )
INDEX RPT_LOC_AREA_NUSI1 ( CHAIN_ID );

CREATE MULTISET TABLE RPT_LOC_CHAIN 
(CHAIN_ID SMALLINT NOT NULL,
COMPANY_ID SMALLINT NOT NULL,
....
CONSTRAINT LCO1 FOREIGN KEY ( COMPANY_ID ) REFERENCES WITH NO CHECK OPTION RPT_LOC_COMPANY ( COMPANY_ID ))
UNIQUE PRIMARY INDEX ( CHAIN_ID )
INDEX ( COMPANY_ID );

CREATE MULTISET TABLE RPT_LOC_COMPANY 
(COMPANY_ID SMALLINT NOT NULL,
....
)
UNIQUE PRIMARY INDEX ( COMPANY_ID );

Referential Integrity – Line of Business

CREATE MULTISET TABLE RPT_LOB_BASE 
(RPT_LOB_ID CHAR(2) NOT NULL,
SALES_LOCATION_ID SMALLINT NOT NULL,
RPT_LOB_CHILD_ID CHAR(2) NOT NULL,
....
CONSTRAINT LC1 FOREIGN KEY ( RPT_LOB_CHILD_ID ) REFERENCES WITH NO CHECK OPTION RPT_LOB_CHILD ( RPT_LOB_CHILD_ID ))
UNIQUE PRIMARY INDEX ( SALES_LOCATION_ID ,LOB_ID )
INDEX ( RPT_LOB_CHILD_ID );

CREATE MULTISET TABLE RPT_LOB_CHILD 
(RPT_LOB_CHILD_ID CHAR(2) NOT NULL,
RPT_LOB_SUBPARENT_ID CHAR(2) NOT NULL,
....
CONSTRAINT LS1 FOREIGN KEY ( RPT_LOB_SUBPARENT_ID ) REFERENCES WITH NO CHECK OPTION RPT_LOB_SUBPARENT ( RPT_LOB_SUBPARENT_ID ))
UNIQUE PRIMARY INDEX ( RPT_LOB_CHILD_ID )
INDEX ( RPT_LOB_SUBPARENT_ID );

CREATE MULTISET TABLE RPT_LOB_SUBPARENT 
(RPT_LOB_SUBPARENT_ID CHAR(2) NOT NULL,
RPT_LOB_PARENT_ID CHAR(2) NOT NULL,
....
CONSTRAINT LP1 FOREIGN KEY ( RPT_LOB_PARENT_ID ) REFERENCES WITH NO CHECK OPTION RPT_LOB_PARENT ( RPT_LOB_PARENT_ID ))
UNIQUE PRIMARY INDEX ( RPT_LOB_SUBPARENT_ID )
INDEX ( RPT_LOB_PARENT_ID );

CREATE MULTISET TABLE RPT_LOB_PARENT 
(RPT_LOB_PARENT_ID CHAR(2) NOT NULL,
....
)
UNIQUE PRIMARY INDEX ( RPT_LOB_PARENT_ID );

Referential Integrity – Sales Detail

CREATE MULTISET TABLE SALES_DETAIL 
(CUST_ORD_ID CHAR(24) NOT NULL,
BUS_UNIT_CD CHAR(2) NOT NULL,
BUSINESSDAY_DT INTEGER NOT NULL,
ITEM_ID INTEGER NOT NULL,
SALES_LOCATION_ID SMALLINT NOT NULL,
INVENTORY_LOCATION_ID SMALLINT NOT NULL,
VENDOR_ID INTEGER NOT NULL,
....
CONSTRAINT ITM1 FOREIGN KEY ( ITEM_ID ) REFERENCES WITH NO CHECK OPTION RPT_ITEM ( ITEM_ID ),
CONSTRAINT LOC2 FOREIGN KEY ( SALES_LOCATION_ID ) REFERENCES WITH NO CHECK OPTION RPT_LOCATION ( LOCATION_ID ),
CONSTRAINT CLD3 FOREIGN KEY ( BUSINESSDAY_DT ,SALES_LOCATION_ID ) REFERENCES WITH NO CHECK OPTION RPT_COMP_LOCATION_DAY (FISCAL_DATE_ID ,LOCATION_ID ),
CONSTRAINT CLD4 FOREIGN KEY ( BUSINESSDAY_DT ,SALES_LOCATION_ID ) REFERENCES WITH NO CHECK OPTION RPT_COMP_LOCATION_DAY ( FISCAL_DATE_LAST_YR_ID ,LOCATION_ID ),
CONSTRAINT LB5 FOREIGN KEY ( BUS_UNIT_CD ,SALES_LOCATION_ID ) REFERENCES WITH NO CHECK OPTION RPT_LOB_BASE ( LOB_ID ,SALES_LOCATION_ID ),
CONSTRAINT IXLV6 FOREIGN KEY ( ITEM_ID ,SALES_LOCATION_ID ) REFERENCES WITH NO CHECK OPTION RPT_ITEM_X_LOCATION_VENDOR ( ITEM_ID, LOCATION_ID ),
CONSTRAINT IXLV7 FOREIGN KEY ( ITEM_ID ,INVENTORY_LOCATION_ID ) REFERENCES WITH NO CHECK OPTION RPT_ITEM_X_LOCATION_VENDOR ( ITEM_ID ,LOCATION_ID ))
PRIMARY INDEX SALES_DETAIL_NUPI ( CUST_ORD_ID )
PARTITION BY RANGE_N(BUSINESSDAY_DT BETWEEN
2008011 AND 2008527 EACH 1 ,
2009011 AND 2009527 EACH 1 ,
2010011 AND 2010527 EACH 1 ,
2011011 AND 2011527 EACH 1 ,
2011011 AND 2011527 EACH 1 ,
NO RANGE);

Referential Integrity – Inventory

CREATE MULTISET TABLE RPT_LOC_INV_HST
(LOCATION_ID SMALLINT NOT NULL,
ITEM_ID INTEGER NOT NULL,
INV_DT DATE FORMAT 'YYYY-MM-DD' NOT NULL,
....
CONSTRAINT ITM1 FOREIGN KEY ( ITEM_ID ) REFERENCES WITH NO CHECK OPTION RPT_ITEM ( ITEM_ID ),
CONSTRAINT LOC2 FOREIGN KEY ( LOCATION_ID ) REFERENCES WITH NO CHECK OPTION RPT_LOCATION ( LOCATION_ID ),
CONSTRAINT CLD3 FOREIGN KEY ( LOCATION_ID ,INV_DT ) REFERENCES WITH NO CHECK OPTION RPT_COMP_LOCATION_DAY ( LOCATION_ID ,DT ),
CONSTRAINT CLD4 FOREIGN KEY ( LOCATION_ID ,INV_DT ) REFERENCES WITH NO CHECK OPTION RPT_COMP_LOCATION_DAY ( LOCATION_ID ,FISCAL_DATE_LAST_YR_DT ),
CONSTRAINT IXLV5 FOREIGN KEY ( LOCATION_ID ,ITEM_ID ) REFERENCES WITH NO CHECK OPTION RPT_ITEM_X_LOCATION_VENDOR ( LOCATION_ID ,ITEM_ID ))
PRIMARY INDEX ( LOCATION_ID ,ITEM_ID )
PARTITION BY RANGE_N(INV_DT BETWEEN
'2007-08-07' AND '2008-06-21' EACH INTERVAL '7' DAY ,
'2008-06-22' AND '2012-01-01' EACH INTERVAL '1' DAY ,
NO RANGE);

Referential Integrity – Sales Detail & Inventory

CREATE MULTISET TABLE RPT_ITEM_X_LOCATION_VENDOR 
(ITEM_ID INTEGER NOT NULL,
LOCATION_ID SMALLINT NOT NULL,
....
CONSTRAINT ITM1 FOREIGN KEY ( ITEM_ID ) REFERENCES WITH NO CHECK OPTION RPT_ITEM ( ITEM_ID ),
CONSTRAINT LOC2 FOREIGN KEY ( LOCATION_ID ) REFERENCES WITH NO CHECK OPTION RPT_LOCATION ( LOCATION_ID ))
UNIQUE PRIMARY INDEX ( ITEM_ID ,LOCATION_ID );

CREATE MULTISET TABLE RPT_COMP_LOCATION_DAY 
(LOCATION_ID SMALLINT NOT NULL,
COMP_SALES_IND CHAR(1) NOT NULL,
FISCAL_DATE_ID INTEGER NOT NULL,
FISCAL_DATE_LAST_YR_ID INTEGER NOT NULL,
DT DATE FORMAT 'YYYY-MM-DD' NOT NULL,
FISCAL_DATE_LAST_YR_DT DATE FORMAT 'YYYY-MM-DD' NOT NULL,
....
CONSTRAINT CLDC1 FOREIGN KEY ( COMP_SALES_IND ,DT ) REFERENCES WITH NO CHECK OPTION RPT_COMP_LOCATION_DAY_CONV ( COMP_SALES_IND ,DT ),
CONSTRAINT CLDC2 FOREIGN KEY ( COMP_SALES_IND ,FISCAL_DATE_ID ) REFERENCES WITH NO CHECK OPTION RPT_COMP_LOCATION_DAY_CONV ( COMP_SALES_IND ,FISCAL_DATE_ID ))
UNIQUE PRIMARY INDEX ( LOCATION_ID ,DT );

CREATE MULTISET TABLE RPT_COMP_LOCATION_DAY_CONV 
(COMP_SALES_IND CHAR(1) NOT NULL,
FISCAL_DATE_ID INTEGER NOT NULL,
DT DATE FORMAT 'YYYY-MM-DD' NOT NULL,
....
CONSTRAINT CSC1 FOREIGN KEY ( COMP_SALES_CD ) REFERENCES WITH NO CHECK OPTION RPT_COMP_SALES_CODE ( COMP_SALES_CD ))
UNIQUE PRIMARY INDEX ( COMP_SALES_IND ,DT )
UNIQUE INDEX COMP_LOCATION_DAY_USI1 ( COMP_SALES_IND ,FISCAL_DATE_ID );

Single/Multi-Level AJI's – Sales Detail

  • Single Level AJI – Sales Detail
CREATE JOIN INDEX SALES_DETAIL_AJI1 
AS
SELECT COUNT(*)(FLOAT, NAMED CountStar )
,SALES_LOCATION_ID
,INVENTORY_LOCATION_ID
,ITEM_ID
,BUSINESSDAY_DT
,VENDOR_ID
,SLS_TYP_CD
....
FROM SALES_DETAIL
GROUP BY ....
PRIMARY INDEX ( SALES_LOCATION_ID ,ITEM_ID ,BUSINESSDAY_DT )
PARTITION BY RANGE_N(BUSINESSDAY_DT BETWEEN
2008011 AND 2008527 EACH 1 ,
2009011 AND 2009527 EACH 1 ,
2010011 AND 2010527 EACH 1 ,
2011011 AND 2011527 EACH 1 ,
NO RANGE)
INDEX ( SALES_LOCATION_ID ,ITEM_ID )
INDEX ( INVENTORY_LOCATION_ID ,ITEM_ID )
INDEX ( ITEM_ID ,BUSINESSDAY_DT )
INDEX ( INVENTORY_LOCATION_ID ,BUSINESSDAY_DT )
INDEX ( SALES_LOCATION_ID ,BUSINESSDAY_DT )
INDEX ( BUSINESSDAY_DT ,VENDOR_ID );
  • Multi-Level AJI – Item/Location Region/LOB Child Level
CREATE JOIN INDEX SALES_DETAIL_AJI2
AS
SELECT COUNT(*)(FLOAT, NAMED CountStar )
,LR.REGION_ID
,LR.AREA_ID
,SD.ITEM_ID
,LC.RPT_LOB_SUBPARENT_ID
,SD.SLS_TYP_CD
,SD.BUSINESSDAY_DT
,SD.VENDOR_ID
....
FROM SALES_DETAIL SD ,RPT_LOB_BASE LB
,RPT_LOB_CHILD LC ,RPT_LOCATION LOC
,RPT_LOC_DISTRICT LD ,RPT_LOC_REGION LR
,RPT_ITEM_X_LOCATION_VENDOR IXL
,RPT_COMP_LOCATION_DAY COMP
,RPT_COMP_LOCATION_DAY_CONV CONV
WHERE SD.BUS_UNIT_CD = LB.LOB_ID
AND SD.SALES_LOCATION_ID = LB.SALES_LOCATION_ID
AND LB.RPT_LOB_CHILD_ID = LC.RPT_LOB_CHILD_ID
AND SD.SALES_LOCATION_ID = LOC.LOCATION_ID
AND LOC.DISTRICT_ID = LD.DISTRICT_ID
AND LD.REGION_ID = LR.REGION_ID
AND SD.ITEM_ID = IXL.ITEM_ID
AND SD.SALES_LOCATION_ID = IXL.LOCATION_ID
AND SD.SALES_LOCATION_ID = COMP.LOCATION_ID
AND SD.BUSINESSDAY_DT = COMP.FISCAL_DATE_ID
AND CONV.COMP_SALES_IND = COMP.COMP_SALES_IND
AND CONV.DT = COMP.DT
GROUP BY ....
PRIMARY INDEX ( REGION_ID, AREA_ID ,ITEM_ID ,BUSINESSDAY_DT ,SLS_TYP_CD )
PARTITION BY RANGE_N(BUSINESSDAY_DT BETWEEN
2008011 AND 2008527 EACH 1 ,
2009011 AND 2009527 EACH 1 ,
2010011 AND 2010527 EACH 1 ,
2011011 AND 2011527 EACH 1 ,
NO RANGE)
INDEX ( ITEM_ID ,BUSINESSDAY_DT );
  • Multi-Level AJI – Item Subclass/Location/LOB Child Level
CREATE JOIN INDEX SALES_DETAIL_AJI3 
AS
SELECT COUNT(*)(FLOAT, NAMED COUNTSTAR )
,LOC.DISTRICT_ID
,ISC.ITEM_SUBCLASS_ID
,ISC.ITEM_CLASS_ID
,LC.RPT_LOB_SUBPARENT_ID
,LOC.LOCATION_SOURCE_REF_NUM
,SD.BUSINESSDAY_DT
,SD.VENDOR_ID
,SD.SLS_TYP_CD
....
FROM SALES_DETAIL SD ,RPT_LOB_BASE LB ,
RPT_LOB_CHILD LC ,RPT_ITEM ITM ,
RPT_ITEM_SUBCLASS ISC ,RPT_LOCATION LOC ,
RPT_ITEM_X_LOCATION_VENDOR IXL ,
RPT_COMP_LOCATION_DAY COMP ,
RPT_COMP_LOCATION_DAY_CONV CONV
WHERE SD.BUS_UNIT_CD = LB.LOB_ID
AND SD.SALES_LOCATION_ID = LB.SALES_LOCATION_ID
AND LB.RPT_LOB_CHILD_ID = LC.RPT_LOB_CHILD_ID
AND ITM.ITEM_ID = SD.ITEM_ID
AND ISC.ITEM_SUBCLASS_ID = ITM.ITEM_SUBCLASS_ID
AND SD.SALES_LOCATION_ID = LOC.LOCATION_ID
AND SD.ITEM_ID = IXL.ITEM_ID
AND SD.SALES_LOCATION_ID = IXL.LOCATION_ID
AND SD.SALES_LOCATION_ID = COMP.LOCATION_ID
AND SD.BUSINESSDAY_DT = COMP.FISCAL_DATE_ID
AND CONV.COMP_SALES_IND = COMP.COMP_SALES_IND
AND CONV.DT = COMP.DT
GROUP BY ....
PRIMARY INDEX ( LOCATION_SOURCE_REF_NUM ,DISTRICT_ID ,ITEM_SUBCLASS_ID
,ITEM_CLASS_ID ,BUSINESSDAY_DT ,VENDOR_ID ,SLS_TYP_CD )
PARTITION BY RANGE_N(BUSINESSDAY_DT BETWEEN
2008011 AND 2008527 EACH 1 ,
2009011 AND 2009527 EACH 1 ,
2010011 AND 2010527 EACH 1 ,
2011011 AND 2011527 EACH 1 ,
NO RANGE)
INDEX ( ITEM_CLASS_ID ,BUSINESSDAY_DT );
  • Multi-Level AJI – Item Subclass/Location Region/LOB Child Level
CREATE JOIN INDEX SALES_DETAIL_AJI4 
AS
SELECT COUNT(*)(FLOAT, NAMED CountStar )
,LR.REGION_ID
,LR.AREA_ID
,ISC.ITEM_SUBCLASS_ID
,ISC.ITEM_CLASS_ID
,LC.RPT_LOB_SUBPARENT_ID
,SD.BUSINESSDAY_DT
,SD.VENDOR_ID
,SD.SLS_TYP_CD
....
FROM SALES_DETAIL SD ,RPT_LOB_BASE LB ,
RPT_LOB_CHILD LC ,RPT_ITEM ITM ,
RPT_ITEM_SUBCLASS ISC ,RPT_LOCATION LOC ,
RPT_LOC_DISTRICT LD ,RPT_LOC_REGION LR ,
RPT_ITEM_X_LOCATION_VENDOR IXL ,
RPT_COMP_LOCATION_DAY COMP ,
RPT_COMP_LOCATION_DAY_CONV CONV
WHERE SD.BUS_UNIT_CD = LB.LOB_ID
AND SD.SALES_LOCATION_ID = LB.SALES_LOCATION_ID
AND LB.RPT_LOB_CHILD_ID = LC.RPT_LOB_CHILD_ID
AND ITM.ITEM_ID = SD.ITEM_ID
AND ISC.ITEM_SUBCLASS_ID = ITM.ITEM_SUBCLASS_ID
AND SD.SALES_LOCATION_ID = LOC.LOCATION_ID
AND LOC.DISTRICT_ID = LD.DISTRICT_ID
AND LD.REGION_ID = LR.REGION_ID
AND SD.ITEM_ID = IXL.ITEM_ID
AND SD.SALES_LOCATION_ID = IXL.LOCATION_ID
AND SD.SALES_LOCATION_ID = COMP.LOCATION_ID
AND SD.BUSINESSDAY_DT = COMP.FISCAL_DATE_ID
AND CONV.COMP_SALES_IND = COMP.COMP_SALES_IND
AND CONV.DT = COMP.DT
GROUP BY ....
PRIMARY INDEX ( REGION_ID ,AREA_ID ,ITEM_SUBCLASS_ID ,
ITEM_CLASS_ID ,BUSINESSDAY_DT ,VENDOR_ID ,SLS_TYP_CD )
PARTITION BY RANGE_N(BUSINESSDAY_DT BETWEEN
2008011 AND 2008527 EACH 1 ,
2009011 AND 2009527 EACH 1 ,
2010011 AND 2010527 EACH 1 ,
2011011 AND 2011527 EACH 1 ,
NO RANGE)
INDEX ( ITEM_CLASS_ID ,BUSINESSDAY_DT );

Multi-Level AJI's – Inventory

  • Multi-Level AJI – Item/Location Region Level
CREATE JOIN INDEX RPT_LOC_INV_HST_AJI2
AS
SELECT COUNT(*)(FLOAT, NAMED CountStar )
,LR.REGION_ID
,LR.AREA_ID
,INV.ITEM_ID
,INV.INV_DT
,IXL.REPLENISH_TYPE_CD
,IXL.VENDOR_ID
....
FROM RPT_LOC_INV_HST INV ,RPT_LOCATION LOC ,
RPT_LOC_DISTRICT LD ,RPT_LOC_REGION LR ,
RPT_ITEM_X_LOCATION_VENDOR IXL ,
RPT_COMP_LOCATION_DAY COMP ,
RPT_COMP_LOCATION_DAY_CONV CONV
WHERE INV.LOCATION_ID = LOC.LOCATION_ID
AND LOC.DISTRICT_ID = LD.DISTRICT_ID
AND LD.REGION_ID = LR.REGION_ID
AND INV.ITEM_ID = IXL.ITEM_ID
AND INV.LOCATION_ID = IXL.LOCATION_ID
AND INV.LOCATION_ID = COMP.LOCATION_ID
AND INV.INV_DT = COMP.DT
AND CONV.COMP_SALES_IND = COMP.COMP_SALES_IND
AND CONV.DT = COMP.DT
GROUP BY ....
PRIMARY INDEX ( REGION_ID ,AREA_ID ,ITEM_ID ,
INV_DT ,REPLENISH_TYPE_CD ,VENDOR_ID )
PARTITION BY RANGE_N(INV_DT BETWEEN
'2007-08-07' AND '2008-06-21' EACH INTERVAL '7' DAY ,
'2008-06-22' AND '2012-01-01' EACH INTERVAL '1' DAY ,
NO RANGE);
  • Multi-Level AJI – Item Subclass/Location Level
CREATE JOIN INDEX RPT_LOC_INV_HST_AJI3 
AS
SELECT COUNT(*)(FLOAT, NAMED CountStar )
,LOC.DISTRICT_ID
,ISC.ITEM_SUBCLASS_ID
,ISC.ITEM_CLASS_ID
,INV.INV_DT
,LOC.LOCATION_SOURCE_REF_NUM
,IXL.REPLENISH_SUBTYPE_CD
,IXL.VENDOR_ID
,CONV.COMP_SALES_CD
....
FROM RPT_LOC_INV_HST INV ,RPT_LOCATION LOC ,
RPT_ITEM ITM ,RPT_ITEM_SUBCLASS ISC ,
RPT_ITEM_X_LOCATION_VENDOR IXL ,
RPT_COMP_LOCATION_DAY COMP ,
RPT_COMP_LOCATION_DAY_CONV CONV
WHERE INV.LOCATION_ID = LOC.LOCATION_ID
AND INV.ITEM_ID = ITM.ITEM_ID
AND ITM.ITEM_SUBCLASS_ID = ISC.ITEM_SUBCLASS_ID
AND INV.ITEM_ID = IXL.ITEM_ID
AND INV.LOCATION_ID = IXL.LOCATION_ID
AND INV.LOCATION_ID = COMP.LOCATION_ID
AND INV.INV_DT = COMP.DT
AND CONV.COMP_SALES_IND = COMP.COMP_SALES_IND
AND CONV.DT = COMP.DT
GROUP BY ....
PRIMARY INDEX ( LOCATION_SOURCE_REF_NUM ,DISTRICT_ID ,INV_DT ,
ITEM_SUBCLASS_ID ,ITEM_CLASS_ID ,REPLENISH_SUBTYPE_CD ,VENDOR_ID )
PARTITION BY RANGE_N(INV_DT BETWEEN
'2007-08-07' AND '2008-06-21' EACH INTERVAL '7' DAY ,
'2008-06-22' AND '2012-01-01' EACH INTERVAL '1' DAY ,
NO RANGE)
INDEX ( DISTRICT_ID ,INV_DT )
INDEX ( COMP_SALES_CD ,INV_DT )
INDEX ( DISTRICT_ID ,COMP_SALES_CD ,INV_DT );
  • Multi-Level AJI – Item Class/Location Region Level
CREATE JOIN INDEX RPT_LOC_INV_HST_AJI4
AS
SELECT COUNT(*)(FLOAT, NAMED CountStar )
,LR.REGION_ID
,LR.AREA_ID
,IC.ITEM_CLASS_ID
,IC.ITEM_DEPARTMENT_ID
,INV.INV_DT
,IXL.REPLENISH_TYPE_CD
,IXL.VENDOR_ID
,CONV.COMP_SALES_CD
....
FROM RPT_LOC_INV_HST INV ,RPT_LOCATION LOC ,
RPT_LOC_DISTRICT LD ,RPT_LOC_REGION LR
RPT_ITEM ITM ,RPT_ITEM_SUBCLASS ISC ,
RPT_ITEM_CLASS IC ,
RPT_ITEM_X_LOCATION_VENDOR IXL ,
RPT_COMP_LOCATION_DAY COMP ,
RPT_COMP_LOCATION_DAY_CONV CONV ,
WHERE INV.LOCATION_ID = LOC.LOCATION_ID
AND LOC.DISTRICT_ID = LD.DISTRICT_ID
AND LD.REGION_ID = LR.REGION_ID
AND INV.ITEM_ID = ITM.ITEM_ID
AND ITM.ITEM_SUBCLASS_ID = ISC.ITEM_SUBCLASS_ID
AND ISC.ITEM_CLASS_ID = IC.ITEM_CLASS_ID
AND INV.ITEM_ID = IXL.ITEM_ID
AND INV.LOCATION_ID = IXL.LOCATION_ID
AND INV.LOCATION_ID = COMP.LOCATION_ID
AND INV.INV_DT = COMP.DT
AND CONV.COMP_SALES_IND = COMP.COMP_SALES_IND
AND CONV.DT = COMP.DT
GROUP BY ....
PRIMARY INDEX ( REGION_ID ,AREA_ID ,ITEM_CLASS_ID ,
ITEM_DEPARTMENT_ID ,INV_DT ,REPLENISH_SUBTYPE_CD ,VENDOR_ID )
PARTITION BY RANGE_N(INV_DT BETWEEN
'2007-08-07' AND '2008-06-21' EACH INTERVAL '7' DAY ,
'2008-06-22' AND '2012-01-01' EACH INTERVAL '1' DAY ,
NO RANGE);

Results

  • Total elapsed time to build the AJI's
SALES
SALES_DETAIL_AJI1 - (262 Gig)
*** Index has been created.
*** Total elapsed time was 6 hours, 31 minutes and 59.99 seconds.
 
SALES_DETAIL_AJI2 - (30 Gig)
*** Index has been created.
*** Total elapsed time was 4 hours, 5 minutes and 33.13 seconds.
 
SALES_DETAIL_AJI3 - (116 Gig)
*** Index has been created.
*** Total elapsed time was one hour, 20 minutes and 16.04 seconds.
 
SALES_DETAIL_AJI4 - (10Gig)
*** Index has been created.
*** Total elapsed time was 12 minutes and 9.11 seconds.


INVENTORY
RPT_LOC_INV_HST_AJI2 - (106 Gig)
*** Index has been created.
*** Total elapsed time was 24 hours, 47 minutes and 39.99 seconds.
 
RPT_LOC_INV_HST_AJI3 - (702 Gig)
*** Index has been created.
*** Total elapsed time was 16 hours, 11 minutes and 51.39 seconds.
 
RPT_LOC_INV_HST_AJI4 - (10 Gig)
*** Index has been created.
*** Total elapsed time was 23 minutes and 41.99 seconds.
  • AJI's Hits Report from DBQLObjTbl table

  • CPU Buckets Report associated with all Cognos reports from DBQLogTbl table

  • Critical reports that were timing out before are now finishing inside of the SLG time window.


  • The total number of queries per month increased by a factor of 6 after the implementation of the AJI’s.
  • There was a significant increase on the overall monthly CPU utilization due to the additional load processes to maintain the AJI’s and the increase of the total number of queries per month.

11 Comments
Can you provide sample queries that were able to utilize the AJI's indicated?
Advantage of Aggregate Join Indexes is that it supports Multidimensional Businesses.

kingman arizona real estate
Can you please tell me how did you get AJI's Hits Report from DBQLObjTbl table?

If possible, please share the query. Thanks.
I am sorry but I can't provide sample queries. But the majority of the queries are coming from COGNOS.
Here is the query to generate AJI's Hits Report from the historical DBQLObjTbl table.

LOCK ROW FOR ACCESS
SELECT
CASE
WHEN LogDate BETWEEN DATE '2010-01-01' AND DATE '2010-01-31' THEN '2010-01'
WHEN LogDate BETWEEN DATE '2010-02-01' AND DATE '2010-02-28' THEN '2010-02'
WHEN LogDate BETWEEN DATE '2010-03-01' AND DATE '2010-03-31' THEN '2010-03'
WHEN LogDate BETWEEN DATE '2010-04-01' AND DATE '2010-04-30' THEN '2010-04'
WHEN LogDate BETWEEN DATE '2010-05-01' AND DATE '2010-05-31' THEN '2010-05'
WHEN LogDate BETWEEN DATE '2010-06-01' AND DATE '2010-06-30' THEN '2010-06'
WHEN LogDate BETWEEN DATE '2010-07-01' AND DATE '2010-07-31' THEN '2010-07'
WHEN LogDate BETWEEN DATE '2010-08-01' AND DATE '2010-08-31' THEN '2010-08'
WHEN LogDate BETWEEN DATE '2010-09-01' AND DATE '2010-09-30' THEN '2010-09'
WHEN LogDate BETWEEN DATE '2010-10-01' AND DATE '2010-10-31' THEN '2010-10'
WHEN LogDate BETWEEN DATE '2010-11-01' AND DATE '2010-11-30' THEN '2010-11'
WHEN LogDate BETWEEN DATE '2010-12-01' AND DATE '2010-12-31' THEN '2010-12'
END (CHAR(7)) themonth
,ObjectDatabaseName
,ObjectTableName
,SUM(FreqofUse) AS FreqofUse
FROM DATABASE.DBQLOBJTBL_HST
WHERE ObjectDatabaseName IN ( 'Database1','Database2' )
AND ObjectTableName IN
( 'RPT_LOC_INV_HST_AJI2'
,'RPT_LOC_INV_HST_AJI3'
,'RPT_LOC_INV_HST_AJI4'
,'SALES_DETAIL_AJI1'
,'SALES_DETAIL_AJI2'
,'SALES_DETAIL_AJI3'
,'SALES_DETAIL_AJI4'
)
AND LogDate BETWEEN DATE '2010-01-01' AND DATE '2010-12-31'
AND ObjectType = 'J'
GROUP BY 1,2,3
ORDER BY 1,2,3
;
mbw
N/A
Hi,

One question regarding the AJI maintenance process: Is there a way of disabling automatic triggering of the maintenance process and rather do the maintenance on demand? The reason I ask is that I have base tables that I need to update every hour which is causing the AJI maintenance process to start and causing some problems. I would rather like to start the maintenance process once every day.

Thanks,
Magnus
There is no option to disable AJI maintenance unless you drop the AJI before loading and recreate after the load is done.
Another option is to create a report/semantic layer and have the AJI defined on the reporting table instead of the base table.
Hi, thanks a lot for such queries it will help me lot .

for lease by owner
Enthusiast

Well explained article. Quick question though.

What if we use a function while aggregating. For instance SUM(ZEROIFNULL(<column_name>))

The optimizer seems to ignore the AJI.

The following uses the AJI

SELECT SUM(c4)

FROM

(SELECT (CASE    WHEN Equipment_Active_Status_Code <> 'D'

        AND  Category_Code > 99

        AND  Equipment_Type_Code IN('R' ,'S')

        AND  Equipment_Status_Code IN('O' ,'P')

        THEN Equipment_Cost_Amt * CG_CURRENCY_CONV_RATE.C_USD_Rate

        ELSE  0

    END) C4

.

This query is optimized using type 2 profile T2_Linux64, profileid 21.

  1) First, we lock JI_M_EQUIP_DAILY_FACT for access, we

     lock model_FIN_ORGANIZATION for access.

  2) Next, we execute the following steps in parallel.

.

The following does not use the AJI

SELECT SUM(ZEROIFNULL(c4))

FROM

(SELECT (CASE    WHEN Equipment_Active_Status_Code <> 'D'

        AND  Category_Code > 99

        AND  Equipment_Type_Code IN('R' ,'S')

        AND  Equipment_Status_Code IN('O' ,'P')

        THEN Equipment_Cost_Amt * CG_CURRENCY_CONV_RATE.C_USD_Rate

        ELSE  0

    END) C4

.

 This query is optimized using type 2 profile T2_Linux64, profileid 21.

  1) First, we lock M_EQUIPMENT_DAILY_FACT for access,

     we lock model_FIN_ORGANIZATION for access and we lock

     CURRENCY_CONV_RATE for access.

  2) Next, we execute the following steps in parallel.

.

I tried creating the AJI with the ZEROIFNULL for the column, but then the optimizer completely ignores the AJI for all the queries.

We are on TD 13.10.05.02.

Sanjeev

Enthusiast

If the JI has current_Date in it, what is the best way to keep it updated? Do I need to drop it and recreate the JI every single day? Can I incrementally update such a JI?

-Suhail

Enthusiast

Hello,

I would like to ask three questions about JI.

Let's assume that we have one fact table and two dimension.

We have implement everything as you said , RI on fact , unique PI on domains.

We just join these three tables   , on RI defined columns  , we aggregate on two column on fact table and we take 2 desc columns from each dimension.

My 1st question , since the spool will be enormous , by defining the accurate length of varchar() columns   , add the maximum MVC on them , is it possible to decrease the metrics (cpu/io/spool and so on)?

My second question regards if it is a good practice to define directly the possible indexes on the creation of the index or it is better after?

Ending , by setting a where clause on date , sparse join index , in order to limit the data , after creation all reports must contain a where clause on this column in order to access JI? Or , TD will understund it.

Thank you very much for your time.