SQL to query 3NF "Traited" Dimension

Database

SQL to query 3NF "Traited" Dimension

Hello experts

We are designing a dimension of ITEMS (in a Point of sales DWH) as a table with Name-Value Pairs i.e. ITEM attributes stored as ROWS instead of COLUMNS.This ITEM table is joined to a fact table containing SALES $ and QUANTITY.

CREATE TABLE ITEM_TRAIT
(ITEM_KEY INTEGER NOT NULL,
TRAIT_CODE CHAR(10),
TRAIT_VALUE VARCHAR(100))
unique primary index (ITEM_KEY,TRAIT_CODE);

collect stats BAA_D_R2W_QA_WKY_SEM_N_01. ITEM_TRAIT column(item_key);
collect stats BAA_D_R2W_QA_WKY_SEM_N_01. ITEM_TRAIT column(trait_code);
collect stats BAA_D_R2W_QA_WKY_SEM_N_01. ITEM_TRAIT column(item_key,trait_code);
INSERT into ITEM_TRAIT Values('111','BRAND','SHAMPOO');
INSERT into ITEM_TRAIT Values('111','DESC','Shampoo 16OZ');

++++SQL to query ITEM/FACT+++
SELECT
CAL.WEEK AS WEEK,
(case when item.trait_code='BRAND' then trait_value end ) AS BRAND,
SUM(FACT.POSSLS) AS POSSALES,
SUM(FACT.POSQTY) AS POSQTY
FROM
FACT,
CALENDAR CAL,
ITEM_TRAIT ITEM
WHERE
FACT.PERIOD_KEY = CAL.PERIOD_KEY
AND FACT.ITEM_KEY=ITEM.ITEM_KEY
AND CAL.PERIOD_KEY between 20100101 and 20100228
AND POSSLS <> 0
GROUP BY
1,2
++++++++++++++++++++++++++
  • So for 10000 ITEMS each with 30 attributes,this table contains 90K rows.
  • Typical queries ask for sales/quantity (from fact) grouped by BRAND for a time period
  1. How does the optimiser know that the ITEM table contains repeated ITEM_KEY?
  2. When ITEM table joins to FACT,the optimiser does not seem to "know" that BRAND is an attribute,unlike a normal dimension and seems to incur a lot more I/O ...This is compared to a similar SQL to a "normal" ITEM table and the response time  and I/O are significantly less.

TIA