Query on Big Table sized approx 1 TB

Teradata Applications
Enthusiast

Query on Big Table sized approx 1 TB

HI,

 I have a very big fact table named F_BASE_SALES_TRANS_LINE and the size as aprrox 1 TB. This table contains detailed dat at

calendar_date, ITem_id, Store_num, TRANS_line_id... You can say the smallest granularity of each transaction at Retail store.

  Table structure is below..

CREATE MULTISET TABLE F_BASE_SALES_TRANS_LINE ,NO FALLBACK ,

NO BEFORE JOURNAL,

NO AFTER JOURNAL,

CHECKSUM = DEFAULT,

DEFAULT MERGEBLOCKRATIO

(

SK_SALES_TRANS_LINE_ID DECIMAL(18,0) NOT NULL,

BK_ITEM_ID DECIMAL(18,0),

BK_STORE_ID INTEGER,

SK_SALES_TRANS_ID DECIMAL(18,0),

SK_POS_REG_ID DECIMAL(18,0),

SK_ITEM_GRP_PLAN_ID DECIMAL(18,0) COMPRESS -1. ,

SK_POS_ENTER_MTH_ID DECIMAL(18,0) COMPRESS (0. ,1. ,2. ,3. ,-1. ),

SK_UNIT_OF_MEASURE_ID DECIMAL(18,0) COMPRESS (18. ,33. ),

SK_TIME_DAY_ID INTEGER,

SK_TIME_DAY_LY_ID INTEGER,

SK_TIME_SECOND_ID CHAR(6) CHARACTER SET LATIN NOT CASESPECIFIC,

SK_PERIOD_ID INTEGER,

SK_PERIOD_LY_ID INTEGER,

SALES_TRANS_LINE_NUM INTEGER,

ID_NUM_ID DECIMAL(18,0),

ID_NUM_FORMAT_CD VARCHAR(30) CHARACTER SET LATIN NOT CASESPECIFIC COMPRESS ('PLU','VGR','DSK','KIWI','*'),

POS_SCAN_CODE_NUM VARCHAR(30) CHARACTER SET LATIN NOT CASESPECIFIC,

POS_PROMO_IND CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC COMPRESS ('0','1'),

SK_PROMO_OFFER_ID DECIMAL(18,0) COMPRESS -1. ,

PROMO_STATUS_CD BYTEINT COMPRESS (0 ,1 ,2 ,3 ,-1 ),

ITEM_DESC VARCHAR(250) CHARACTER SET LATIN NOT CASESPECIFIC,

SALES_LINE_ID_TYPE_CD BYTEINT COMPRESS (0 ,1 ,2 ,-1 ),

SK_DISC_TYPE_ID DECIMAL(18,0) COMPRESS -1. ,

SALES_TRANS_LINE_DTTM TIMESTAMP(6),

ITEM_QTY DECIMAL(18,4) COMPRESS (1.0000 ,2.0000 ,3.0000 ,4.0000 ,5.0000 ,6.0000 ,7.0000 ,8.0000 ,9.0000 ,10.0000 ,-1.0000 ),

ITEM_WEIGHT DECIMAL(18,4) COMPRESS (0.0000 ,1.0000 ,2.0000 ,50.0000 ,42.5000 ,20.0000 ,4.0000 ,5.0000 ,40.0000 ,10.0000 ,-1.0000 ),

LINE_AMT DECIMAL(18,4),

LINE_VAT_AMT DECIMAL(18,4),

UNIT_LIST_PRICE_AMT DECIMAL(18,4),

COST_AMT DECIMAL(18,4),

LINE_DISC_AMT DECIMAL(18,4) COMPRESS 0.0000 ,

VAL_STATUS_CD VARCHAR(30) CHARACTER SET LATIN NOT CASESPECIFIC COMPRESS ('VA','VM','NY','CM_ST_INV','SA','MA','CM','CA'),

SYS_VALIDATED_DTTM TIMESTAMP(6),

SYS_LAST_UPDATED_DTTM TIMESTAMP(6),

SYS_SRC_TABLE_NAME VARCHAR(64) CHARACTER SET LATIN NOT CASESPECIFIC COMPRESS 'ARC_KIW_TRANSSALESENTRY',

SYS_LOAD_DTTM TIMESTAMP(6),

SYS_LOAD_BATCH_NUM INTEGER,

SYS_LOAD_DF_NAME VARCHAR(64) CHARACTER SET LATIN NOT CASESPECIFIC COMPRESS ('Sales_Trans_Line_From_STA_To_STA_C#POS_EAN','Sales_Trans_Line_From_STA_To_STA_C#POS_EAN#POS_EAN','Sales_Trans_Line_From_STA_To_STA_C','Sales_Trans_Line_HISTORY_From_STA_To_STA_C_3#POS_EAN','Sales_Trans_Line_HISTORY_From_STA_To_STA_C#POS_EAN','Sales_Trans_Line_From_STA_To_STA_C_3','Sales_Trans_Line_HISTORY_From_STA_To_STA_C_3','Sales_Trans_Line_HISTORY_From_STA_To_STA_C','SP_LOAD_SEED_TABLES'),

SYS_LOAD_WF_NAME VARCHAR(64) CHARACTER SET LATIN NOT CASESPECIFIC COMPRESS ('Sales_Trans_Line_HISTORY_To_COR','Sales_Trans_Line_To_COR','Sales_Trans_Line_HISTORY_To_COR_KIW2011','SP_LOAD_SEED_TABLES'),

SYS_LOAD_JOB_NAME VARCHAR(64) CHARACTER SET LATIN NOT CASESPECIFIC COMPRESS ('Sales_And_Payment_From_Hist_To_COR_Job','Sales_And_Payment_From_ARC_To_COR_Job','Ad_Hoc_Job','SP_LOAD_SEED_TABLES'),

CONSTRAINT PK_SALES_TRANS_LINE PRIMARY KEY ( SK_SALES_TRANS_LINE_ID ))

PRIMARY INDEX ( SK_SALES_TRANS_LINE_ID )

PARTITION BY RANGE_N(SK_TIME_DAY_ID BETWEEN 20100101 AND 20100131 EACH 1 ,

20100201 AND 20100228 EACH 1 ,

20100301 AND 20100331 EACH 1 ,

20100401 AND 20100430 EACH 1 ,

20100501 AND 20100531 EACH 1 ,

20100601 AND 20100630 EACH 1 ,

20100701 AND 20100731 EACH 1 ,

20100801 AND 20100831 EACH 1 ,

20100901 AND 20100930 EACH 1 ,

20101001 AND 20101031 EACH 1 ,

20101101 AND 20101130 EACH 1 ,

20101201 AND 20101231 EACH 1 ,

20110101 AND 20110131 EACH 1 ,

20110201 AND 20110228 EACH 1 ,

20110301 AND 20110331 EACH 1 ,

20110401 AND 20110430 EACH 1 ,

20110501 AND 20110531 EACH 1 ,

20110601 AND 20110630 EACH 1 ,

20110701 AND 20110731 EACH 1 ,

20110801 AND 20110831 EACH 1 ,

20110901 AND 20110930 EACH 1 ,

20111001 AND 20111031 EACH 1 ,

20111101 AND 20111130 EACH 1 ,

20111201 AND 20111231 EACH 1 ,

20120101 AND 20120131 EACH 1 ,

20120201 AND 20120229 EACH 1 ,

20120301 AND 20120331 EACH 1 ,

20120401 AND 20120430 EACH 1 ,

20120501 AND 20120531 EACH 1 ,

20120601 AND 20120630 EACH 1 ,

20120701 AND 20120731 EACH 1 ,

20120801 AND 20120831 EACH 1 ,

20120901 AND 20120930 EACH 1 ,

20121001 AND 20121031 EACH 1 ,

20121101 AND 20121130 EACH 1 ,

20121201 AND 20121231 EACH 1 ,

20130101 AND 20130131 EACH 1 ,

20130201 AND 20130228 EACH 1 ,

20130301 AND 20130331 EACH 1 ,

20130401 AND 20130430 EACH 1 ,

20130501 AND 20130531 EACH 1 ,

20130601 AND 20130630 EACH 1 ,

20130701 AND 20130731 EACH 1 ,

20130801 AND 20130831 EACH 1 ,

20130901 AND 20130930 EACH 1 ,

20131001 AND 20131031 EACH 1 ,

20131101 AND 20131130 EACH 1 ,

20131201 AND 20131231 EACH 1 ,

20140101 AND 20140131 EACH 1 ,

20140201 AND 20140228 EACH 1 ,

20140301 AND 20140331 EACH 1 ,

20140401 AND 20140430 EACH 1 ,

20140501 AND 20140531 EACH 1 ,

20140601 AND 20140630 EACH 1 ,

20140701 AND 20140731 EACH 1 ,

20140801 AND 20140831 EACH 1 ,

20140901 AND 20140930 EACH 1 ,

20141001 AND 20141031 EACH 1 ,

20141101 AND 20141130 EACH 1 ,

20141201 AND 20141231 EACH 1 ,

20150101 AND 20150131 EACH 1 ,

20150201 AND 20150228 EACH 1 ,

20150301 AND 20150331 EACH 1 ,

20150401 AND 20150430 EACH 1 ,

20150501 AND 20150531 EACH 1 ,

20150601 AND 20150630 EACH 1 ,

20150701 AND 20150731 EACH 1 ,

20150801 AND 20150831 EACH 1 ,

20150901 AND 20150930 EACH 1 ,

20151001 AND 20151031 EACH 1 ,

20151101 AND 20151130 EACH 1 ,

20151201 AND 20151231 EACH 1 ,

NO RANGE, UNKNOWN);

The table contains data from 1st JAN 2010 to till date and as per our need we have to maintain data from 2010.

The problem is when i am joining this big table with other tables, it always consumes hell lot of time. Because it always not possible to join based on the primary index and use the partition. So the queries normally apply few filter condition on this table and takes out huge data into spool and performs operation.

  Please suggest anyway to improve the performance of the query which uses this table.

1 REPLY
Teradata Employee

Re: Query on Big Table sized approx 1 TB

First of all, it needs to be analyzed before concluding what needs to be done.

Initial thoughts drag me to ask .... if stats are collected?

The main idea is to minimize two things .... one, replication in general .... and two, replication of a bigger table.

Can you share the columns most frequently used, stats collected or not, joining tables replicated or not, .... etc?