Data distribution between PI and No Index columns...

Database
l_k
Enthusiast

Data distribution between PI and No Index columns...

Hi All,

I have two tables and the DDL of those are given below:

1) Transaction table:

CREATE SET TABLE sample.T0611_ELECTRONIC_ADDRESS ,NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT
(
Electronic_Address_Id DECIMAL(18,0) NOT NULL,
Electronic_Address_Txt VARCHAR(1000) CHARACTER SET LATIN NOT CASESPECIFIC,
Electronic_Address_Type_Cd CHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC COMPRESS ,
Electronic_Address_Domain_Name VARCHAR(100) CHARACTER SET LATIN NOT CASESPECIFIC,
Domain_Root_Cd CHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC COMPRESS ,
Start_Dt DATE FORMAT 'YYYY-MM-DD' COMPRESS ,
End_Dt DATE FORMAT 'YYYY-MM-DD' COMPRESS ,
Data_Source_Cd CHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC COMPRESS ,
Record_Deleted_Flag BYTEINT NOT NULL DEFAULT 0 ,
BusinessDate DATE FORMAT 'YYYY-MM-DD' COMPRESS ,
Ins_Txf_BatchID INTEGER COMPRESS ,
Upd_Txf_BatchID INTEGER COMPRESS )
PRIMARY INDEX ( Electronic_Address_Id );

2) Dim(master) table:

CREATE SET TABLE sample.DATA_SOURCES ,NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT
(
S_NO INTEGER,
DATA_SOURCE_CD CHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC,
DESCRIPTION VARCHAR(255) CHARACTER SET LATIN NOT CASESPECIFIC,
START_DT DATE FORMAT 'YY/MM/DD',
END_DT DATE FORMAT 'YY/MM/DD')
PRIMARY INDEX ( DATA_SOURCE_CD );

The problem is,In all the INNER JOIN ,we are using the following condition :

ON Coalesce(TGT.DATA_SOURCE_CD,'') = Coalesce(FIL.DATA_SOURCE_CD,'')
WHERE TGT.End_Dt IS NULL

1) In the transaction table,DATA_SOURCE_CD is not an index column and in the DIM table DATA_SOURCE_CD is a PI.
2) Mostly only NULL values have been compressed in the transaction table for the column DATA_SOURCE_CD
3) In the above WHERE clause condition,END_dt value is null for all the rows in the transaction table.

I need the following clarifications to optimize all the queires when we use the above INNER join condition.

1) If i introduce the column DATA_SOURCE_CD as a PI in the transaction table,whether the performance will be improved or not?
2) There is NO null value in the DATA_SOURCE_CD in the transaction table.So,shall i remove the coalesce condition in the
INNER Join condition? (if i remove coalesce,whether performance will improve?)
3) END_DT is in WHERE clause.If i create any INDEX will improve the performance?

If you have any other suggestions/thoughts also are welcome..