High Skewness

Database
Enthusiast

High Skewness

Hi

Im trying to execute this query ,but it is being punished by the workload every time with high skewness

INSERT INTO sample_tbl

SEL

* FROM product_region_1

WHERE (prod_cd,ord_date,prod_name)

NOT IN (SELECT prod_cd,ord_date,prod_name FROM product_region_2)

collected stats on  product_region_1 for columns  prod_cd,ord_date,prod_name .

but is not executing ,it having high skew .

i had verified the skew on both the tables are 1.63%

any one can suggest me on the same how to avoid skew . Its quite urgent any one please help me

14 REPLIES
Teradata Employee

Re: High Skewness

Can you attach DDL for the two tables?

Enthusiast

Re: High Skewness

any one please comment its urgent please help me

Enthusiast

Re: High Skewness

CREATE TABLE sample_tbl

(

ord_date DATE FORMAT 'yyyy-mm-dd',

ord_TIME TIME(0),

cust_num INTEGER,

contact VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC,

prod_code VARCHAR(20) CHARACTER SET LATIN NOT CASESPECIFIC,

quantity INTEGER

)PRIMARY INDEX(ord_date,ord_TIME,cust_num,contact);

CREATE TABLE product_region_1

(

ord_date DATE FORMAT 'yyyy-mm-dd',

ord_TIME TIME(0),

cust_num INTEGER,

contact VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC,

prod_code VARCHAR(20) CHARACTER SET LATIN NOT CASESPECIFIC,

quantity INTEGER

)PRIMARY INDEX(ord_date,ord_TIME,cust_num,contact,quantity);

CREATE TABLE product_region_2

(

ord_date DATE FORMAT 'yyyy-mm-dd',

ord_TIME TIME(0),

cust_num INTEGER,

contact VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC,

prod_code VARCHAR(20) CHARACTER SET LATIN NOT CASESPECIFIC,

quantity INTEGER

)PRIMARY INDEX(ord_date,ord_TIME,cust_num,contact);

Enthusiast

Re: High Skewness

changing the query

INSERT INTO sample_tbl

SEL

* FROM product_region_1

WHERE (prod_code,ord_date,contact)

NOT IN (SELECT prod_code,ord_date,contact FROM product_region_2)

Enthusiast

Re: High Skewness

Those are the tables and the query .

Enthusiast

Re: High Skewness

@Woddy : Would you please comment

Enthusiast

Re: High Skewness

DDLs do not seem to have Prod Name. Please post DDLs again.

ALso if you have Contact ,I don't think you require Cust_name in primary index. Try and make primary index same as join columns and change query to below .

Collect stats on product_region_1 A column  (prod_cd,ord_date,prod_name) ;

Collect stats on product_region_2 A column  (prod_cd,ord_date,prod_name) ;

INSERT INTO sample_tbl

SEL A.* FROM product_region_1 A

LEFT OUTER JOIN product_region_2

ON

A.PROD_CD=B.PROD_CD

AND A.ORD_DATE=B.ORD_DATE

AND A.PROD_NAME=B.PROD_NAME

where

B.prod_cd is null

;

Also try and just xecute the select clause.If that works fine ,Issue might be with index of Sample_tnl.

Enthusiast

Re: High Skewness

^ you will have to add coalesce if you are expecting nulls for any of the join columns and may be change where clause accordingly.

Re: High Skewness

Sunny,

Try adding the column 'quantity' on the PI of the table u r inserting into - 'sample_tbl'.

The only reason I can think for the high skew is that the rows that you are selecting from 'product_region_1' that are not on 'product-region_2' have varying number of duplicates on (prod_code,ord_date,contact).

The 'product_region_1' table has the column 'quantity' in its PI and thus may not be skewed. But the 'sample_tbl' is missing that in its PI.