Issue with table skew being high even with unique PI

Database
Enthusiast

Issue with table skew being high even with unique PI

I have a MULTISET table with a PI column which is of INTEGER datatype (single column PI only). This column has system generated values that are unique (like a surrogate key) (using ROW_NUMBER() OVER (ORDER bY 1)).  With the PI column value being unique, I still note the table to have high skew value of 74% with the table having a  row count of 20 million rows .

I understand that having unique PI will distribute the tables rows evenly across all AMPs and the more unique the PI value is the skew will be better ( < 1) . With generated surrogate key also being unique, why is it that the table skew is still high. Please share your thoughts around this.

3 REPLIES
Enthusiast

Re: Issue with table skew being high even with unique PI

Hello,

To be honest you manage to confused me .

You mentioned system generated values but then you say that you use row_number()......

You mentioned that the table is Multiset but you have not mentioned if you have a UPI or PI ?

Can you provide us the result of

sel pi_clm,count(*) from table

group by 1

having count(*) >1;

Then , can you post the number of the Amp's ? SELECT HASHAMP()+1 ;

and the sel count(*) from table;

Thanks.

Enthusiast

Re: Issue with table skew being high even with unique PI

It is a MULTISET table with PI. We generated the surrogate key with row_number() OVER (ORDER BY 1).

Also the select pi_column, count(*) always gives 1 as the count for each PI value and there were none with count(*) > 1.

Junior Contributor

Re: Issue with table skew being high even with unique PI

How did you calculate the skew, based on HASHAMP(HASHBUCKET(HASHROW(PI))) or dbc.TableSizeV?

Can you show the actual query?