Compression (MVC): what happens when too many distinct values are inserted

Database
Enthusiast

Compression (MVC): what happens when too many distinct values are inserted

If I define COMPRESS (1,2,3,4,5) on a column in Teradata and at some point the data changes in an unforseen manner resulting in e.g.300 different values being inserted during future loads - what happens at that point in time? Will the load fail? Will the compression be disabled, or just applied partially?

 

What with the value 6 being inserted? Same story?

 

Teradata version is 13.10

 

thanks for the help!


Accepted Solutions
Senior Apprentice

Re: Compression (MVC): what happens when too many distinct values are inserted

Hi,

 

No, I wouldn't say that.

 

If you specify COMPRESS by itself then the database will only compress NULLs. No non-null data values will be compressed.

 

With MVC you have to specify the non-null data values that you want to be compressed. As soon as you compress anything then NULLs will also be compressed.

 

Typically you would specify the most commonly occurring data values in your COMPRESS list. Other values will not be compressed.

 

Over time you might re-visit the list of values, particularly if the data demographics have significantly changed for that column.

 

A lot of the time you're using MVC on code columns (status_code, order_code etc.) or maybe a date column but only compressing certain application specific values (9999-12-31 comes to mind). You mght also compress columns used ot join to reference tables, because those tend to have very few values (perhaps think of a 'sales region identifier' in a sales table).

 

Different industries have different columns that commonly can be compressed effectively.

 

Cheers,

Dave

Ward Analytics Ltd - information in motion
www: http://www.ward-analytics.com
1 ACCEPTED SOLUTION
6 REPLIES
Senior Apprentice

Re: Compression (MVC): what happens when too many distinct values are inserted

Hi,

 

Any values not in your list will be stored (they won't fail) but they will not be compressed.

 

Cheers,

Dave

Ward Analytics Ltd - information in motion
www: http://www.ward-analytics.com
Enthusiast

Re: Compression (MVC): what happens when too many distinct values are inserted

So that means that the rest of the rows will still benefit from compression?

And this goes in both cases?

COMPRESS (1,2,3,4,5)  ==> a value 6 is inserted, all the others are still compressed

 

COMPRESS ==> more than 255 different values are inserted ==> the first 255 values are still compressed

 

?

 

Thanks!

Senior Apprentice

Re: Compression (MVC): what happens when too many distinct values are inserted

Yes. Even if you only compress on values 1,2,3,4,5 when you insert value 6 it does not get compressed but values 1,2,3,4,5 are still compressed.

 

It doesn't matter how many values are 'not compressed' all the values in your compression list will be compressed and will stay compressed.

 

The only time a compressed value will be uncompressed in the data row is if you remove it from the compress list.

 

Does that help?

 

Cheers

Dave

Ward Analytics Ltd - information in motion
www: http://www.ward-analytics.com
Enthusiast

Re: Compression (MVC): what happens when too many distinct values are inserted

yes, this helped a lot already, just a final question...

 

so from this I gather that it is in most cases better to just specify COMPRESS instead of COMPRESS(1,2,3,4,5) unless -REALLY- sure that no further values will occur?

Senior Apprentice

Re: Compression (MVC): what happens when too many distinct values are inserted

Hi,

 

No, I wouldn't say that.

 

If you specify COMPRESS by itself then the database will only compress NULLs. No non-null data values will be compressed.

 

With MVC you have to specify the non-null data values that you want to be compressed. As soon as you compress anything then NULLs will also be compressed.

 

Typically you would specify the most commonly occurring data values in your COMPRESS list. Other values will not be compressed.

 

Over time you might re-visit the list of values, particularly if the data demographics have significantly changed for that column.

 

A lot of the time you're using MVC on code columns (status_code, order_code etc.) or maybe a date column but only compressing certain application specific values (9999-12-31 comes to mind). You mght also compress columns used ot join to reference tables, because those tend to have very few values (perhaps think of a 'sales region identifier' in a sales table).

 

Different industries have different columns that commonly can be compressed effectively.

 

Cheers,

Dave

Ward Analytics Ltd - information in motion
www: http://www.ward-analytics.com
Enthusiast

Re: Compression (MVC): what happens when too many distinct values are inserted

A-ha! This is very relevant information, I was assuming a different logic for COMPRESS the whole day. You really helped me out a lot. Thanks!