Multi-Value compression has increased the table size

Database
Enthusiast

Multi-Value compression has increased the table size

Hi ,

i have a table with 4 columns like below:

create set table T1 ( col1 decimal(10,0) NOTNULL, col2 char(5) NOTNULL,

                      col3 decimal(10) notnull , col4 Date FORMAT 'yyyy-mm-dd' default date

UNIQUE PRIMARY INDEX p1 (col1,col2,col3);

row count : 3.5 billion, tablesize (before compression): 100GB

so i have compressed this table with multi-value compression on column col4 (Date) datatype, which has numerous number of duplicate value dates.

but the problem here is after compression is completed the table size has increased to 120GB ,instead of reducing the size, i feel it as unusual ..

can anyone help me why the tablesize has increased after the compression?

your soon response is appriciated..

Thank you..

19 REPLIES
Junior Contributor

Re: Multi-Value compression has increased the table size

According to you table definition the size of an uncompressed record would be 40 byte, so 3,5 billion rows calculate to 140GB.

Dieter

Enthusiast

Re: Multi-Value compression has increased the table size

Hi,

I think the row length would be 8B+5B+8B+4B=25 B. B denoted Byte.

DECIMAL(n,m) columns are stored by the Teradata database in 1, 2, 4, 8, or 16 bytes, depending on the value of n:

Number of Digits Number of Bytes
1 to 2 1
3 to 4 2
5 to 9 4
10 to 18 8
19 to 38 16

Date stores as integer so size is 4byte.

Please correct me if i am wrong.

Regards,

AS

Junior Contributor

Re: Multi-Value compression has increased the table size

25 Bytes of data + 2 Byte record length + 8 Byte ROWID + 2 Presence Bytes + 2 Bytes in the pointer array = 39 Bytes

Rows are stored byte-aligned: 40 Bytes.

Dieter

Enthusiast

Re: Multi-Value compression has increased the table size

Hi ,

Thank you for your efforts, but still i think i didn't get the right answer..

my question was.... why the table size is increased aftre compression is performed, instead it's size should redused..

any idea why?

Enthusiast

Re: Multi-Value compression has increased the table size

Hey Sreedhar,

As Dieter said. The actual size of table was 150 GB so the table size has reduced to 120 GB after compression.

Thanks

Manik

Enthusiast

Re: Multi-Value compression has increased the table size


hi ,



No, the table size was 100GB before compression, but its grown to 120 GB after compression.



i used the below query to find table size before and aftre compression.



SELECT DatabaseName,tablename,



SUM(CurrentPerm)/1024/1024 AS Tablesize_mb



,SUM(PeakPerm)/1024/1024 AS Peak



FROM DBC.tablesize



WHERE DatabaseName IN (dbname') and tablename in('tablename')



GROUP BY 1,2



ORDER BY 1,2



;


Enthusiast

Re: Multi-Value compression has increased the table size


Sree,



Compression although significantly reduces the size of the table but beyond a threshold value the space savings are negated as Both the Row  header and Table header increases in size.



 



Thats depends upon the no of valus being compressed. You can go in detail by going through chapter 15 Database Level capacity planning of the TD 12 Database Design book. which demonstrates as how the Presence bits are added to the Row header as you go on compressing more value in the record.



 



Just for quick reference below is the text:



 



Consider the following example. Suppose you have a table with these row characteristics:

• 14-byte row header

• 4-byte nullable nonunique primary index column

• 2-byte nullable SMALLINT non-index data column

The total number of bytes for this row is 20.

Because the primary index and SMALLINT columns are both nullable in this scenario, each

uses a null presence bit in the row header, so 5 unused presence bits remain in the default

presence octet.

Suppose you decide to compress 63 distinct values in the SMALLINT column. This requires

an additional 6 presence bits (see “Presence Bits” on page 864 and “Number of Presence Bits

Required to Represent Compressed Values” on page 868), rolling over into a new presence bits

octet. The row header is now 15 bytes wide, where it was previously only 14 bytes, but when a

row contains a compressed value for the SMALLINT column, it is 15 + 4 = 19 bytes wide, an

apparent savings of 1 byte for each such row in the table.

Upon further analysis, you realize that all the rows you thought were 19 bytes wide are actually

20 bytes wide, so no savings are accrued by compression. The reason the rows expanded from

19 to 20 bytes is the system-enforced even-byte row alignment: the system added a 20th filler

byte to the row to ensure an even offset.

Suppose that instead of compressing 63 distinct values in the SMALLINT column, you

compress only 31. In this case, there is no need to roll over to a second presence octet, so many

rows compress to 18 bytes.

You could also make the primary index non-nullable (the recommended practice anyway),

which also removes the need to roll over to a second presence octet. In this case, all rows can

compress to 18 bytes.



 



 



Manik


Enthusiast

Re: Multi-Value compression has increased the table size

Thank you so much for your input Manik,

i have one more doubt, Compression also improves the performance right? in this case (my above case where the size is increased after compression) does this applies?..

i.e.. is the perfomance gain will be achived for the compressed table even though the size is increased?

Enthusiast

Re: Multi-Value compression has increased the table size

Compression is primarily for space reduction and it has a performance upgrade also because more rows can fit into a Data Block and fewer data blocks are read to read more rows.

But if Compression is adding space then I don't think there is any benefit.

I would rather not compress the values beyond the threshold where it actually starts adding space.

Thanks

Manik