I have a table with 4 columns, as below:
CREATE SET TABLE T1 (
    col1 DECIMAL(10,0) NOT NULL,
    col2 CHAR(5) NOT NULL,
    col3 DECIMAL(10) NOT NULL,
    col4 DATE FORMAT 'yyyy-mm-dd' DEFAULT DATE )
UNIQUE PRIMARY INDEX p1 (col1, col2, col3);
Row count: 3.5 billion; table size (before compression): 100 GB.
So I compressed this table with multi-value compression on col4 (the DATE column), which has a very large number of duplicate date values.
The problem is that after compression completed, the table size increased to 120 GB instead of shrinking, which seems unusual to me.
Can anyone help me understand why the table size increased after compression?
A quick response is appreciated.
According to your table definition, the size of an uncompressed record would be 40 bytes, so 3.5 billion rows amount to roughly 140 GB.
I think the row length would be 8 B + 5 B + 8 B + 4 B = 25 B (B denotes bytes).
DECIMAL(n,m) columns are stored by the Teradata database in 1, 2, 4, 8, or 16 bytes, depending on the number of digits of precision:
Number of Digits | Number of Bytes
-----------------|----------------
 1 to 2          | 1
 3 to 4          | 2
 5 to 9          | 4
10 to 18         | 8
19 to 38         | 16
DATE is stored as an integer, so its size is 4 bytes.
Please correct me if I am wrong.
25 bytes of data + 2-byte record length + 8-byte ROWID + 2 presence bytes + 2 bytes in the block's pointer array = 39 bytes.
Rows are stored byte-aligned, so: 40 bytes.
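The arithmetic above can be sketched as a quick check (a rough model using the decimal-storage table and the per-row overheads quoted in this thread; the exact overheads can vary by release):

```python
# Bytes Teradata uses to store DECIMAL(n,m), keyed by total digits n
# (per the storage table quoted above).
def decimal_bytes(digits):
    for max_digits, nbytes in [(2, 1), (4, 2), (9, 4), (18, 8), (38, 16)]:
        if digits <= max_digits:
            return nbytes
    raise ValueError("DECIMAL supports at most 38 digits")

# Column sizes for T1: DECIMAL(10), CHAR(5), DECIMAL(10), DATE (4-byte integer)
data_bytes = decimal_bytes(10) + 5 + decimal_bytes(10) + 4   # 8 + 5 + 8 + 4 = 25

# Per-row overhead from the post: 2-byte record length + 8-byte ROWID
# + 2 presence bytes + 2 bytes in the block's pointer array.
row_bytes = data_bytes + 2 + 8 + 2 + 2                       # 39
row_bytes += row_bytes % 2                                   # even-byte alignment -> 40

rows = 3_500_000_000
print(row_bytes)                  # 40
print(rows * row_bytes / 1e9)     # 140.0 (GB, decimal)
```

This reproduces Dieter's 40-byte row and the ~140 GB uncompressed estimate, which is why the 100 GB "before" figure in the question looks suspect.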
Thank you for your efforts, but I still don't think I have the answer.
My question was: why has the table size increased after compression, when it should have been reduced?
Any idea why?
As Dieter said, the actual uncompressed size of the table works out to about 140 GB, so the table size has in effect been reduced to 120 GB by compression.
No, the table size was 100 GB before compression, but it has grown to 120 GB after compression.
I used the query below to find the table size before and after compression:

SELECT DatabaseName
      ,TableName
      ,SUM(CurrentPerm)/1024/1024 AS Tablesize_mb
      ,SUM(PeakPerm)/1024/1024 AS Peak
FROM DBC.TableSize
WHERE DatabaseName IN ('dbname')
  AND TableName IN ('tablename')
GROUP BY 1,2
ORDER BY 1,2;
Compression significantly reduces the size of the table, but beyond a threshold the space savings are negated, because both the row header and the table header increase in size.
That depends upon the number of values being compressed. For details, see Chapter 15, "Database-Level Capacity Planning," in the TD 12 Database Design manual, which shows how presence bits are added to the row header as you compress more values in a record.
For quick reference, here is the text:
Consider the following example. Suppose you have a table with these row characteristics:
• 14-byte row header
• 4-byte nullable nonunique primary index column
• 2-byte nullable SMALLINT non-index data column
The total number of bytes for this row is 20.
Because the primary index and SMALLINT columns are both nullable in this scenario, each
uses a null presence bit in the row header, so 5 unused presence bits remain in the default presence byte.
Suppose you decide to compress 63 distinct values in the SMALLINT column. This requires
an additional 6 presence bits (see “Presence Bits” on page 864 and “Number of Presence Bits
Required to Represent Compressed Values” on page 868), rolling over into a new presence bits
octet. The row header is now 15 bytes wide, where it was previously only 14 bytes, but when a
row contains a compressed value for the SMALLINT column, it is 15 + 4 = 19 bytes wide, an
apparent savings of 1 byte for each such row in the table.
Upon further analysis, you realize that all the rows you thought were 19 bytes wide are actually
20 bytes wide, so no savings are accrued by compression. The reason the rows expanded from
19 to 20 bytes is the system-enforced even-byte row alignment: the system added a 20th filler
byte to the row to ensure an even offset.
Suppose that instead of compressing 63 distinct values in the SMALLINT column, you
compress only 31. In this case, there is no need to roll over to a second presence octet, so many
rows compress to 18 bytes.
You could also make the primary index non-nullable (the recommended practice anyway),
which also removes the need to roll over to a second presence octet. In this case, all rows can
compress to 18 bytes.
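The manual's worked example can be checked numerically. This is a sketch, assuming the presence-bit count for a compressed column is ceil(log2(values + 1)) and using the 14-byte header, 4-byte index column, 2-byte SMALLINT, and 5 free presence bits from the excerpt above:

```python
import math

def presence_bits(num_compressed_values):
    # Bits needed to index the compressed values plus the
    # "value not compressed" state: 63 values -> 6 bits, 31 -> 5 bits.
    return math.ceil(math.log2(num_compressed_values + 1))

def row_width(compressed_values, header=14, pi_bytes=4, col_bytes=2,
              free_bits=5, value_compressed=True):
    # Presence bits beyond the free bits roll over into whole extra
    # bytes in the row header.
    extra_bits = max(0, presence_bits(compressed_values) - free_bits)
    header += math.ceil(extra_bits / 8)
    width = header + pi_bytes + (0 if value_compressed else col_bytes)
    width += width % 2            # system-enforced even-byte alignment
    return width

print(row_width(63))   # 20 -> no saving vs. the original 20-byte row
print(row_width(31))   # 18 -> 2 bytes saved per compressed row
```

With 63 values the header rolls over to 15 bytes and alignment pads the row back to 20 bytes, so nothing is saved; with 31 values the bits fit in the free byte and compressed rows shrink to 18 bytes, exactly as the manual describes.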
Thank you so much for your input, Manik.
I have one more doubt: compression also improves performance, right? Does that apply in this case (my case above, where the size increased after compression)?
That is, will a performance gain still be achieved for the compressed table even though its size increased?
Compression is primarily for space reduction, but it brings a performance benefit as well, because more rows fit into a data block and fewer data blocks need to be read to retrieve the same rows.
But if compression is adding space, then I don't think there is any benefit.
I would rather not compress values beyond the threshold where compression actually starts adding space.
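That threshold can be estimated with a rough back-of-the-envelope model (a sketch only; it ignores byte alignment and table-header growth, and the 5 free presence bits are the figure from the manual excerpt above):

```python
import math

def net_bytes_per_row(col_bytes, num_values, hit_ratio, free_bits=5):
    # Rough per-row space change from compressing num_values distinct
    # values of a col_bytes-wide column, where hit_ratio is the fraction
    # of rows whose value is in the compressed list. Header growth from
    # presence-bit rollover is charged to every row.
    bits = math.ceil(math.log2(num_values + 1))
    header_growth = math.ceil(max(0, bits - free_bits) / 8)
    return hit_ratio * col_bytes - header_growth

# A 4-byte DATE column, compressing a 255-value list:
print(net_bytes_per_row(4, 255, 0.9))   # positive -> still saves space
print(net_bytes_per_row(4, 255, 0.1))   # negative -> compression adds space
```

The sign of the result is the point: a long compression list only pays off when the listed values cover enough of the rows to outweigh the extra presence-bit bytes, which is one way a table can grow after compression.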