CHAR-COMPRESS vs VARCHAR datatype

Database
Enthusiast

CHAR-COMPRESS vs VARCHAR datatype

Hi,

I am trying to do a comparison between CHAR-COMPRESS and VARCHAR and I found the below information from Teradata Whitepaper.

- VARCHAR will be more efficient when the difference of maximum and average field length is high and compressibility is low.
- Compression and fixed CHAR will be more efficient when the difference of maximum and average field length is low and compressibility is high.

I also tried to work-out a small examplde on this :

CREATE MULTISET TABLE KCHANDR.test_kou1 ,NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT
(
src_sys_nm CHAR(20) COMPRESS ('SAP'),
ord_nbr INTEGER
)
PRIMARY INDEX ( ord_nbr );

CREATE MULTISET TABLE KCHANDR.test_kou2 ,NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT
(
src_sys_nm VARCHAR(20) ,
ord_nbr INTEGER
)
PRIMARY INDEX ( ord_nbr );

insert into test_kou1 ('SAP',191008);
insert into test_kou1 ('SAP',191918);
insert into test_kou1 ('SAP',178918);
insert into test_kou1 ('SAP',191118);

insert into test_kou2 ('SAP',191008);
insert into test_kou2 ('SAP',191918);
insert into test_kou2 ('SAP',178918);
insert into test_kou2 ('SAP',191118);

COLLECT STATISTICS ON kchandr.test_kou1 COLUMN src_sys_nm;
COLLECT STATISTICS ON kchandr.test_kou1 COLUMN ord_nbr;

COLLECT STATISTICS ON kchandr.test_kou2 COLUMN src_sys_nm;
COLLECT STATISTICS ON kchandr.test_kou2 COLUMN ord_nbr;

select sum(currentperm) from dbc.allspace where lower(databasename) = 'kchandr' and lower(tablename) = 'test_kou1';
Return Value is => 462848.00

select sum(currentperm) from dbc.allspace where lower(databasename) = 'kchandr' and lower(tablename) = 'test_kou2';
Return Value is => 462848.00

Now, my question is that I have defined the column "src_sys_nm" as CHAR-COMPRESS in "test_kou1" table and also used only the value 'SAP' which is mentioned in the table defination, then why not the tablesize appears to be smaller for "test_kou1" comapare to "test_kou2".

Regards,
Koushik
2 REPLIES
Junior Contributor

Re: CHAR-COMPRESS vs VARCHAR datatype

Hi Koushik,
COMPRESS is only usefull for large tables, i wouldn't call 4 rows "large" :-)

For each table there's a table header stored on each AMP, check the size of the empty table.
After inserting 4 rows that size will be increased by probably 2048 bytes:
Those 4 rows will be stored on 4 different AMPs using a datablock of 512 bytes, which is the minimum size. When you save 20 bytes per row this will not change that size.

Insert a few million rows and you'll see the difference.

Dieter

Re: CHAR-COMPRESS vs VARCHAR datatype

hi koushik,
i am new to teradata, i have to do certifications i know the certification process but i have know idea about how to prepare for that. can u suggest what r the topics i have to prepare for teradata 12 basic exam and where will i get the materials for TD 12 basic exam. if u have please send me to baskar.bommireddy@gmail.com

Thanks in advance