Hi All, I need to compress space in a character column..
My question is do i need to write 4 types of compress as in..
CREATE TABLE COMP ( A CHAR(4) COMPRESS(' ',' ',' ',' ') ,b ,c ) i.e compressing 1 space is equal to compressing 4 spaces or do we need to write it as multivalue compression so that all four types of spaces coming in data(with 1 space,2 space upto 4 spaces) are handled?? Regards,
Compression in Teradata is value based, you supply the values you want to compress, but for the whole field. you cannot compress portions of the field, its not as good as you think there. Thus if your column A contained only a few distinct values then it would compress well, as you could specify most or all of them.
If you have the majority of the values as 4 spaces you would specify
A CHAR(4) compress(' ') # thats 4 spaces in single quotes
If the field contained 10% AAAA, 20%BBBB, 30% BBB and the rest space you would specify
A CHAR(4) COMPRESS (' ','BBB ','BBBB','AAAA')
I have been using this code to sample columns (although I am sure there are better ways to do it)
SELECT top 255 select Value, count(*) from ( select Value from table sample 500000 ) as x group by 1 order by 2 desc;
This gets me the top 255 values ( by a sample of half a million rows from a very large table) I then put them into excel and work out the relative percentages for the values. If I accumulate those percentages going down the list I usually find that the first 3 , 8 or 15 or so may give me 70% of the rows which gives me a good start in selecting the right values.