Teradata compression

Enthusiast

Teradata compression

Hi,

If I want to compress columns on a table which has an associated view, will there be any impact on the view after compressing the underlying table?

Also, are there any disadvantages or performance issues that you know of regarding compression?

Thanks.
5 REPLIES
Enthusiast

Re: Teradata compression

There will be an impact; it could be faster or slower. The scale of the impact will range from negligible to significant, because somewhere the system will have to uncompress the data. The CPU used in decompressing can be offset by the increased row throughput off the AMPs.

I do not know how your view is defined: does it join tables on the Primary Index? If your users query the view with a predicate on a column that is compressed and not indexed, that may be slower than before.

Not much use as an answer, but my advice would be to test with a real user load. Use the DBQL tables and extract every query that has used that table in the last couple of months, then work out the average and standard deviation of CPU time and elapsed time for those queries. If you can break it down by day/hour, so much the better. This gives you a baseline. Create a copy of the table initially and compress it as you see fit, then run example queries from the list you derived earlier, using the compressed table in place of the old one. Again extract the averages etc. and work out the difference. If you are happy with the result, apply the compression to the existing table and let your users use it. Continue to monitor; after a couple of weeks you should have a picture.
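
To make that concrete, here is a minimal sketch of the baseline extraction. It assumes DBQL query logging (with object logging) is enabled and that the DBC.QryLog and DBC.QryLogObjects views exist on your release; MYDB and MYTABLE are placeholders, and column names can vary between versions.

    -- Hourly baseline of CPU and I/O for queries that touched MYDB.MYTABLE
    -- in the last 60 days. Elapsed time can be derived in a similar way
    -- from StartTime and FirstRespTime.
    SELECT EXTRACT(HOUR FROM l.StartTime) AS HourOfDay,
           COUNT(*)                  AS QueryCnt,
           AVG(l.AMPCPUTime)         AS AvgCPU,
           STDDEV_SAMP(l.AMPCPUTime) AS SDevCPU,
           AVG(l.TotalIOCount)       AS AvgIO
    FROM   DBC.QryLog l
    WHERE  (l.ProcID, l.QueryID) IN
           (SELECT o.ProcID, o.QueryID
            FROM   DBC.QryLogObjects o
            WHERE  o.ObjectDatabaseName = 'MYDB'
              AND  o.ObjectTableName    = 'MYTABLE')
    AND    l.StartTime >= CURRENT_TIMESTAMP - INTERVAL '60' DAY
    GROUP  BY EXTRACT(HOUR FROM l.StartTime)
    ORDER  BY HourOfDay;

Run the same query over the period when the compressed copy is in use and compare the two sets of averages.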

This may seem long-winded. However, as with all technology "improvements", you need to test objectively. Running the odd query in isolation could lead you to conclude that it was faster or slower based on your initial "gut" feeling, or on pressure applied from elsewhere (project managers, users).

Compression on any database technology is a compromise between CPU and IO performance.

Random
Enthusiast

Re: Teradata compression

Thanks for the response. Is there any kind of calculation that can be done to determine the maximum number of columns that can be compressed in a table without significantly affecting performance?
Enthusiast

Re: Teradata compression

I don't know if there are any rules of thumb, but the Performance Management Manual has *some* information.

It implies that for maximum performance benefit, compression (value-list compression) requires more memory, so you perhaps need to manage the FSG Cache and increase the amount of memory available to the AMPs; but this reduces the amount available to cache and may affect the cache hit ratio, thus increasing physical reads and reducing performance.

In general the Teradata manuals imply that overall SQL SELECT performance is increased, but INSERT/UPDATE/DELETE processing may be slower. I have no firm opinion, but I would test as described before; I feel there are too many variables at play to give a definitive answer. It would be possible to see improvements on a test system, but would the improvements carry over to a live system with many users running their badly coded SQL against your database, alongside your batch processing, etc.?

I would guess that compression, being based on dictionary-type methods, requires the dictionary to be stored in memory, so the scale of compression may vary; i.e. some things compress better than others and may require less CPU to uncompress.

Sorry for rambling on, but I think you should proceed with due diligence, and test and document, etc.

The Performance Management manual contains an interesting statement:

"The presence of compression has not been shown to degrade query response time since the
uncompressed values are held in the table header in memory and can be accessed very quickly.
However, with more data able to be read per data block due to the smaller row sizes, it is
possible for some queries to become more CPU-intensive when using compression
extensively."

Hope this helps.
Enthusiast

Re: Teradata compression

Let us consider two tables that have two column names in common. After compression, one table yields a significant difference in these columns' sizes whereas the other does not. Can anyone tell me why this is happening?
Thanks.
Enthusiast

Re: Teradata compression

Do the columns from the different tables have the same data? Did you copy one table to another to test compression?

Or are the tables parent and child, with the column being a primary/foreign key?

Teradata compression is based on George Kingsley Zipf's linguistic methods of word frequency.

Teradata allows up to 255 distinct values per column to be compressed. It basically creates a dictionary of each value and maps it to a compressed value.
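
For illustration, here is a minimal sketch of value-list compression in a table definition; the table, columns, and value lists are hypothetical, and on older releases COMPRESS applies only to fixed-width columns.

    CREATE TABLE MyDB.Customer_cmp
    ( cust_id    INTEGER NOT NULL,
      state_cd   CHAR(2) COMPRESS ('CA','TX','NY'), -- up to 255 values per column
      status_cd  CHAR(1) COMPRESS ('A','I'),        -- NULLs are compressed too
      last_name  VARCHAR(50)
    )
    PRIMARY INDEX (cust_id);

The value list becomes the dictionary held in the table header; rows containing a listed value store only a small reference to it.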

Differences could occur because of the frequency of the values compressed, or the particular values chosen for compression.
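
One quick check is to compare value frequencies between the two tables, since the savings depend on how often the compressed values actually occur. A minimal sketch, with placeholder names:

    SELECT state_cd, COUNT(*) AS freq
    FROM   MyDB.Customer
    GROUP  BY state_cd
    ORDER  BY freq DESC;

If the compressed value list covers, say, 90% of the rows in one table but only 10% in the other, the space saved will differ accordingly even though the column definitions look the same.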

Random.