Ok, so I shouldn’t even need to broach this topic, as I’m sure you have all heard it before: compress, compress, and compress.
However, for some reason, some sites are still not getting it, or they have not adequately incorporated this step into their development process, leading to tables still being implemented without adequate compression. Perhaps the tables were compressed a long time ago but are not taking advantage of advances in compression, such as multi-value compression.
Compression not only decreases your storage space (anyone billing for that?) but also decreases the amount of IO that needs to be performed in retrieving and spooling queried data. For smaller tables, it can also increase the number of tables cached in memory, eliminating subsequent IOs for those tables.
This query takes a quick look at the type of compression applied to each table, pointing to opportunities for further compression. I am looking at tables 10 GB in size or greater; adjust to your site's needs:
SELECT dbt.DATABASENAME,
       dbt.TABLENAME,
       MAX(CASE WHEN (compressvaluelist IS NOT NULL)
                THEN (CASE WHEN INDEX(compressvaluelist,',') > 0
                           THEN '3. MVC '
                           ELSE '2. SVC '
                      END)
                ELSE '1. NONE'
           END) AS COMPRESS_TYPE,
       MAX(pds.CURRENT_PERM) AS CURRENT_PERM
FROM dbc.columns dbt,
     (SELECT ts.DATABASENAME,
             ts.TABLENAME,
             SUM(ts.CURRENTPERM) AS CURRENT_PERM
      FROM DBC.TableSize ts,   -- table-size source assumed; point this at your site's space view
           DBC.Tables t
      WHERE t.DATABASENAME = ts.DATABASENAME
        AND t.TABLENAME = ts.TABLENAME
        AND ts.TABLENAME <> 'ALL'
      GROUP BY 1,2
      HAVING CURRENT_PERM > 10000000000) pds
WHERE dbt.DATABASENAME IN ('AAAA','BBBB','CCCC')
  AND dbt.DATABASENAME = pds.DATABASENAME
  AND dbt.TABLENAME = pds.TABLENAME
GROUP BY 1,2
-- HAVING COMPRESS_TYPE = '1. NONE'   -- uncomment to list only uncompressed tables
ORDER BY 1, 3, 4 DESC, 2;
Here are some sample results from the query:
The query results simply give me a feel for what level of compression has been applied. NONE is bad, SVC (Single Value Compression) is better, but MVC (Multi-Value Compression) is really where we want to be, if possible.
The results are ordered within each database by type of compression and table size: the largest tables with NONE (no compression) come first, followed by the largest tables with SVC compression, and so on. This should help you quickly target the large outliers.
BTW, if you are just starting to compress, or are redoing your compression, I propose this methodology:
Identify the largest tables in each database, as well as all the frequently used smaller common reference tables.
Start working down each database, hitting the largest tables first. For example, take the top 5 from databases A, B, and C. Compress, implement, then move on to the next largest tables (a rough sketch of this step follows after this list).
Simultaneously, work through as many of the common reference tables as you can, as quickly as you can, since most of these will be your most widely used tables across applications and will therefore benefit as wide a user base as possible. (Remember, frequency of use equates to more IO, so making these tables smaller and potentially cacheable can have a large impact on overall throughput.)
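To make the "compress, implement" step concrete, here is a rough sketch of how you might pick MVC candidates and apply them. The database, table, column, and value names below are made up, so substitute your own, and check what your Teradata release supports before altering anything.

-- 1. Find the most frequent values in a candidate column (hypothetical names).
SELECT STATUS_CD,
       COUNT(*) AS ROW_COUNT
FROM MY_DB.BIG_TABLE
GROUP BY 1
ORDER BY 2 DESC;

-- 2. Apply multi-value compression in the table DDL using the values found above.
--    Older releases typically require creating a new table, copying the data over,
--    then dropping and renaming; newer releases can often alter compression in place.
CREATE TABLE MY_DB.BIG_TABLE_MVC
( ACCT_ID    INTEGER NOT NULL,
  STATUS_CD  CHAR(2) COMPRESS ('A','C','I'),
  CREATE_DT  DATE COMPRESS (DATE '2010-01-01')
)
PRIMARY INDEX (ACCT_ID);

INSERT INTO MY_DB.BIG_TABLE_MVC SELECT * FROM MY_DB.BIG_TABLE;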
Try it at your site and see what you get. If you can, please share some feedback on your results with the rest of us.
As always, comments or questions are welcome.