How to bring un-utilized space for utilization in terdata

Database
Enthusiast

How to bring un-utilized space for utilization in terdata

Hi,

I am going through some of the topics on space utilization and got confused. Can you please clarify the below points:

1. I have calculated

SUM(CurrentPerm)/1024/1024/1024 &

MAX(CurrentPerm) * (HASHAMP()+1)

for a particular database in DBC.DISKSPACE. The outputs are 1084 and 1.25 respectively.

Does this mean lots of space (app 1083 Gb) got unutilized due to skewness?

Can we bring this un-utilized space back to utilization?

2.What is the role of temp space and spool space in deciding total utilization? For example, If i do not define collect stats for any particular table, will it impact the space utilization while running some scripts over that table? Will defining stats any ways improve space utilization?

3. Among SET & MULTISET tables, which table is better in terms of space utilization? In what situations it is suggestible to create SET table over MULTISET. Will defining SET table with UPI is a ideal combination in terms of space utilization or performance? or UPI is not required to define, once a table is declared as SET?

Appreicate if some clarify these points.

Thanks, Prasanth.

9 REPLIES
Teradata Employee

Re: How to bring un-utilized space for utilization in terdata

1. Assuming both are scaled the same (/1024/1024/1024) then yes, this would indicate very high skew and lots of space you may be unable to utilize. You could look at DBC.TableSizeV to identify offending tables, and find a better PI for those.

2. Temp and Spool use left-over space - whatever isn't CurrentPerm. Stats have no impact on space.

3. SET vs MULTISET has no impact on space (unless the MULTISET table actually has lots of duplicate rows). UPI also has no effect on space, though it makes duplicate checking trivial, which can be a performance benefit on INSERT / UPDATE - but there are tradeoffs. It could hurt SELECT performance if that means the PI is no longer used for access or large redistributions are needed for common joins.

Enthusiast

Re: How to bring un-utilized space for utilization in terdata

Hi Fred,

Thanks for your inputs. That helped alot. can you please update me below points too:

1. In what scenarios defining a table as SET is better than defining it as a MULTISET?

2.Can we find out the list of tables for which collect stats have been defined in a particular DB? I need to trace out the tables with no stats and define some. Do you have some query for this task?

Prasanth.

Enthusiast

Re: How to bring un-utilized space for utilization in terdata

Fred,

3rd question

1. I have calculated

SUM(CurrentPerm)/1024/1024/1024 &

MAX(CurrentPerm) * (HASHAMP()+1)

for a particular database in DBC.DISKSPACE. The outputs are 1084 and 1.25 respectively.

Does this mean lots of space (app 1083 Gb) got unutilized due to skewness?

Can we bring this un-utilized space back to utilization?

Regarding this query, If a am querying this info on  stage table (Truncate and load), then will this wastage due to skew even continue when the next instance runs. Please clarify.

Teradata Employee

Re: How to bring un-utilized space for utilization in terdata

It's not possible for MAX(CurrentPerm)*(HASHAMP()+1) to be 1.25, nor is it possible for SUM to be 1.25GB if MAX*#AMPs is only 1084 bytes. So the amount of unutilized space is still unclear to me.

When you truncate (DELETE all rows) the space will be released and available for use by other tables. But if you keep the PI unchanged and load similar data, then you will once again have the skew.

Teradata Employee

Re: How to bring un-utilized space for utilization in terdata

Regarding Stats: In TD14.0 and later you can query DBC.StatsV. You can also use the combination of DBC.ColumnStatsV, DBC.IndexStatsV, and DBC.MultiColumnStatsV which exist back to TD12.0

SET tables should be used if you want the database to prevent entirely duplicate rows from being stored. "Prevent" includes generating an error for a single-row insert that would be a duplicate, and quietly eliminating duplicates for an INSERT/SELECT.

Enthusiast

Re: How to bring un-utilized space for utilization in terdata

Hi Fred,

Thanks for posts. They are really useful. One final question here-

what is the solution for improving skewness for a existing table with data?

1.If a create a new table with different PI (to improve SKEW) and try to insert data, performance should be a probelm right..? as PI for both the tables are different. and any joins with previous PI columns are required to be changed right..? PLease confirm.

2.Is creating some secondary index a simple solution for this? will this work?

Let me know how this problem can be solved.

Teradata Employee

Re: How to bring un-utilized space for utilization in terdata

1) Creating a new table with different PI does have performance tradeoffs but is normally what you must do to fix extreme skew. (In special cases, columnar storage or NoPI tables might be of some use.)

2) A secondary index or join index matching the old PI may in some cases mitigate the performance impact. Or changing the PI alone might actually improve performance even without a SI / JI. It all depends on how the data is accessed.

Re: How to bring un-utilized space for utilization in terdata

Hi,

How feasible is the concept of utilizing user's local machine's virtual memory as a spool space instead of teradata's own spool space form Db, as and when required?

Junior Contributor

Re: How to bring un-utilized space for utilization in terdata

Spool is only on the server side.