Comments and Queries discussed at Partner's 2011

Blog
The best minds from Teradata, our partners, and customers blog about whatever takes their fancy.
Teradata Employee

I was lucky enough to be able to attend Partners in San Diego last week.  As part of that, I did have an opportunity to speak a time or two and discussed a few performance issues.

I discussed swapping, SegIO Reads, and measuring I/O bandwidth but wasn't able to provide detail or queries, etc. at the time.

They are below.

Swapping, SegIO Reads and Potential Memory Issues

The following query will report on the number of swaps per second per node.  Depending on how many nodes your system has, and how frequently you have resusage logging, you may need to adjust the number of days in the WHERE clause to keep the number of rows returned down to a manageable number.

So, at this point the question always is "How many swaps per second are ok?".  And, unfortunately, the answer is "It depends".  As an extremely high level guideline, some will say 40 per second per node is where you want to take action.  In a perfect world, 0 swaps per second is the target number.  In reality though, it depends on the time of day the swapping occurs, the impact it may be having to performance (primarily tactical query performance) and your site's tolerance for this performance impact.  If, for instance, you see brief periods of swapping during your batch window and there is no performance impact, you may not want to take any action.

One thing you can do is chart the following data and compare it to your tactical query performance.  If you see that your tactical query performance is starting to degrade when you see 4 swaps per second, then for your site that may be too high.

At the point in time where you have decided swapping is an issue, then you need to consider reducing FSGCache.  You can open an incident and request assistance from the GSC, or you can talk to your CSR directly about this issue.

LOCKING dbc.resusagespma FOR ACCESS

SELECT TheDate

,TheTime (FORMAT '99:99:99')

,NodeId

,MemCtxtPageReads (FORMAT 'Z(14)9')

,MemCtxtPageReads/secs (FORMAT 'Z(14)9.9', named "pswpin/s")

,MemCtxtPageWrites (FORMAT 'Z(14)9')

,MemCtxtPageWrites/secs (FORMAT 'Z(14)9.9', named "pswpout/s")

,(MemCtxtPageReads+MemCtxtPageWrites)/secs (FORMAT 'Z(14)9.9', named "TotSwap/s")

from dbc.resusagespma

WHERE TheDate between date-7 and date

ORDER By 1,2,3;

Another memory related metric to measure is SegIOReads.  This one is more challenging to understand, and less cut and dried when trying to identify an issue.

These SegIOReads are basically memory segments that the operating system is swapping in from disk to read.  Some number of these cannot be avoided due to how these segments are defined and used.  So, the fact that you see some activity here it doesn't necessarily indicate a problem.  What we look for are three general things.  First, if we are seeing SegIO reads during the same period that you are seeing swapping, then this could be a concern.  And, when you lower FSGCache to address the swapping you should see these SegIOReads naturally reduce in quantity as well.  Secondly, we get concerned when these are very high.  In the 10's or 100's of thousands.  And finally, as in almost any case, if you can chart high SegIOReads to actual performance issues on your system, then it is something you will want to address.

One key point is that very high SegIOReads can cause the internal root disk on the node (Identified as "sda") to become very busy.  If this root disk becomes 100% busy, then this can cause severe performance issues.  In addition to the following query, you can also look at root disk busy via the iostat command on the node.  This is discussed in the next section.

If you see SegIOReads and swapping, then this would be addressed as discussed above by reducing FSGCache.  If you have concerns surrounding SegIOReads that are not tied to swapping and would like more information, you can open an incident to have it researched by the GSC.  They may want to change your SegFileSize and/or aggregate cache, for instance.

Here is the query and a sample graph.  This query looks at SegIOReads by node by Resusage collection period.

locking row for access

sel thedate, thetime, nodeid, MemTextPAgeReads from dbc.resusagespma

where thedate between date-7 and date;

Measuring I/O Bandwidth

We are all accustomed to measuring and analyzing CPU consumption on Teradata, but measuring and analyzing I/O consumption is a little more challenging and therefore less often done.  There are a couple of ways I discussed measuring I/O used against the I/O capacity of your system.

The first method is more real-time and uses the iostat command on a node.  The following command will display your IO subsystem busy percentage by device every 10 seconds until you stop it, or it runs 10,000 times.  It only reports for the node you are currently logged onto.  Ignore the very first refresh, as I believe that is showing data averaged since the node was rebooted.  Below I show just a sample of a few logical devices, but it helps us to see that the %util information is helpful.  In this data, you can see the internal root disk (sda) is not busy at all.  We can see there is some skew driving one device to approach 100% busy.

iostat -kxt 10 10000 | grep -v '0.00  0.00'

Another method to measure I/O bandwidth, looking back at resusage data.

The first step is to find your theoretical I/O bandwidth rating per node.  This information is in the choosing calculator for the configuration your local team put together when your machine was first configured.  If you don't have this information, please contact your SA and they should be able to help you find it.  Here is an example that shows that this system has a theoretical I/O bandwidth of about 1094 96K I/Os, assuming 80% reads to writes.

Now, we can query ResUsage to get an idea of the number of MB/second/node I/Os we are doing.  The following query and chart show this information.  This is not a hard and fast comparison, unfortunately.  You can't assume that if you see a number in Resusage higher than what you see for the 80% 96K Array MB/sec number above that this is a certain issue.  There are alot of factors at play.  If you are performing more reads per write, then that bandwidth number is higher as seen above in the 90% 96K Array MB/sec section.  And, vice versa.  Also, if your average I/O size is larger or smaller this affects your throughput.  But, as a general guideline it is reasonable to use the 80% number.

If you find you are pushing the limit, and that you are also having performance issues during those times, then it is time to start addressing I/O.  This can be done by looking for tuning candidates and reducing their demand for I/O, using TASM rules to alert DBAs of high consumers and/or aborting them, or you can manually put them in a penalty box with a tight CPU limit to reduce their demand for I/O.  There are many other avenues as well, such as indexing schemes, PPI, etc.

SELECT

TheDate,

TheTime,

NodeID,

/* Add Physical Read MBytes / sec and */

/* Total MBytes for Disk WRITE /sec */

((SUM(FileAcqReadKB + FilePreReadKB ) * 100 / SUM(CentiSecs) +

SUM(FileWriteKB) * 100 / SUM(CentiSecs))/1024) bandwidth

FROM DBC.ResUsageSpma where thedate between date-5 and date

GROUP BY 1,2,3;

If I promised something else while I was at Partners that isn't addressed here, or if you have any other questions or comments about anything in this blog, please comment on the blog and I'll reply as quickly as I can.

Thank you.

4 Comments
What does it mean when sda shows 90-100% util?
Teradata Employee
If running TD14.0.0.1 please use the ResUsageGeneralInfoView to access these fields (fixes an overflow issue). So the bandwidth query would be:
SELECT
TheDate,
TheTime,
NodeID,
/* Add Physical Read MBytes / sec and */
/* Total MBytes for Disk WRITE /sec */
((SUM(FileAcqReadKB + FilePreReadKB + FileWriteKB) * 100 / SUM(CentiSecs))/1024) bandwidth
FROM DBC.ResGeneralInfoView where thedate between date-5 and date
GROUP BY 1,2,3;

If using TD14.10 or later, please use the ResSpmaView. The ResGeneralInfoView is being replaced by the shorter and more clear ResSpmaView name to match the view naming convention used for all the other ResUsage tables. So the bandwidth query will then be:
SELECT
TheDate,
TheTime,
NodeID,
/* Add Physical Read MBytes / sec and */
/* Total MBytes for Disk WRITE /sec */
((SUM(FileAcqReadKB + FilePreReadKB + FileWriteKB) * 100 / SUM(CentiSecs))/1024) bandwidth
FROM DBC.ResSpmaView where thedate between date-5 and date
GROUP BY 1,2,3;

Regards,
James

Teradata Employee

Since this article was written a couple of year ago, I just wanted to ask if the information regarding SegIORead is still relevant for a TD 13.10 system. We have a system that exhibits high values for MemTextPageReads, but no swapping at all as defined by MemCtxtPageReads+MemCtxtPageWrites counts. You mentioned SegFileSize and Aggregate Cache are some of the factors that can be tuned; are there any other factors, specially from application and queries perspective, that can be tuned to help alleviate high MemTextPageReads values?

Thanks

Teradata Employee

I need to scrap this and start over.  Time flies. . . .

I would not be overly concerned with high SegIO Reads.  If you are not swapping, this shouldn't be a real concern and shouldn't be addressed.