Is your Internal Monitor monitoring (TMSM 13.10)?

The Teradata Multi-System Manager (TMSM) product monitors and controls hardware components, processes, and data loads. Ever wonder who is monitoring the monitor? The Internal Monitor should not be confused with the external failover monitor, which is responsible for monitoring the TMSM Master.

This article would be useful to anyone attempting to better understand the TMSM Internal Monitor.

The TMSM server is monitored by an Internal Monitor, a feature that is installed and configured with default values. This article provides insight into the Internal Monitor through the questions below.

Where can one see this Internal Monitor?

The TMSM Ecosystem Health portlet allows one to monitor and control hardware component health information and metrics. The Ecosystem Health portlet provides operational views for monitoring the state and condition of all instrumented hardware components, including the TMSM server itself.

Access to the TMSM Ecosystem Health portlet is by way of Teradata Viewpoint. This is an example:

The above view is at the highest level. To find the TMSM server, drill into Ecosystem Atlanta by double-clicking the Atlanta icon. Navigating to the detail view of the TMSM server presents this view:

WOW !! Look at all those metrics. The Internal Monitor is responsible for gathering and providing the above metrics to the Ecosystem Health portlet.

Is the Internal Monitor scheduled to run?

While exploring the Ecosystem Health portlet for the TMSM server, you may observe that one or more metrics appear to be suspect. Maybe the Internal Monitor is not scheduled or needs to be adjusted. Start with a check of the crontab to determine whether an entry exists. These steps will help display the settings for the Internal Monitor. After logging into the TMSM server as syncuser, type this command on the command line:

crontab -l | grep load_internal_events

 You should see something like this displayed:

*/2 * * * * /opt/teradata/client/tmsm/bin/ > /opt/teradata/client/tmsm/logs/load_internal_events.log 2>&1

If the above appears, then the Internal Monitor is scheduled to run every 2 minutes. If the above does not appear, please read the following section.
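The crontab check above can be wrapped in a small helper for repeated use. This is only a minimal sketch and not part of TMSM: the function name internal_monitor_schedule is hypothetical, and it assumes the crontab text arrives on standard input. It ignores commented-out lines and prints the five schedule fields of any load_internal_events entry.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with TMSM): reads crontab text on stdin and
# prints the five schedule fields of every active load_internal_events entry.
# Returns non-zero when no such entry is found.
internal_monitor_schedule() {
    # Drop commented-out lines first, then look for the Internal Monitor entry.
    grep -v '^[[:space:]]*#' | awk '
        /load_internal_events/ { print $1, $2, $3, $4, $5; found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Usage, as syncuser on the TMSM server:
#   crontab -l | internal_monitor_schedule || echo "Internal Monitor is not scheduled"
```

For the default entry shown above, the helper would print the schedule fields only, which makes it easy to see whether the 2-minute default has been changed.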

How can the default values be adjusted for my TMSM server?

There may be a need to adjust the schedule for the Internal Monitor. The TMSM Internal Monitor can be scheduled to run (or adjusted) by using the TMSM portlets. Sign on to Teradata Viewpoint and go to Admin > MSM Setup.

Go to Global Parameters > Manage Global Parameters. The Global Parameters portlet appears. It should look something like this:

Now you can make the needed changes or adjustments. The cron schedule shown above is for a 2-minute monitoring sequence.

Is the Internal Monitor running?

To check whether the Internal Monitor is running, monitor the Internal Monitor file yourself with the steps below. Now you are the monitor (HA!! HA!!).

Check for new entries in the Internal Monitor file, internalmonitor.dat. The Internal Monitor file contains the gathered metrics for the TMSM server.

The file is populated when load_internal_events.sh is fired by cron. After logging into the TMSM server as syncuser, type this command on the command line:

ls -lart $TMSM_HOME/bin | grep internalmonitor.dat

You should see something like this displayed:

-rw-r-----  1 syncuser tdatudf     297 Jul 11 22:34 internalmonitor.dat

The file should have a recent timestamp, indicating that the Internal Monitor has run in the recent past based on the crontab settings.

Wait a few minutes (the wait time depends on the cron schedule) and execute the above command again. The file timestamp should be updated.
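Instead of comparing timestamps by eye, the freshness check can be scripted. The following is a minimal sketch under stated assumptions: the function name is_fresh and the 5-minute default threshold are hypothetical choices (pick a threshold a little longer than your cron interval), and it relies on the -mmin test of find.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when FILE was modified less than MINUTES
# minutes ago. Returns 2 when the file does not exist.
# Arguments: FILE [MINUTES]; MINUTES defaults to 5 (an assumed threshold).
is_fresh() {
    file=$1
    minutes=${2:-5}
    [ -f "$file" ] || return 2
    # find prints the path only if it was modified within the window.
    [ -n "$(find "$file" -mmin "-$minutes")" ]
}

# Usage, as syncuser on the TMSM server:
#   is_fresh "$TMSM_HOME/bin/internalmonitor.dat" 5 || echo "internalmonitor.dat looks stale"
```

The same call can be pointed at any other file written on the same schedule.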

Please note with a Dual TMSM configuration (Master/Slave) the Master performs the collection of Slave Internal Monitor data.

The filename used for the Internal Monitor Slave is remotemonitor.dat, and it can be found with this command:

ls -lart $TMSM_HOME/bin | grep remotemonitor.dat

-rw-r-----  1 syncuser tdatudf     297 Jul 11 22:35 remotemonitor.dat

What are the values being reported to TMSM?

To investigate the Internal Monitor further, we can view the contents of the Internal Monitor file, internalmonitor.dat, which is used to report the metric values to the TMSM Ecosystem Health portlet.

Here is one approach to monitor the Internal Monitor. After logging into the TMSM server as syncuser, type this command on the command line:

watch more /opt/teradata/client/tmsm/bin/internalmonitor.dat

You should see something like the below displayed (use Ctrl+C to stop the command).

Every 2.0s: more /opt/teradata/client/tmsm/bin/inte...  Thu Mar 29 18:13:44 2012

<your TMSM server>,Heartbeat,0

<your TMSM server>,UsedDisk,36

<your TMSM server>,UsedDisk2,3

<your TMSM server>,UsedDisk3,10

<your TMSM server>,UsedCPU,5

<your TMSM server>,UsedMemory,35

<your TMSM server>,QueueDepth,0

<your TMSM server>,D-Control,0

<your TMSM server>,D-EventConsumer,0

<your TMSM server>,D-ControlListener,0

<your TMSM server>,D-Publisher,0

<your TMSM server>,D-MessageBus,0

The format is: <your TMSM server>, <metric name>, <value>
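Because each entry is a simple comma-separated line, a single metric can be pulled out with awk. This is a minimal sketch, assuming the three-field format described above; the function name metric_value is hypothetical and not part of TMSM.

```shell
#!/bin/sh
# Hypothetical helper: prints the value of one metric from an Internal Monitor file.
# Arguments: METRIC_NAME FILE
# Assumes lines of the form: <server>,<metric name>,<value>
# Returns non-zero when the metric is not present in the file.
metric_value() {
    awk -F',' -v name="$1" '
        {
            # Tolerate optional spaces around the name and value fields.
            gsub(/^[ \t]+|[ \t]+$/, "", $2)
            gsub(/^[ \t]+|[ \t]+$/, "", $3)
        }
        $2 == name { print $3; found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$2"
}

# Usage, as syncuser on the TMSM server:
#   metric_value UsedCPU /opt/teradata/client/tmsm/bin/internalmonitor.dat
```

The name is matched exactly, so asking for UsedDisk does not also return UsedDisk2 or UsedDisk3.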

Typically the QueueDepth entry is of interest, as it represents the Active MQ pending queue depth. When QueueDepth is greater than 0, it implies that the TMSM Master is not consuming the contents of the queue. If QueueDepth is changing values, that could be a good thing!!
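One way to apply this reasoning is to compare two QueueDepth samples taken one cron interval apart. The sketch below is an illustration only: the function name classify_queue_depth and its three labels are hypothetical, and the interpretation simply follows the description above (0 means nothing is pending, a changing value suggests the queue is moving, and a non-zero value that stays the same deserves a closer look).

```shell
#!/bin/sh
# Hypothetical helper: classify two successive QueueDepth samples.
# Arguments: PREVIOUS CURRENT (both non-negative integers)
classify_queue_depth() {
    prev=$1
    cur=$2
    if [ "$cur" -eq 0 ]; then
        echo "empty"        # nothing pending in the queue
    elif [ "$cur" -ne "$prev" ]; then
        echo "changing"     # depth is moving between samples
    else
        echo "unchanged"    # non-zero and not moving: worth investigating
    fi
}

# Usage sketch, as syncuser on the TMSM server:
#   f=/opt/teradata/client/tmsm/bin/internalmonitor.dat
#   first=$(awk -F',' '$2 == "QueueDepth" { print $3 }' "$f")
#   sleep 120   # one cron interval at the default 2-minute schedule
#   second=$(awk -F',' '$2 == "QueueDepth" { print $3 }' "$f")
#   classify_queue_depth "$first" "$second"
```

Two samples are only a rough indicator; watching the value over several intervals gives a more reliable picture.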

What do the metric names represent?

Based on the second screenshot in the section "Where can one see this Internal Monitor?", the metric names displayed are:

D-CONTROL: TMSM Control Master
D-MESSAGEBUS: Active MQ message bus
DBUSEDPERM: TMSM Repository perm space
STATECONTROL: State control alerts will be displayed
USEDCPU: TMSM server CPU used
USEDMEMORY: TMSM server memory used
HEARTBEAT: TMSM component heartbeat for health check
QUEDEPTH: Active MQ message bus queue depth

Where can I find out more information about this feature and other TMSM features?

More information can be found in Teradata Multi-System Manager User Guide (.PDF, 3MB).


Re: Is your Internal Monitor monitoring (TMSM 13.10)?

Nice article Gary, please keep them coming. -Tom