A reliable rebooting mechanism for Data Mover Services

Teradata Employee

Data Mover is a Teradata application that lets users copy databases or tables between Teradata systems. It is a JEE-based application composed of three major code components: the Client (command-line interface or Viewpoint portlet), the Daemon, and the Agent. The Daemon is the central piece of the application, and the Agent is the worker unit that does the actual job of moving data. Both the Daemon and the Agent are deployed on a managed Linux server. Two other components that Data Mover depends on are installed on the same box: the repository DBS, which stores Data Mover job data, and the ActiveMQ application, which serves as the messaging service provider.

The Challenge

The four major components of Data Mover (the Daemon, the Agent, the repository DBS, and ActiveMQ) are all installed on the same Linux machine. Each component is started by a script in the /etc/init.d/ directory, so when the Linux machine is rebooted for any reason, all four components are started automatically. However, there are dependencies between these components. The Daemon depends on the repository DBS, ActiveMQ, and network ports. The Agent depends on ActiveMQ and network ports. If a component it depends on is not yet started and ready, the Daemon or Agent will fail to start.
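To make "not ready" concrete, here is a minimal Java sketch of a readiness probe that simply tries to open a TCP connection to a dependency. It is an illustration only, not Data Mover code: the class name DependencyProbe is hypothetical, and the host and ports in main are assumptions (61616 is the stock ActiveMQ OpenWire port and 1025 is the usual Teradata Database port; your installation may use different values).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Data Mover: reports whether something
// is accepting TCP connections on the given host and port.
final class DependencyProbe {

    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            // connect() throws an IOException if nothing accepts the
            // connection before the timeout expires.
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed default ports; adjust to match the actual installation.
        System.out.println("ActiveMQ listening: " + isListening("localhost", 61616, 1000));
        System.out.println("Repository DBS listening: " + isListening("localhost", 1025, 1000));
    }
}
```

A port that accepts connections does not prove that the service behind it is fully initialized, which is one reason a retry at startup is more robust than a one-time check.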

The Solution

There is no guarantee that the Data Mover services will be started and become ready in the same order every time the server is rebooted. Rather than relying on the start order, we can have the Daemon and Agent components keep retrying their startup until the required resources are available. This solution works as long as it takes care of the following key points. Here I am using the Daemon component as the example.

  • Retry the startup if the previous attempt threw an exception related to resource availability.
  • Release the Spring context and the communication ports that were grabbed during the last unsuccessful attempt.
  • Print a user-friendly error message to the log file when an attempt fails.
  • Sleep for a short time between a failed attempt and the next retry.
  • Retries are unlimited, so to keep them from filling up the log file and disk space, raise the Java log level after a certain number of failures so that fewer lines are written. Once the service starts successfully, revert to the original log level.
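The key points above can be sketched in Java roughly as follows. This is a minimal sketch under stated assumptions, not the actual Data Mover implementation: RetryingStarter, the Service interface, and the parameters sleepMillis and quietAfter are hypothetical names, and treating an IOException or SQLException anywhere in the cause chain as "related to resource availability" is my own simplification. It uses java.util.logging for the log-level switch.

```java
import java.io.IOException;
import java.sql.SQLException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch of a start-with-retry loop; not the Data Mover source.
final class RetryingStarter {
    private static final Logger LOG = Logger.getLogger(RetryingStarter.class.getName());

    // Stand-in for the real service, e.g. the Daemon.
    interface Service {
        void start() throws Exception; // e.g. build the Spring context, open ports
        void release();                // close the context and free the ports
    }

    // Assumption: I/O and SQL failures in the cause chain mean "resource not ready".
    static boolean isResourceProblem(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof IOException || c instanceof SQLException) {
                return true;
            }
        }
        return false;
    }

    // Retries until start() succeeds; returns the number of failed attempts.
    static int startWithRetry(Service service, long sleepMillis, int quietAfter) throws Exception {
        Level originalLevel = LOG.getLevel(); // may be null (inherited); setLevel(null) restores that
        int failures = 0;
        while (true) {
            try {
                service.start();
                LOG.setLevel(originalLevel); // revert to the original log level
                LOG.info("Service started after " + failures + " failed attempt(s)");
                return failures;
            } catch (Exception e) {
                service.release(); // free whatever the failed attempt grabbed
                if (!isResourceProblem(e)) {
                    LOG.setLevel(originalLevel);
                    throw e; // not a readiness problem, so retrying will not help
                }
                failures++;
                LOG.warning("Start attempt " + failures + " failed because a required resource is not ready: " + e);
                if (failures == quietAfter) {
                    LOG.warning("Reducing log output until the service starts");
                    LOG.setLevel(Level.SEVERE); // later retry warnings are suppressed
                }
                Thread.sleep(sleepMillis); // short break before the next retry
            }
        }
    }

    public static void main(String[] args) throws Exception {
        final int[] attempts = {0};
        Service demo = new Service() {
            public void start() throws Exception {
                // Simulate a broker that becomes available on the third attempt.
                if (++attempts[0] < 3) {
                    throw new java.net.ConnectException("message broker not ready");
                }
            }
            public void release() { /* nothing to free in this demo */ }
        };
        startWithRetry(demo, 200L, 5);
    }
}
```

Calling release() before every retry matters because a half-built context can still hold a bound port, which would make the next attempt fail for a reason that has nothing to do with the missing dependency.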

This solution was implemented successfully in the Data Mover release. With this retry mechanism, the Data Mover Daemon and Agent services wait for the resources they depend on and can always start successfully. Both services start working properly as soon as the internal repository DBS and the ActiveMQ service are ready for use.


Re: A reliable rebooting mechanism for Data Mover Services

Good starting point. It would also help if we could add the TMSM components to the sequence.
Teradata Employee

Re: A reliable rebooting mechanism for Data Mover Services


Thanks for sharing this. I'm quite new to Data Mover and came across this posting. As I was reading along, a question came to mind: in an optimal setup, should all DM components (i.e., MQ, Agent, DM) be on one server (i.e., the target database)? I recall previously working on projects that used queueing software, where we ended up moving it onto a separate box for performance gains. I was wondering whether the same concept applies to TDM.


Teradata Employee

Re: A reliable rebooting mechanism for Data Mover Services

Thanks Yuan for the article. Even though a work-around has been devised, it's good to understand the various components & the dependency between them.

Your article is a supplement to Kevin Black's article: