This article provides accurate, basic information about the Teradata Parallel Transporter product and aims to clear up common misunderstandings about it.
Teradata Parallel Transporter is the preferred load/unload tool for the Teradata Database.
Parallel Transporter can run all the bulk and continuous Teradata load/unload protocols in one product. In the past, a user had to run each protocol with a separate tool, and each tool had its own script language. The stand-alone load tools (FastLoad, MultiLoad, TPump, and FastExport) are functionally stabilized: no new features are being added beyond what is needed to keep them working with new Teradata Database releases. All new features requested by customers are being added to Parallel Transporter.
Currently, the stand-alone load tools are being supported indefinitely and no discontinuation notice has yet been issued. It is recommended that all new Teradata load applications be implemented with Parallel Transporter.
You most likely already have it. Parallel Transporter is in the Teradata Tools & Utilities software bundles included with the Teradata Database. In addition, all customers on Teradata subscription that have the legacy, stand-alone load tools are entitled to equivalent licenses for the Parallel Transporter Operators. Contact your Teradata account manager for details.
Parallel Transporter can be invoked through 4 interfaces:
Listed are the four main Parallel Transporter Operators:
If you run TPT with the script interface, a TPT infrastructure component interprets the script and invokes the proper Operators to read and load the data.
If you use an ETL tool, the ETL tool reads and transforms the data and passes it in memory to the TPT API interface, which invokes the proper Operator to load the data.
That’s easy to answer. The three main benefits are performance, ease of use, and better ETL tool integration.
- Performance: As you already know from your use of the Teradata Database, the best performance is scalable performance. The architecture of Parallel Transporter allows the processes running on the client load server to be scaled and parallel data streams can be created to circumvent performance bottlenecks.
For example, if I/O is a bottleneck when reading a very large input data file, then one can scale Parallel Transporter to create multiple data flows with multiple readers of the same file or multiple files to create more data throughput for the load.
- Ease of use: The script interface has many features that make writing a load job much easier.
Example 1: One script can extract data from a production Teradata Database and load into a test database. The data will flow in memory among the parallel processes on the client load server. With the stand-alone tools one would have to write two scripts in two different languages and put a problematic named pipe in between the two tools to pass data.
Example 2: One script can load a Teradata Database and the user can determine which load protocol to use at run time without having specified the load protocols in the script. This allows one to easily switch between load protocols at run time using just one script.
- ETL tool integration: The leading ETL vendors now have more control over the entire load process when they integrate with TPT API. The vendors are urging their customers to use this interface. Contact an ETL vendor for more details.
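To make the performance point concrete, here is a minimal conceptual sketch in Python of several readers feeding one in-memory stream that a single loader consumes. It is not Parallel Transporter code and does not use TPT API. The names `reader`, `run_load`, and `SENTINEL` are hypothetical, the "partitions" stand in for multiple input files or slices of one file, and appending to a list stands in for sending rows to the database.

```python
import queue
import threading

# Unique marker a reader puts on the queue when its partition is exhausted.
SENTINEL = object()


def reader(rows, out_q):
    """Hypothetical reader: push every row of one partition into the shared queue."""
    for row in rows:
        out_q.put(row)
    out_q.put(SENTINEL)


def run_load(partitions):
    """Start one reader thread per partition and drain all rows in one loader loop."""
    q = queue.Queue(maxsize=1000)  # bounded in-memory buffer between readers and loader
    threads = [threading.Thread(target=reader, args=(p, q)) for p in partitions]
    for t in threads:
        t.start()

    loaded = []
    finished = 0
    while finished < len(threads):
        item = q.get()
        if item is SENTINEL:
            finished += 1
        else:
            loaded.append(item)  # stand-in for handing the row to the load step

    for t in threads:
        t.join()
    return loaded


if __name__ == "__main__":
    rows = run_load([[1, 2, 3], [4, 5], [6]])
    print(len(rows), "rows passed through the in-memory stream")
```

The sketch only shows the shape of the data flow: adding a partition adds a reader, and no data is written to disk or to a named pipe between the read and load steps. Row order across partitions is not guaranteed, and actual throughput depends on where the bottleneck sits.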
If you are using an ETL tool, then you don’t have much to learn since the ETL tool automatically works with TPT. Once the user has entered the ETL data flow into the ETL tool’s GUI, the ETL tool will automatically generate the appropriate calls to Parallel Transporter’s API interface (TPT API). The input data is passed in data buffers that reside in memory from the ETL tool to TPT API without having to land the data or deal with problematic named pipes.
If you are writing your own scripts, almost everything you know about the stand-alone load tools still applies: the same basic options and parameters, the same limitations (e.g., the number of concurrent load jobs), the same guidance on when to use each protocol, and so on. Mostly, you have to learn the new script language and the tlogview tool. The tlogview tool lets the user view and make sense of the output generated by the many parallel processes that run in a Parallel Transporter job.
That’s a good question. Many Teradata customers that use Parallel Transporter have asked why more customers haven’t taken advantage of the tool. Every Teradata Partners Conference since 2006, including Partners 2010, has featured a customer presenting their success story. Contact your Teradata Account Manager about how to attend the Teradata Partners Conference.
Parallel Transporter is the preferred load tool whether using it with an ETL vendor or writing your own scripts.
The stand-alone load tools are frozen and no new major features are being added.
Check with your Teradata account manager since you most likely already have the proper licenses for Parallel Transporter.
Teradata Education Network, www.teradata.com/t/TEN, has a Parallel Transporter technical tips and techniques presentation and a web-based training class. A white paper, Active Data Warehousing with Teradata Parallel Transporter, is available from www.teradata.com.
If you are interested in increased performance, improved ease of use, a better interface to your ETL tool, or new load tool features, then install Parallel Transporter and get started on writing new parallel load applications for the Teradata Database.