This article describes how to preserve data integrity when a Teradata Parallel Transporter (Teradata PT) export to stream job restarts.
Teradata PT can export data from a Teradata source table and load the data into a different Teradata target table in a single Teradata PT job without landing the data to a disk file. This sounds great but care must be taken to ensure data integrity.
When the exported data is unordered and there is a job restart during the middle of the export, the target table can have undesirable results at the end of the job.
Why? You might ask. It is because the Teradata Database does not guarantee the same order of the exported data after a restart.
The Stream operator sends the data to the Teradata Database and the data is committed as it is loaded. The Stream operator keeps track the number of records that has been sent. When there is a restart, the Stream operator will resume from the last sent record number.
A job is not checkpoint/restartable when using the TPT Export operator or SQL Selector operator. So after a restart, the SELECT statement in the SelectStmt attribute is sent again and all of the rows will be exported from the Teradata Database again.
This issue occurs when all of the following conditions are met:
This problem can also occur if the source table changes between the original export and the re-issued export after restart.
If the target table is a SET table, the two results are:
If the target table is a MULTISET table, the two results are:
Note: The target table cannot be rolled back because the Stream operator sends the data to the Teradata Database and the Teradata Database applies the data on the target table.
HOW TO AVOID THE ISSUE
There are two ways to avoid the issue:
Note: Use of the TPT Update operator will still have a problem if the user tried to checkpoint. An export-to-update job cannot have checkpointing because the Export operator cannot restart in the middle of the export. The Export operator exports all of the data or none of the data.
Export the Data to a Disk File
The first way is to export the data to a file. You would need to use the TPT Data Connector operator to write the data to a file. You would need to use the “Write” option for the TPT Data Connector’s OpenMode attribute.
When a restart occurs in the middle of the job, the TPT Data Connector operator rewrites the entire file. When the job completes, the data file will have all of the correct rows as long as the user did not select the “WriteAppend” option for the OpenMode attribute.
The data file can then be loaded into a target table. You would need to use the TPT Data Connector operator to read the data from the file.
When a restart occurs in the middle of the loading, the TPT Data Connector operator guarantees the same order of the data read from the file. When the loading completes, the target table will have the correct result.
Use the TPT Update Operator
The second way is to use the TPT Update operator instead of the TPT Stream operator.
The TPT Update operator is different from the TPT Stream operator.
The TPT Update operator sends the data to the Teradata Database in the acquisition phase. In the acquisition phase, the Teradata Database populates the data to a temporary work table. After the acquisition phase, the Teradata Database applies the data from the work table to the target table. This is called the application phase.
When the user does not specify a checkpoint and a restart occurs, the TPT Update operator will restart the loading from the beginning and the Teradata Database will populate the work table from the beginning. After the application phase, the target table will have the correct result as long as the user did not specify a checkpoint.
Note: The job is not checkpoint/restartable when using the TPT Export operator or SQL Selector operator.
For information on the Teradata PT operators, see Teradata Parallel Transporter Reference.
For information on Teradata PT examples, see Teradata Parallel Transporter User Guide.
I want to copy a database/schema´s user from one machine to another machine because I have development environment and i want to copy the same structure and data from development environment to production environment which are in different machines. How I can do this ???
I have ABU Aplicattion for the backup
i want to know how I can copy a database or user schema from the development environment to the production environment.
I want to do export of all structure of a database and do the import in other database in other machine
I have ABU Aplicattion for the backup
Can i do this ?? Because i saw how copy tables, but i want to copy the all schema of a user /database
Isn´t there other way for I can do this??? Because the aplication Teradata Data Mover isn´t free and i can´t pay a license
Is it possible do export (from development environment) of a user and your objects and then do import in production environment???
TPT does not support the export of a user database.
TPT does support the export of user tables from one Teradata Database system to another Teradata Database system. See my article at this URL: http://developer.teradata.com/tools/articles/use-teradata-parallel-transporter-to-move-data-without-...