We were exporting a Teradata table with huge volume of data using FastExport.Now we are planning to migrate the system using TPT.
Our output file is a pipe dilimited Text file.As per the TPT documentation if we use the export operator to offload the data it cannot be done in TEXT format.It was suggested to use the "Sql select" operator instead but it was also mentioned that "Sql select" operator is not suitable for high volume of data.We are using TPT 14.0.
You'll want to use the format of 'Delimited' in the DataConnector operator. As part of that, you can specify what the delimiter is.
You can use either the 'Export' operator or the 'SQL Selector' operator along with that. The 'Export' operator is recommended for large files since you can specify multiple sessions for the retrieval of the data.
But I was wondering whether moving to TPT from FastExport will lead to any gains from performance standpoint since we are talking about huge volume of data (TB's).The main idea behind the approach is movement from FastExport to TPT is improvement in performance?
The TPT 'Export' operator is essentially running Fast Export under the covers. So, the performance within the database should be the same as Fast Export. The 'SQL Selector' operator is similar to running a BTEQ Export (limited to one session).
As Deiter noted, because of TPT's ability to have multiple instances, you may be able to get better performance if you set it up to utilize multiple instances. I have never had a need to do this, so I can't really speak to any additional performance that you might get by doing this.
As pointed out clearly the 'Export' operator uses FastExport under the covers but may perform better because of its ability to use multiple instances.But the following aspects are not very clear:
TPT's advantage over FastExport can be demonstrated in the following scenario. Let's say that you have a large FastExport that you are executing and you determine during the execution that the bottleneck is in the disk in your host environment. So, FastExport is delivering the data to the host faster than the disk on your host environment can write to the file. One thing you could do to resolve this is to create two FastExport jobs, and have each one select approximately half the data and write to two different files contained on different volumes. By doing this, I could possibly cut my export time in half. So, you were able to get the data exported faster by running two separate FastExport jobs.
With TPT, you can do the same thing within one TPT job. There's no need to create the two TPT jobs. By specifying multiple instances, you can effectively do this in one TPT job.
In answer to the second question about the format, it works the same regardless of the format of the file.
In addition to the excellent posts above, when it comes to writing out the data in delimited format, the advantage of TPT over FastExport is that TPT supports delimited output and FastExport does not. And since FastExport cannot write out the data in delimited format, TPT's performance (just for delimited) may be immaterial. :)
TPT can also write out to multiple files in parallel. Some users may prefer multiple smaller files, rather than 1 large file.