We are using TPT and need to load multiple files at the same time. Initially we used the wildcard character (*) in the file name, but that causes a problem for us. Is there any way to have TPT load data files whose names are given by a regular expression, such as ABC_AAA_[0-9].dat or ABC_AAA_BBB_[0-9][0-9].dat?
Here is the example case:
File layout - 1
File layout - 2
These two file layouts are located in the same directory. If we define FileName = 'ABC_AAA_*.dat' in the TPT script, TPT will try to load all 5 source files, and the job will fail because the second layout differs from the first.
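The overlap described above can be illustrated with a short Python sketch. The five file names are hypothetical (the thread does not show the actual listing), and Python's `fnmatch` only approximates shell-style wildcard matching; it is not a statement about how TPT itself evaluates `*`.

```python
import fnmatch
import re

# Hypothetical file names standing in for the 5 source files.
files = [
    "ABC_AAA_1.dat",
    "ABC_AAA_2.dat",
    "ABC_AAA_3.dat",
    "ABC_AAA_BBB_01.dat",
    "ABC_AAA_BBB_02.dat",
]

# The wildcard from the script: '*' matches any run of characters,
# so it also matches the "BBB_01" part of the second layout's names.
wildcard_hits = fnmatch.filter(files, "ABC_AAA_*.dat")

# The two patterns from the question, applied to the whole file name.
layout1 = re.compile(r"ABC_AAA_[0-9]\.dat")
layout2 = re.compile(r"ABC_AAA_BBB_[0-9][0-9]\.dat")

layout1_files = [f for f in files if layout1.fullmatch(f)]
layout2_files = [f for f in files if layout2.fullmatch(f)]

print(len(wildcard_hits))  # all five names match the wildcard
print(layout1_files)
print(layout2_files)
```

Using `fullmatch` rather than `search` matters here: it requires the entire name to fit the pattern, which is what keeps the two layouts apart.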
Apparently this pattern for file names should work, because the * matches anything in the file name. Can you please paste the actual error you are facing?
Also consult the following post; I hope it helps.
TPT does not support "ABC_AAA_[0-9].dat" syntax.
If you want to use the wildcard syntax, all files must adhere to the same layout.
Thus, in this particular scenario, due to the way the files are named, you would need to separate out the files with the different layouts into separate directories.
Another thought: If you are trying to load the data from all 5 files, but you know you need to use 2 different load tasks to accomplish the job, you can use one step to load the files from ABC_AAA_BBB_*.dat, then use a subsequent TPT job step to move those files to an archive directory, then use yet another job step to load the data from ABC_AAA_*.dat.
Thanks Feinholz. We found an alternative solution: use the FileList attribute of the DataConnector operator, together with a shell script that generates the list of actual file names from the regular expression.
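The list-generation step described above could look roughly like the following. This is a minimal sketch in Python rather than the poster's shell script (which is not shown); the function name `write_file_list` and its parameters are hypothetical, and whether entries should carry full paths depends on how the operator is configured.

```python
import os
import re

def write_file_list(data_dir, pattern, list_path, full_paths=True):
    """Write one matching file name per line to list_path and return the entries.

    A file is included only if its whole name matches the regular expression.
    """
    regex = re.compile(pattern)
    names = sorted(
        name
        for name in os.listdir(data_dir)
        if regex.fullmatch(name)
        and os.path.isfile(os.path.join(data_dir, name))
    )
    # Either prefix each name with the directory or keep the bare name.
    entries = [os.path.join(data_dir, n) if full_paths else n for n in names]
    with open(list_path, "w") as fh:
        for entry in entries:
            fh.write(entry + "\n")
    return entries
```

For the first layout, a call might look like `write_file_list(data_dir, r"ABC_AAA_[0-9]\.dat", "TPT_DATA_FILE_LIST.txt")`, with `data_dir` set to the directory holding the data files.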
But I ran into a problem when I include the full path together with each file name in the file list: TPT returns an error message saying the file was not found.
TPT_DATACONNECTOR_OPERATOR: TPT19404 pmOpen failed. Requested file not found (4)
If I remove the full path from the file names in the list and run TPT from the same directory as the data files, TPT runs successfully.
I do not understand what is wrong when I specify the full path in the list of file names, as below (I am testing in VMware, with TPT running on Windows). Could you please advise me on how to correct it?
TPT - Data Connector Operator:
DEFINE OPERATOR TPT_DATACONNECTOR_OPERATOR
TYPE DATACONNECTOR PRODUCER
IndicatorMode = 'N',
TextDelimiter = '|',
Format = 'delimited',
RowErrFileName = 'E:\90_Temp\BAD_DATA.dat',
VARCHAR PRIVATELOGNAME = 'DATACONNECTOR_OPERATOR_LOG' ,
VARCHAR DIRECTORYPATH = 'E:\90_Temp\Test_Name\' ,
VARCHAR FILELIST = 'Y',
VARCHAR FILENAME = 'TPT_DATA_FILE_LIST.txt'
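When chasing a "file not found" error like the one above, it may help to first confirm that every entry in the list file resolves to an existing file from the operating system's point of view. The sketch below is a hypothetical helper (`find_missing_entries` is not part of TPT) and it does not reproduce how TPT combines DirectoryPath with the list entries; it only checks the paths as written, optionally relative to a base directory.

```python
import os

def find_missing_entries(list_path, base_dir=None):
    """Return (entry, resolved_path) pairs for entries that are not existing files."""
    missing = []
    with open(list_path) as fh:
        for raw in fh:
            # Strip only the line terminator, so stray trailing spaces
            # in an entry still show up as a missing file.
            entry = raw.rstrip("\r\n")
            if not entry.strip():
                continue  # skip blank lines
            if os.path.isabs(entry) or base_dir is None:
                path = entry
            else:
                # Resolve relative entries against the given directory.
                path = os.path.join(base_dir, entry)
            if not os.path.isfile(path):
                missing.append((entry, path))
    return missing
```

An empty result means the paths themselves are fine, which would point toward the interaction between the operator attributes rather than the contents of the list.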
Even though this is my first day in this forum, I have an idea concerning your question.
The Teradata Parallel Transporter 13.10 User Guide (page 156) reads:
"If the pathname that you specify with the FileName attribute (as filename) contains any
embedded pathname syntax ("/" on a UNIX OS or "\" on Windows), the pathname is
accepted as the entire pathname.
However, if the DirectoryPath attribute is present, the
FileName attribute is ignored, and a warning message is issued."
I cannot test it (this is my first time with TPT, please excuse me), but I would try:
First idea: Use "\<servername>" instead of "E:"
Second idea (if 1st idea fails):
Drop Attribute DIRECTORYPATH
and use VARCHAR FILENAME = '<path>\TPT_DATA_FILE_LIST.txt'
If this doesn't help, I am confused too, because the Teradata Parallel Transporter 13.10 Reference (page 93) reads:
"When used with the FileList attribute, filename is expected to contain a list
of names of the files to be processed, each with a full path specification."
Please excuse me if this doesn't help. For now I have no way to test it.
Best regards, Wolfgang.