I have designed a TPT script to load data from a set of specific .txt files into a table in a Teradata database, and it works fine. The general structure of this script involves defining DDL, File Reader, and SQL Inserter operators, so that the load can be implemented with a statement such as the following:
APPLY
  ('INSERT INTO ' || @table || ' ( ... );')
  TO OPERATOR ( SQL_INSERTER )
  SELECT * FROM OPERATOR ( FILE_READER_1 )
  UNION ALL
  SELECT * FROM OPERATOR ( FILE_READER_2 );
where the two File Reader operators (FILE_READER_1 and FILE_READER_2) have been defined beforehand so that file_1.txt and file_2.txt are loaded into the table by the statement above.
Nevertheless, as I understand it, such a procedure requires defining multiple File Reader operators (one for each file to load). I would therefore like to implement an alternative procedure, driven by a unix shell script, that executes a single TPT load job (i.e. with a single FILE_READER operator) for each file present in a specific directory (e.g. a variable set of .txt files, named DATA_YYYYMMDD.txt, within that directory).
Is there any sample unix script available to achieve such a procedure?
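For illustration, this is a rough, untested sketch of the kind of wrapper I have in mind, assuming the TPT script takes two job variables (here called DIR_SOURCE and FILE_SOURCE, both just placeholder names) for the source directory and file name:

#!/bin/sh
# Hypothetical wrapper: run one TPT job per DATA_YYYYMMDD.txt file found in the source directory
SRC_DIR=/path/to/source                      # placeholder directory
for f in "$SRC_DIR"/DATA_*.txt
do
    echo "Loading $f"
    tbuild -f load_table.tpt \
           -u "DIR_SOURCE='$SRC_DIR', FILE_SOURCE='$(basename "$f")'" \
           -j "load_$(basename "$f" .txt)"
done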
Thanks in advance.
Thank you for the remarks.
The specific definition of the Data Connector operator within my .tpt file is:
DEFINE OPERATOR FILE_READER
TYPE DATACONNECTOR PRODUCER
ATTRIBUTES
(
  VARCHAR DirectoryPath = @DIR_SOURCE,
  VARCHAR FileName = @FILE_SOURCE,
  VARCHAR IndicatorMode = 'N',
  VARCHAR DateForm = 'AnsiDate',
  VARCHAR OpenMode = 'Read',
  VARCHAR Format = 'Delimited',
  VARCHAR TextDelimiter = ';'
);
where the variables "DIR_SOURCE" and "FILE_SOURCE" refer to the directory path and the file name of the specific file to load into the target table, respectively. Hence, for instance, if FILE_SOURCE='FILE_20190315.txt', then this .tpt job loads data into the target table from that specific file.
In this sense, following your suggestion, I have set FILE_SOURCE='FILE_*' (instead of FILE_SOURCE='FILE_20190315.txt' alone) in my script_jobvar.txt file, and with a single .tpt script I have easily loaded a whole list of 'FILE_YYYYMMDD.txt' files into the target table.
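For reference, the relevant part of my script_jobvar.txt now looks roughly like this:

DIR_SOURCE = 'C:\directory',
FILE_SOURCE = 'FILE_*'

and the job is launched with a command along these lines (the script name and job name are only placeholders):

tbuild -f load_table.tpt -v script_jobvar.txt -j load_files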
Thank you very much again.
I am experiencing a new problem with the previous procedure: I could load data from a large list of 'FILE_YYYYMMDD.txt' files, but the process fails when I execute it again for a different list of files.
For example, I have two files, 'FILE_20190314.txt' and 'FILE_20190315.txt', in the directory 'C:\directory'. I execute my .tpt script to load data from these two files into the target table by setting VARCHAR FileName = 'FILE_*' (or, equivalently, FILE_SOURCE='FILE_*', where FILE_SOURCE is the job variable bound to the FileName attribute of the Data Connector operator), and it works completely fine.
However, when I delete the previous 'FILE_20190314.txt' and 'FILE_20190315.txt' files and try to repeat the procedure with different files (e.g. 'FILE_20190316.txt' and 'FILE_20190317.txt'), the job fails: the system requires the original 'FILE_20190314.txt' and 'FILE_20190315.txt' files in order to execute, in addition to the new 'FILE_20190316.txt' and 'FILE_20190317.txt' files.
Therefore, the job "remembers" the original sources and cannot load data from a different list of files. In the second run, however, I only want to load the 'FILE_20190316.txt' and 'FILE_20190317.txt' files.
I have just solved this issue by introducing the attribute "VARCHAR ArchiveDirectoryPath" within the Data Connector operator, so that the respective 'FILE_20190314.txt' and 'FILE_20190315.txt' files of the first process are moved to this Archive Directory Path and the new 'FILE_20190316.txt' and 'FILE_20190317.txt' files can be loaded appropriately in the second process.
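For illustration, the only change to the FILE_READER operator looks roughly like this, where @DIR_ARCHIVE is just a placeholder for a new job variable pointing to a separate archive folder:

  VARCHAR DirectoryPath = @DIR_SOURCE,
  VARCHAR FileName = @FILE_SOURCE,
  VARCHAR ArchiveDirectoryPath = @DIR_ARCHIVE,  /* each loaded file is moved here when the job completes */
  ...

With this attribute in place, the files from the first run are no longer present in the source directory when the second run starts, so the wildcard only picks up the new files.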
Thank you very much again.