Data Warehouse Loading


I am new to Teradata. We have to create a TPump or Multiload script to load data into a data warehouse.
The plan is to have the source system provide comma- or pipe-delimited files containing records that need to be loaded into DW tables. The source system will create a new file every 2 minutes or so and drop it into a designated directory. These files will be created with a timestamp as part of the filename. The purpose of the script would be to load the data from each of these files into the DW. We have determined the format of the input records and have created a TPump script. The problem is how to run the script on a schedule, and how to format the .IMPORT statement to read a different file each time the script runs. Which utility should be used, TPump or MLoad? We are running on WIN32.

Appreciate any help.


Re: Data Warehouse Loading

If the row count is small (which I presume would be the case since the feed is generated every couple of minutes), then TPump is the best option. As far as filenames are concerned, this is normally achieved by having a wrapper script around the utility.

For example, on Unix you could use the shell to determine the filename, assign it to a variable, and then substitute that variable into the utility script via the shell's "here document" (<<-END) feature ...

i.e. something like this ...


utility <<-END
.IMPORT INFILE ${myfile}
...
END
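Putting that together, a minimal wrapper might look like the sketch below. Everything specific here is an assumption to adapt (the directory paths, the mydb.mytable target, the two-column comma-delimited VARTEXT layout, the logon string); the point is only that the wrapper picks the filename and substitutes it into the generated TPump script before invoking tpump.

```shell
#!/bin/sh
# Hypothetical wrapper -- directory names, table names, and logon details
# are all placeholders; substitute your own.
INDIR=${INDIR:-/data/incoming}       # where the source system drops files
DONEDIR=${DONEDIR:-/data/processed}  # where loaded files are archived

# Emit a TPump script for one input file, passed as $1.
build_script() {
cat <<END
.LOGTABLE mydb.tpump_log;
.LOGON tdpid/user,password;
.BEGIN LOAD SESSIONS 4;
.LAYOUT mylayout;
.FIELD col1 * VARCHAR(20);
.FIELD col2 * VARCHAR(20);
.DML LABEL insdml;
INSERT INTO mydb.mytable (col1, col2) VALUES (:col1, :col2);
.IMPORT INFILE $1 FORMAT VARTEXT ',' LAYOUT mylayout APPLY insdml;
.END LOAD;
.LOGOFF;
END
}

# Load every file currently waiting (timestamped names sort oldest first),
# then move each one aside so the next run does not pick it up again.
for myfile in "$INDIR"/*.txt; do
    [ -e "$myfile" ] || continue     # glob matched nothing: no files waiting
    build_script "$myfile" | tpump   # tpump reads the generated script on stdin
    mv "$myfile" "$DONEDIR"/
done
```

Run every couple of minutes (cron on Unix), each invocation loads whatever files have arrived since the last one, and the mv at the end keeps a file from being loaded twice. On WIN32 the same approach should work from a .bat file driven by the AT command or the Task Scheduler, with the here document replaced by writing the generated script to a temporary file that is passed to tpump.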