I need to load data in near real time, every 15 minutes. The data arrives in flat files, which will not be large: each file contains at most 10 to 15 records. There may be more than one file to load in a given interval. The data needs to be highly available, with low latency at the consuming end. Data validations also need to be in place.
I am leaning towards the TPT Stream operator. However, for data quality checks and future operational support, I need to track file metadata so that I can identify the source of each record. The source system provides a file timestamp that uniquely identifies each file, and that timestamp is part of the file name. Capturing the file timestamp and loading it into my table seems easy when there is only one file. But when there is more than one file and I use a wildcard in the file name to stream the data into the target tables, is there a way to capture the file metadata into a variable and use it in my streaming jobs? I appreciate your input.
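
To make the requirement concrete, here is a rough Python sketch of the tagging I am after, done as a pre-processing step outside TPT. It is only an illustration, not a TPT feature: the file naming convention (`<prefix>_<YYYYMMDDHHMMSS>.txt`), the pipe delimiter, and the function names `extract_file_timestamp` and `tag_records` are all my own assumptions for the example.

```python
import glob
import os
import re

# Assumed (hypothetical) naming convention: <prefix>_<YYYYMMDDHHMMSS>.txt
TIMESTAMP_RE = re.compile(r"_(\d{14})\.txt$")


def extract_file_timestamp(path):
    """Return the 14-digit timestamp embedded in the file name, or None."""
    match = TIMESTAMP_RE.search(os.path.basename(path))
    return match.group(1) if match else None


def tag_records(directory, pattern="*.txt", delimiter="|"):
    """Yield every record from each matching file with its file timestamp appended."""
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        stamp = extract_file_timestamp(path)
        if stamp is None:
            continue  # skip files that do not follow the assumed convention
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                record = line.rstrip("\r\n")
                if record:  # ignore blank lines
                    yield f"{record}{delimiter}{stamp}"
```

Each yielded line carries the timestamp of the file it came from as an extra trailing column, so the source file could be identified in the target table. Ideally I would like the equivalent behaviour inside the streaming job itself rather than in a separate step like this.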