Hi, Are there any know issues in using the utilities (fload, mload) to load data directly to base tables in Teradata? (instead of loading it to stage and then using BTEQ to load to base tables).
We are using Informatica to load data to Teradata. We use Loader object and load data using Fastload and Multiload. In all our loads we extract the data from MSSQL databases and load the data to a staging area in Teradata environment. once the data is loaded in the staging area we then use Teradata BTEQ scripts (out side informatica) to load the data into the actual production base tables. There are quite a few cases where we can load data into production Base tables directly from MSSQL using Informatica loader object (FASTLOAD, MULTILOAD)but we are afraid if that would cause any issues (Locking the tables, and the only way to get the table is to drop the table and recreate it). I'd appreciate if some one is following this approach, and if they ran into any issues.
Mload locks the table with the "write" lock during the APPLY phase. You can still access the table during this time if you are using "access" locking, but you will be seeing data that is "in flight" (i.e. not yet committed). You will be seeing the data before or after it has been changed at a data block level. Also, secondary indices are updated after all of the table updates are complete, so a plan that uses a secondary index could get a different answer than one that does not.
Fastload must start with an empty table and you will not be able to access the table until it is complete. If you are using Fastload to refresh the table and users need to access the table on a continuous basis, it's best to Fastload to a new version of the table and then switch your Production view to point to the new version once the Fastload is complete. You can alternate between two versions of the table each time you run your table refresh and alternate the Production view between the two.
Tpump was not mentioned in your note, but this would be another option. Tpump uses row-hash level "write" locks, so some users may be able to query the table using "read" locks (if they are looking up specific rows using the primary index). However, if they are not using primary index reads, they will have a tough time getting a "read" lock and will likely cause contention issues with the Tpump job itself. So, here again, they'd have to use "access" locking to not cause issues. If you use "access" locking while Tpump is running, you will still see data that is "in flight", but the secondary indexes are kept in sync along the way (unlike Mload), so the queries may have more consistent results. Of course, Tpump is not as efficient as Mload, especially as update volumes grow, so there is a trade-off.
Barry Thanks for the information. one quick question. I remember in the past (v2R3), there were some incidents where the MLOAD process failed while loading the data and there was no way to release the locks and the only way we could use the table was to drop the table, recreate it.( or get the backup copy and MLOAD again). Did you ever come across such incidents? and what would you recommend to avoid such incidents? Thanks
If you're referring to data corruption type issues, we haven't experienced anything like that for a long time. In any case, if you have something like that, usually NCR can resolve it if you get them involved.
I can't think of another case where you'd have to drop the table and recreate it.