How to backtrack the extra records in target table

Analytics
Enthusiast

How to backtrack the extra records in target table

Hi Team,

 

I have TPT (mload) job to load data between two teradata tables belonging to two different servers.

The issue is even before triggering this job the target table (update/ insert type) got loaded with 10 million records of the latest run_id (unique number populates through a sequence generator for each instance). I have no clue from where these records have loaded.

 

After the TPT execution, the total count incresead to 40 million where the source of TPT job has only 30 million records. The log also shows  30 million records got loaded.

 

When I checked with DBA, they have confirmed none other users have accessed this base table expect for our USER ID which is loading data using TPT. Any idea how to trace how the extra records have loaded


1. The target base table has many views built on it. Is it possible that some records have changed in the dependent tables on which these views are built, and because of it new records got inserted to base table automatically?

 

2. Or should i check with DBA if there are any other jobs loading data to views. Is it possible to load data through views without accessing base table.

 

Any clues are welcome.

1 REPLY
Senior Apprentice

Re: How to backtrack the extra records in target table

Hi,

 

I think your DBA needs to look for commands that access things other than the base table - you can use views.

 

To answer your specific questions:


prasanth225 wrote:
1. The target base table has many views built on it. Is it possible that some records have changed in the dependent tables on which these views are built, and because of it new records got inserted to base table automatically?

This could happen but would only be because the application logic issued the relevant commands.


prasanth225 wrote:
2. Or should i check with DBA if there are any other jobs loading data to views. Is it possible to load data through views without accessing base table.

No, this is not possible. In Teradata views do not contain data, data is always in tables. If a process loaded data into a view (which is possible) then that data will end up in the base table.

 

How is your DBA checking for access to this table?  If it is using DBQL data (which is a common way to do it) then remember:

- DBQL will only record queries for those users/account/programs where there is a rule (so a process might have loaded 10m rows and it isn't recorded in DBQL).

- in the 'objectuse' logging you'll probably need to check for use of the views on this table. You only need to concentrate on single table views with no derived columns, aggregations etc. as those are the only ones that can be used to load data.

 

Is it possible that rather than loading (i.e. INSERT) 10m rows that an UPDATE command has set the run-id? You'll still need to track that down but that might give you something else to look for.

 

You might also want to look for processes that read/update whichever control table holds the latest run_id. That might give you a clue.

 

I'd ignore anything in your process because you said "even before triggering this job the target table (update/ insert type) got loaded with 10 million records of the latest run_id". If the problem existed before your process started then it isn't your process.

 

HTH
Dave

Ward Analytics Ltd - information in motion
www: http://www.ward-analytics.com