Updating large table without insert/delete

Database
Enthusiast

Updating large table without insert/delete

Hi

 

I am performing an update on a table with can have 100 millions of rows. Currently, there are only 26 million rows in the table and i am updating 1 lac rows at a time in a while loop using the primary key values (which are sequential) and it is approximately taking 7 hours which is very slow as compared to SQL or Oracle. If i update the whole table together, it gets completed in 3.5 hours but updating the whole table might not solve my problem as i have to show the progress of the query simultaneously.

 

Can any one please guide what can be done to improve the performance.

 

 

8 REPLIES
Teradata Employee

Re: Updating large table without insert/delete

Utilizing single row operations to do large numbers of changes defeats all of the parallelism and power of Teradata. 

 

The right strategy for a large volume of updates is: load the update data into a work table using the Teradata load utilities (eg TPT Load operator), perform join update as a single statement. 

 

This strategy will exploit the power of parallel set based operations in Teradata.

Junior Contributor

Re: Updating large table without insert/delete

Can you show the DDL of the base table and the actual UPDATE loop?

 

Enthusiast

Re: Updating large table without insert/delete

Hi

 

the loop is something like this - 

set v_PageSize = 100000;
set v_RowsSelected = 100000;
while (v_TotalRows >= v_PageNumber) DO
begin
if v_PageNumber <> v_TotalRows then
Select emp_id into v_LastRow From Employee20Lac_RID Where ID = v_PageNumber;
else
SET v_LastRow=null;
end if;

if v_FirstRow IS NULL then
begin
update T FROM nm_target2.Employee20Lac AS T
SET
Name = some_function;

set v_RowsSelected = ACTIVITY_COUNT;

end;
elseif v_LastRow IS NULL then
begin
update T FROM Employee20Lac AS T
SET
Name = some_function
where emp_id > v_FirstRow;

set v_RowsSelected = ACTIVITY_COUNT;

end;
else
begin
update T FROM Employee20Lac AS T
SET
Name = some_function
where T.emp_id > v_FirstRow and T.emp_id <= v_LastRow;

set v_RowsSelected = ACTIVITY_COUNT;

end;
end if;

set v_FirstRow = v_LastRow;
set v_PageNumber = v_PageNumber + 1;
end;
end while;

Junior Contributor

Re: Updating large table without insert/delete

If i update the whole table together, it gets completed in 3.5 hours

 

Does this means a single UPDATE statement? 

You didn't show the target table indexes, if you update a Primary Index/Partitioning column it will result in bad performance, but I can't imagine a runtime of 3.5 hours (what's your system specs, number of AMPs, etc.).

 

 

I don't get why you think you must run the update in batches of 100000 (remember, this is not Oracle or SQL Server).

Your code actually requires a Full Table Scan for each loop (there's no sequential Primary Key in Teradata as it's a hash based Primary Index).

 

Re: Updating large table without insert/delete

Yeah, its updating the table row by row, which is not taking the advatage of Teradata parllalizam, apart from that  

 

Couple of things you need to look at...

1: Check if the table is SET, if it is SET table, for each duplicate PI value, teradata does the duplicate row check (Since you have PK, SET does not impact your update)

2: Do you have any additional indexes on target table like Seconday Indexes/Join Indexes, updates and Insert takes nx times (n is the number of indexes) 

3: I am wondering if you could implement the same logic using Teradata MERGE INTO (I am not sure how well it works in your condition) but this would speed up your update... 

 

 

Teradata Employee

Re: Updating large table without insert/delete

Updating in this manner is suicidal.  (See http://developer.teradata.com/blog/geoc/2011/04/set-processing)

 

It looks like you should just be doing a simple set SQL statement:

     update Employee20Lac SET Name = some_function

although I strongly suspect that "some_function" is something that reads another table.  In that case you need to turn this into an update-join.  See my other posts at http://developer.teradata.com/blog/georgecoleman for ideas on how to do this.  Also, vishnukilari makes some good points, especially about using Merge instead of Update - might be even faster.

 

 

Enthusiast

Re: Updating large table without insert/delete

Hi,

 

Thanks.

Is it possible by any way that i can get the rows updated simultaneously while updating the whole table together, as otherwise it would keep the user waiting for hours in case of 100 millions of rows. So any way i can show the progress to the user on UI

Junior Contributor

Re: Updating large table without insert/delete

You still didn't provide more details on DDL.

 

- DDL

- Explain of the plain Update

- What's this some_function?

 

 

If you actually need 3.5 hours for a simple update to 26,000,000 rows (i.e. 2000 rows/second) there's something wrong, this should be running in a few minutes.