Alter table on huge table

Database
Enthusiast

Alter table on huge table

There is a requirement to add three NULLable columns at the end of existing table which is of volume about 1TB.

I would like to know what is the best approach to perform this activity if there is a time window of 3-4 hrs to add the column to the table?

I read the below article:

http://www.teradatamagazine.com/table_dressing.aspx

It says..

"The ALTER TABLE process can be very quick if a column is added that is defined as NULL, has available presence bits and is not back-filled with data. In this case, ALTER TABLE is strictly limited to a table header change. In addition, data blocks are not rewritten."

Under certain circumstances (?) , however, this practice can cause every row in the table to be rewritten, making it a potentially significant database operation. A rewrite can take hours on a large table, even with the highly optimized, block-oriented method used by the Teradata Database. In addition, when an ALTER TABLE request is aborted, a rollback will not occur and the operation will complete.

Couple of questions arises in my mind:

1. Under what circumstances every row in the table to be rewritten? To my understanding is  that if we add NULLABLE column then this would not happen. correct me if I am wrong here.

2. What happen if during the process of ddl (alter table..) it takes longer time then expected?  Does failure causes any data corruption?

3. What happens internally if we do "alter table ..." ? It would be nice if you can provide document link to understand the concept on Teradata Internals to understand the behaviour. To my understanding, if we do alter , the data dictionary is updated internally and mechanism is quite faster as it knows that we are just adding a NULLable column and hence does not update every row after adding the column.

Thanks!

6 REPLIES
Enthusiast

Re: Alter table on huge table

Hi,

Whenever we have such a big table to alter, we dont use ALTER stmnt. We just drop and recreate it. Since the PI of source and target is same, it will be a fast path insert and would be fast. I 1 TB table should not take more than 40 min, i believe. If the table is skewed, the insert might take a while, but since you have 3-4 hrs window, you should be good.

--samir singh

Enthusiast

Re: Alter table on huge table

Thanks Samir for your opinion on this.

Couple of questions arises in my mind as I have already mentioned in my previous post:

1. Under what circumstances every row in the table to be rewritten? To my understanding is  that if we add NULLABLE column then this would not happen. correct me if I am wrong here.

2. What happen if during the process of ddl (alter table..) it takes longer time then expected?  Does failure causes any data corruption?

3. What happens internally if we do "alter table ..." ? It would be nice if you can provide document link to understand the concept on Teradata Internals to understand the behaviour. To my understanding, if we do alter , the data dictionary is updated internally and mechanism is quite faster as it knows that we are just adding a NULLable column and hence does not update every row after adding the column.

Can you please help to clarify my doubt on this? Thanks!

Enthusiast

Re: Alter table on huge table

Hi,

I never delved much deeper in Alter, as in our installation, we generally use drop recreate rather than alter.

--samir

Teradata Employee

Re: Alter table on huge table

I assume that @samir means that they usually do:

INSERT SELECT old table to new table

Rename old table to save table

rename new table to old table

drop save table when satisfied that result is good.

This method is used by many customers because it keeps the old table until the new table is verified and allows the process to be aborted if absolutely necesary.

Unfortunately it is difficult to know if the ALTER TABLE can add the nullable columns quickly or not. To figure that out one needs to figure out if there are enough null indicator bits available to represent the new columns without adding indicator bytes. If another indicator byte is neccessary, then the rows needs to be re-structured which requires a pass through the table. It may be possible to prototype it by making an exact copy of the table and putting a smaller subset of the data in it, then perform the ALTER on the test table to see if it has to go through the rows or not.

How does ALTER work:

- Read a whole file allocation unit ("cylinder") from the table into memory

- Build a new cylinder in memory by copying each row from the old cylinder, reformatting it as necessary with the ALTER changes and placing it into the new cylinder

- Write the new cylinder to disk.

- When verified to be written, delete the old cylinder and record the position in the table that has been completed.

- Repeat above steps until the whole table has been processed.

- Of course this process is executed on all AMPs simultaneously so that the full paralleism of the system is utilized to get the job done.

- If any kind of failure occurs (eg if the system restarts), the recorded position is retrieved and the process restarts from that point and runs to completion.

The advantages of ALTER are:

- Operating a cylinder at a time means that a minimum of additonal space is required to reformat the table. Using the INSERT SELECT method requires space for two times the size of the table and a spool file the size of the table as well.

- Keeping track of the point of completion and having the restart logic means that any interruption does not require re-doing the whole process.

- cylinder by cylinder operation is more efficient than spooling the table and then writing it to the new table.

Disadvantages:

- The process cannot be stopped once it is started if for instance it is running longer than expected and is blocking other work. The INSERT SELECT can be halted any time (but needs to be redone from scratch if the INSERT didn't complete).

- ALTER places an Exclusive lock on the table for the entire period of its execution. INSERT SELECT method can place a read lock allowing other readers to continue to use the old table during the process.

Senior Apprentice

Re: Alter table on huge table

Comparing pros & cons of Alter, Merge, Insert/Select:  

Why don't you use ALTER TABLE to alter a table?

Fan

Re: Alter table on huge table

Hi Dnoeth.   Is there any way to determine or estimate when the Tables Locks on DBC will occur?  What will be the impact when the do occur?

 

We are considering ALTER on a 20+ TB table to since we don't need to rebuild the AJI's or SI's this time.  

Do you have any thoughts on this?

Thanks. Steve