Best method to add new column to a table (without overwriting the table)

Database
Visitor

Best method to add new column to a table (without overwriting the table)

I understand the ALTER TABLE statement can add a column, but I don't understand how to populate it with data without overwriting the table. How do you fill the column with data, especially when that data needs to ID match on the record level?

5 REPLIES
Senior Apprentice

Re: Best method to add new column to a table (without overwriting the table)

Hi,

The ALTER TABLE statement itself will probably not do what you need. The only data value this statement can put into a new column is a 'default' value, either NULL or the one specified in the ALTER TABLE statement itself.

 

I think what you'll need is to either:

- use ALTER TABLE to add a default value and then use UPDATE to set the correct value for each row, OR

- create a new table, insert/select from old to new and populate the new column as part of the SELECT.

 

HTH

Dave

Ward Analytics Ltd - information in motion
www: http://www.ward-analytics.com
Teradata Employee

Re: Best method to add new column to a table (without overwriting the table)

And creating a new table and then insert-selecting (or merging) into it is usually the fastest method.  (Then drop the old table & rename the new one.)

Re: Best method to add new column to a table (without overwriting the table)

— Clone Newtable with data and statistics

– copy stats from old to new table Both these even for a TB table are not bad in terms of CPU.

– Alter table on the original – Conditional BTEQ if the above ran well drop the clone

— That way you do not have to deal with reassigning security again on that table.

— Original table will not be available till DDL is done But MOST TB tables are accessed through views

– re-point to the clone till work is done.

— If the clone was updated then the same update needs to synch in the Altered table. BUT really for a a huge fact table mostly used for OLAP that is a unlikely scenario but still the clone will confirm this

— Perm Space will momentarily shoot up. Understanding that this is a temp operation.

Re: Best method to add new column to a table (without overwriting the table)

Please find the diffrence between ALTER TABLE or INSERT INTO…SELECT

 

Modifying an existing table is expensive and can consume a lot of resources (especially IOs).
I

we will discuss the advantages and disadvantages of the two available primary strategies: ALTER TABLE and INSERT INTO…SELECT into a copy of the table.

 

Both strategies have their individual applications and usefulness.

  • I was creating a test scenario were the table’s data block size was 127 KB. This table contained short rows (each of them consuming only a few KB). Into each data block several hundreds of rows were packed.

        The ALTER TABLE approach required 133 logical IOs, the INSERT INTO…SELECT approach logical 322 IOs.

        The ALTER TABLE was the clear winner with 59% less logical IOs.

 

        Physical IOs showed a similar picture: While the ALTER TABLE only caused 14 physical IOs, the INSERT INTO…SELECT         needed 30 IOs.

       Interestingly, CPU usage was almost identical.

 

  • I have created a second test scenario. While keeping the data block size at 127KB, I replaced the small rows with huge rows. With each row being about 64KB in size, only 1-2 rows were fitting into each data block.

        The result was impressive:

         The ALTER TABLE approach required 539 logical IOs, the INSERT INTO…SELECT method 270 853 logical IOs. The                   ALTER TABLE needed 99.8% less logical IOs than the INSERT INTO…SELECT!

       

         The difference between physical IOs was as well impressive: ALTER TABLE needed 15 physical IOs, the INSERT INTO…           SELECT 12 992!

 

If the table has indexes defined (USI, NUSI, Join Index, Hash Index), the difference between the two strategies becomes even bigger, as ALTER TABLE don’t have to create indexes for the copy of the table.

 

  • Another argument in favor of the ALTER TABLE approach is that no spool is required. INSERT INTO…SELECT needs to spool the table.

And you are no doubt wondering why you should not always use the ALTER TABLE approach after seeing these results. The answer is: ALTER TABLE comes with some limitation:

 

  1. ALTER TABLE can’t be aborted.You can’t stop it as soon as it’s started.This may lead to unexpected situations, as the following example shows: Let’s assume your workload management defines a rule which kills all sessions consuming more than 100.000 CPU seconds. What happens if your ALTER TABLE statement hits this limit?The workload management will of course send the signal to kill your session, but as no abort is possible the session will finish the ALTER TABLE activity and fail afterwards. Definitely not the workload management inventor’s intention!
  2. ALTER TABLE exclusively locks the table as long as the ALTER TABLE is active, nobody else can access the table. On the other hand, the INSERT INTO..SELECT approach only uses a read lock.
  3. ALTER TABLE takes place on the original table. This might be considered as a safety problem

Teradata ALTER TABLE or INSERT INTO…SELECT

:
Most of the ALTER TABLE’s disadvantages are related to data accessibility during the operation and in case of an error.
If you can live with these restrictions, you definitely should implement the ALTER TABLE approach.


It requires less IOs, doesn’t need spool and it’s easier to use (it’s just one statement; for example, you don’t have to create a copy of the table, run the insert into…select, recreate indexes and statistics)

Highlighted
Junior Contributor

Re: Best method to add new column to a table (without overwriting the table)