I have heard that Teradata does not like dropping and re-creating tables in production processes. In cases where a volatile table is not possible due to multiple sessions, is it preferred to leave an empty table structure in place, or to drop and re-create it? Please provide detail as to why one is better than the other if possible.
Also, the data in the table is too large and would use up too much spool space for a volatile table.
My thought on this: it is purely driven by the business requirement. Creating and dropping tables may be warranted because of statistics collection, changes in data demographics, etc. If it is necessary to leave an empty table to be globally shared across sessions, then I feel that is OK.
However, when different batches perform DML operations on the table, we also need to take care of locks/deadlocks :)
DROP and CREATE TABLE are DDL statements, and they require Data Dictionary intervention (inserting many rows into the corresponding dictionary tables).
So it is better to write a BTEQ script that checks for the existence of the table: if the table exists, don't create it, just start loading; if it does not exist, create it first (you can use the ACTIVITYCOUNT option for this).
After all the processing, DELETE the rows from the table at the end; don't DROP it.
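A minimal BTEQ sketch of this pattern (the tdpid, user, database, table, and column names are all placeholders for illustration):

```sql
.LOGON tdpid/load_user,password

/* Check whether the staging table already exists */
SELECT 1
FROM DBC.TablesV
WHERE DatabaseName = 'StageDB'
  AND TableName = 'Stage_Table';

/* ACTIVITYCOUNT > 0 means the table is already there: skip the CREATE */
.IF ACTIVITYCOUNT > 0 THEN .GOTO LoadData

CREATE TABLE StageDB.Stage_Table
( Col1 INTEGER
, Col2 INTEGER
) PRIMARY INDEX (Col1);

.LABEL LoadData
/* Load step: source table/columns are placeholders */
INSERT INTO StageDB.Stage_Table
SELECT src.Col1, src.Col2
FROM SourceDB.Source_Table AS src;

/* ... downstream processing ... */

/* Empty the table at the end instead of dropping it */
DELETE StageDB.Stage_Table ALL;

.LOGOFF
```

This way the dictionary rows for the table are written once, on the first run, and every later run only touches the data rows.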
Please expand on the impact of Data Dictionary intervention, as I believe this is what I am concerned with. BTEQ is not an option, and neither is a volatile table.
The question is: "Is there a reason I should leave the table structure in place, or can it be dropped and re-created each time the process runs?" If there is impact beyond the CPU used to write to the Data Dictionary tables, that is what I am trying to understand. Also, is that impact reflected in the total CPU cost of creating and dropping the table, or is it something I am not able to see?
DBC is generally very busy with all of Teradata's important internal activities, so it is better to touch those tables as little as possible.
You can see the EXPLAIN plan of a CREATE TABLE below. Observe that one row is inserted into DBC.TVFields for each column, plus corresponding access-rights rows, one for each type of access. All of this has to be inserted when the table is created and deleted whenever the table is dropped.
EXPLAIN CREATE TABLE TESTTABLE (COL1 INTEGER, COL3 INTEGER) PRIMARY INDEX (COL1);

We execute the following steps in parallel.
  1) We do a single-AMP ABORT test from DBC.DBase by way of the
     unique primary index.
  2) We do a single-AMP ABORT test from DBC.TVM by way of the
     unique primary index.
  3) We do an INSERT into DBC.Indexes (no lock required).
  4) We do an INSERT into DBC.TVFields (no lock required).
  5) We do an INSERT into DBC.TVFields (no lock required).
  6) We do an INSERT into DBC.TVM (no lock required).
  7) We INSERT default rights to
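As for whether the cost is visible: if DBQL query logging is enabled on your system, you can look up your own CREATE/DROP statements in the query log and see the CPU and I/O they were charged. A sketch, assuming the standard DBQL view and columns are available to you:

```sql
/* CPU and I/O charged to recent CREATE/DROP TABLE statements,
   as recorded by DBQL (requires query logging to be enabled) */
SELECT StartTime
     , AMPCPUTime
     , TotalIOCount
     , QueryText
FROM DBC.QryLogV
WHERE QueryText LIKE 'CREATE TABLE%'
   OR QueryText LIKE 'DROP TABLE%'
ORDER BY StartTime DESC;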
Here are some reasons why you might want to reuse an existing table for a load job:
1. Creating a table, and dropping a table, are both relatively expensive operations, in terms of time, I/O, and lock contention on Data Dictionary tables. Reusing an existing table would avoid that overhead.
2. Your organization may want to separate authorization/permissions between userids that are permitted to create/drop tables, versus userids that are permitted to load data, and further control which userids are permitted to load data into which tables. You would create the table once, and grant the relevant permissions to the userids that need them. You wouldn't want to create/drop the table with each load, because the permissions would be lost.
3. As Raja said, if you use an existing table, you can keep statistics that were collected for that table. That may or may not be appropriate. If your load job loads data into the table that is very similar to the prior data (perhaps only a few changed rows), then keeping the existing statistics might be appropriate. On the other hand, if the newly-loaded data is quite different from the prior data, then the old statistics won't be useful.
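The permission split described in point 2 might look like this; the database, table, column, and user names here are made up for illustration:

```sql
/* One-time setup, run by a userid allowed to own the DDL */
CREATE TABLE StageDB.Sales_Stage
( Sale_Id INTEGER
, Amount  DECIMAL(12,2)
) PRIMARY INDEX (Sale_Id);

/* The load userid may only change the data, not the definition */
GRANT SELECT, INSERT, DELETE ON StageDB.Sales_Stage TO load_user;

/* load_user is deliberately NOT granted DROP TABLE or CREATE TABLE,
   and because the table is never dropped, these grants survive
   every load cycle */
```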
I believe sgarlapa has answered your question on the steps (inserts) required to be performed during a DROP/CREATE. That said, it's pretty much fine to do a DROP/CREATE.
DBC may be busy, and many rows may be getting inserted into or deleted from the DBC tables, but it also doesn't make sense to keep useless empty tables sitting in the production databases.
How many tables, and how many columns in them, need to be dropped/created? And what is the frequency of this DROP/CREATE?
This is a great use case for a Global Temporary Table (GTT). You avoid the repetitive hits on the DBC tables, as once the GTT is created the DDL sticks around even after the session ends, unlike a volatile table. You also get "auto-truncation", if you will: once the session ends, the data is dropped and you're left with an empty table.
Keeping the statistics around for multiple cycles will absolutely NOT help you unless you're certain that the profile of the data is identical every time you load the table, and if that's the case, I'd question whether or not you need to re-examine the process. You will find that in a lot of cases stale or bad statistics are worse than no statistics.
GTTs are your ticket here.
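A sketch of the GTT definition (names are placeholders). ON COMMIT PRESERVE ROWS keeps the rows for the life of the session; when the session ends, the rows are discarded but the definition stays in the dictionary, so no DDL is needed on the next run:

```sql
CREATE GLOBAL TEMPORARY TABLE StageDB.Sales_GTT
( Sale_Id INTEGER
, Amount  DECIMAL(12,2)
) PRIMARY INDEX (Sale_Id)
ON COMMIT PRESERVE ROWS;
```

Note that each session materializes its own private instance of the GTT, which is what makes it safe across multiple concurrent sessions. One caveat: GTT instances are held in temporary space, so make sure enough temp space is available for the data volume involved.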
A GTT could have worked, but as the original question states, the data is large enough to cause space issues.
And again, it's a matter of overall load on the system, and of the number of tables that need to be dropped/created. There are hundreds of scenarios where staging tables are dropped/created in every load and it has never caused a problem. Hence, unless the number is really huge, it should be just fine.
What will happen if a table with Block-Level Compression (BLC) is dropped and recreated? Will BLC remain enabled or not?