I am an application developer (not a DBA), and we have a large OLAP Teradata database that we are writing a multi-tier application against. I am brand new to Teradata.
Our Teradata data warehouse schema seems overly complicated. All views must go into one database, all procs into another, all macros into a third, and all actual data goes in a fourth. This is per environment, so these four databases must exist for the typical dev, test, qa, and production instances of the application. The names of these databases are determined by the DBAs and so far follow no automatically recognizable pattern, so it cannot be assumed that in the Dev environment, our app's procs should SELECT against, say, the OURAPPNAME_DATA_DEV database. There are also the more typical security rules, such as SELECTS are only granted to views (not tables) and insert/updates can only occur via procs or macros. This means database name hardcoding must occur when promoting our application from one domain to the next, which kills automated deployment and adds risk, or a lookup into an environment table must occur each time a proc or view is run so it can "discover" the name of the data database, which is very expensive.
Is "all views in a views DB, all procs in a procs DB, all data in a data DB, and DB names follow no pattern" really a Teradata best practice? If it is, please explain the why the trade-off leans towards the DBA's routine tasks (easier backup & recovery?) vs. the overhead this creates for multi-tiered applications (not just adhoc queryers).
Maybe I should be asking what the Best Practice is for me to safely account for all this.
Thanks in advance.
"DB names follow no pattern" is not a Teradata best practice.
Please work with your DBAs to design a naming convention for databases that makes sense for your application, and that makes it easy for your application to be promoted through your dev, test, qa, and production instances.
Some DBAs insist on a prefix or suffix for database names that clearly identify the instance -- dev versus prod. That kind of naming convention is intended to prevent catastrophic accidents in production. The assumption is that when a DBA types the "prod" prefix or suffix, their brain will engage and realize that it's a "big deal".
Ideally, your database names would be the same across your dev, test, qa, and production instances, except for the "_dev", "_test", "_qa", or "_prod" suffix. At startup time, your application would obtain and cache the instance-specific suffix. That should be a one-time operation at application startup, and stored in a global variable or JVM system property, or something equivalent.
There are several different ways that the instance-specific suffix could be provided to your application. A few possibilities are: a JVM command-line option -D to set a JVM system property, or a properties file that is loaded once at startup, or a query that is executed once and the value stored in a global variable.
Thank you; I completely agree that your solution is ideal. :-)
Is there a way to create a generically named link or alias to a database? I see "DB Link" being talked about but haven't found out exactly what that does yet. If we can have consistently named aliases created for the database names on the same server instance (effectively reproducing your _dev, _test, _prod example with links that point to various actual databases), that would work wonderfully.
>>> Is there a way to create a generically named link or alias to a database?
No, the Teradata Database does not offer that feature.
>>> I see "DB Link" being talked about but haven't found out exactly what that does yet.
When people use the term "DBLink", they typically mean a feature similar to Oracle's database link, that permits one Oracle database to connect to another Oracle database. http://www.orafaq.com/wiki/Database_link
Teradata offers an equivalent feature called QueryGrid, which permits a Teradata Database to connect to other systems, such as Aster, Hadoop, or another Teradata system. QueryGrid does not provide the kind of functionality that you are asking about.
OK, again thank you. The best solution we have so far is to wrap all our "CREATE VIEW" DDL in stored procedures, and to pass the various required database names in as arguments depending on environment. This is not ideal, but it should work. If there is a way for a view to dynamically "know" or otherwise accept the name of another database at runtime, please let me know.
I'd like to reiterate a part of your question:
Is "all views in a views DB, all procs in a procs DB, all data in a data DB, and DB names follow no pattern" really a Teradata best practice?
If it is, what are the trade-offs vs. the overhead this creates (extra name spaces, extra permission complications).
I'm interested in how many environments follow that practice.
I can tell you another downside. A mistake was made in database creation whereas not enough space was allocated for our VIEWS database, resulting in our view creation script failing. This tells me our DBAs are watching disk space very closely. However, since I must assume every additional database comes with some amount of disk overhead, I'm not sure the extra monitoring saves any space.
IMHO Tom Nolan already answered the "DB names follow no pattern" part, there are naming conventions, of course.
It is also common practice to seperate tables from user access, i.e. tables are within a database and views/macros/SPs in another (but usually only one). Only load users get write access on the tables (on database level) while end users get read access via the view-db.
A View database needs zero perm space :)