My company produces a data analytics tool that supports various RDBMSs, including Teradata. We are running into a performance problem with querying the metadata for tables and views, but views in particular (we are running Teradata 15).
Originally, we used JDBC's Connection.getMetadata().getColumns() to return this metadata (column name, data type, nullability, etc.). However, we found the performance of the underlying query to be quite slow when we have a data warehouse with thousands of tables and views, each with thousands of columns.
To work around this performance issue, I investigated a variety of different approaches, each faster than the original query generated by JDBC, but still not performing as well as we'd like.
Option #1: HELP TABLE/VIEW <dbname>.<tablename>;
This returns a ResultSet with all of the column metadata that we need and works quite well for tables. The one major drawback is that, for views, there is only column name information, and there is no data type information. Both HELP TABLE and HELP VIEW return the same results for views.
Option #2: HELP COLUMN <dbname>.<tablename>.*;
Unlike HELP TABLE, HELP COLUMN works for views as well as tables. The information returned is very similar to that of HELP TABLE. The only drawback is that, for tables with thousands of columns, this query slow. Similarly, HELP VIEW returns only the column names, not the data types.
Option #3: SELECT * FROM DBC.ColumnsV;
Similar to HELP TABLE, we can retrieve column metadata for tables, but not for views. For views, only column names, not column data types, are available. But querying this table is pretty fast.
Option #4: ResultSet.getMetadata() from query SELECT * FROM <dbname>.<tablename> SAMPLE 0;
Instead of getting column metadata from Connection.getMetadata().getColumns(), We can run an actual query on the table or view with no rows returned and retrieve that metadata from the ResultSet object instead. This works for both tables and views. The query can be modified to SELECT TOP 1 * FROM <dbname>.<tablename> WHERE 0=1; and performs about the same. The drawback is that this method is still somewhat slow.
Option #5: CREATE VOLATILE TABLE <vtname> AS (SELECT * FROM <dbname>.<tablename>) WITH NO DATA; HELP TABLE <vtname>;
Because HELP TABLE is very fast but works only for tables, and not views, create a VOLATILE TABLE based on the view but with no rows and run HELP TABLE on it. This method is still slow, for both the table creation and HELP TABLE.
Option #6: CREATE TABLE <dbname>.<tmpname> AS (SELECT * FROM <dbname>.<tablename>) WITH NO DATA; HELP TABLE <dbname>.<tmpname>;
Similar to the previous option, but create an actual transactional table. This method is just as slow, and has the drawback that we need permissions to create the table.
Option #7: SELECT TYPE(<col1>), TYPE(<col2>), ... FROM <dbname>.<tablename> SAMPLE 1;
This requires grabbing an actual row from the table or view. The data type is returned, but we don't know the nullability (which isn't really my concern). This method is still slow.
Comparing all the options, I have the following empirical performance metrics for a view with 2000 columns, in a database with thousands of tables.
Original JDBC Metadata: 31.7s
#1 HELP TABLE/VIEW: 0.2s (does not return view column data types)
#2 HELP COLUMN: 3.1s
#3 DBC.ColumnsV: 0.2s (does not return view column data types)
#4 ResultSet.getMetadata(): 1.2s
#5 VOLATILE TABLE: 14.5s
#6 transactional TABLE: 14.5s
#7 SELECT TYPE(): 13.8s
My question: Is there a faster way to query view column metadata? For now, I will go with Option #4: ResultSet.getMetadata(), but it is still slower than HELP TABLE and it feels like a hack to have to query SELECT * FROM <tablename> SAMPLE 0; just to obtain this metadata.
I tried looking at Teradata Studio Express (TSE) to see what it does. For some reason, TSE is able to query view column metadata quite quickly, so I tried capturing the queries that it generates. The first query it generates looks at DBC.ColumnsV, but because there is no view column metadata, it then queries SELECT * FROM <tablename>; and somehow obtains view column metadata quite quickly. Are these JDBC calls, or is there some hidden API? I tried using WireShark to sniff the packets, but I'm not able to figure this out.
Any help would be greatly appreciated. Thanks!
Solved! Go to Solution.
Wonderful! Though it's still not as fast as I'd like it to be, this solution was significantly faster than the other solutions I had tried. In my informal testing, it takes about 75% of the time that my ResultSetMetaData solution took. Thanks for the suggestion!