I have a table where I am storing user SQL queries, and a process that reads these queries and submits them to Teradata through BTEQ export, one after the other. The output from each SQL query is loaded to a Teradata table. I faced a problem with one of the SQLs, which extracted a huge amount of data to a file; it used up all the space on the disk and caused all the other processes to fail. I want to know how much data my SQL query is going to return, and then decide between executing it or notifying the users. Each of these SQLs returns a different number of columns as of now. Any suggestion I can use in BTEQ or FastExport will help.
Using BTEQ you can find out how many rows are in a result set by using the ACTIVITYCOUNT variable. However, I think you only get this AFTER the result rows have been returned, so it's probably not going to help you.
I don't think you can get this sort of control using BTEQ. If you wrote your export program in a programming language (C, Java, VB, etc.) then your program should get control information (including the row count) after the result set is built but before any data is exported.
Having said that, if the results of the user queries need to go into a table, why not add a 'CREATE TABLE <name> AS' (check the full syntax) on the front of the user queries and do it that way? You'll avoid the need to export data to a file (which is what you've described above).
You may still have a problem with disk space, though, namely filling up the database space.
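To illustrate the idea, here is a minimal sketch of wrapping a user query in Teradata's CREATE TABLE AS form. The table and column names are hypothetical placeholders, not from the original post:

```sql
-- Wrap the user's query so the result lands in a table instead of a file.
-- target_tbl and the inner SELECT are placeholders for illustration.
CREATE TABLE target_tbl AS
( SELECT order_id, order_dt FROM src_orders )  -- the user's original query
WITH DATA
PRIMARY INDEX (order_id);
```

WITH DATA tells Teradata to populate the new table from the SELECT rather than just copying the structure.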
One possibility might be to use BTEQ's .RetLimit and .RetCancel commands. If the query result rows are going to a file, you can precede the query with:
.RETLIMIT 1000000000 /* Max number of rows to return. One billion rows is probably too many */
The query will stop returning rows after the RETLIMIT. You just need to pick a number that works with your environment. Then you can use this for all the queries.
Thank you for your response. I am looking into the possibility of creating a table as you mentioned. Some of the data doesn't have column names as of now; they are hardcoded values in the SQL. That is preventing me from creating a table or inserting directly into the target. I need to see if the users can agree to a change.
Thank you for your response.
I am considering this solution as well. But there is a danger that we may hit the limit and the target never gets all of the data. If we pick a very large number, that may cause the space availability issue with the disk again. I will consider and discuss this option with my team.
My understanding is that you want to either execute the export or notify users that there would be too many rows. Using BTEQ to do this won't give you the most elegant or performant solution. But if you really don't mind having the DBS execute the SELECT twice, as well as using an export, and still want to try out the RETLIMIT approach, here's an example of how it might be done using any currently-supported version of BTEQ ...
.RETLIMIT 1
SELECT * FROM foo;
.IF ACTIVITYCOUNT > 100 THEN .GOTO TooManyRows
.RETLIMIT *
.EXPORT FILE=foo.exp
SELECT * FROM foo;
.EXPORT RESET
.GOTO nextstep
.LABEL TooManyRows
.REMARK 'Too many rows.'
.OS NotifyUsers.sh
.LABEL nextstep
And starting with BTEQ 16.20.00.04 you can opt to use an IF/ELSE construct rather than GOTOs...
.RETLIMIT 1
SELECT * FROM foo;
.IF ACTIVITYCOUNT <= 100
  .EXPORT FILE=foo.exp
  .RETLIMIT *
  SELECT * FROM foo;
  .EXPORT RESET
.ELSE
  .REMARK 'Too many rows.'
  .OS NotifyUsers.sh
.ENDIF
Re creating the table: how is the table created when it is loaded from the external copy? How are the columns named then? The same names can be specified in the CREATE TABLE AS column list without changing the query itself. Or do the CREATE TABLE for the target table once, then turn the query into an INSERT SELECT.
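A short sketch of both options, with hypothetical table and column names (note the hardcoded literal in the SELECT, which is the situation described above):

```sql
-- Option 1: CREATE TABLE AS with an explicit column list;
-- the user's SELECT is unchanged even where it returns literals.
CREATE TABLE tgt (cust_id, src_code, bal) AS
( SELECT cust_id, 'HARDCODED', bal FROM src_acct )
WITH DATA;

-- Option 2: create the target table once up front, then
-- prefix the user's query to turn it into an INSERT SELECT.
INSERT INTO tgt
SELECT cust_id, 'HARDCODED', bal FROM src_acct;
```

Either way the column names come from the target side, so unnamed expressions in the user SQL are not a blocker.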
The CREATE TABLE AS/INSERT SELECT method will be drastically more efficient and faster than exporting and reloading.
As a (distant) second choice, TPT allows one to define an Export operator as the source and a Load operator as the target for a single job. When this is done, the data does not have to land on disk outside the database, it streams directly through from the export to the load. This will avoid your external disk size issue completely. It is still significantly less efficient than just doing CREATE TABLE AS/INSERT SELECT.
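As a rough sketch of that TPT approach, using TPT's standard operator templates ($EXPORT and $LOAD); the job name is a placeholder, and a real job would also need the usual job-variable file (TdpId, credentials, SelectStmt, TargetTable, and so on):

```
DEFINE JOB export_to_load
DESCRIPTION 'Stream a SELECT result straight into a Load target'
(
  /* $EXPORT runs the SELECT; $LOAD writes to the target table.
     Rows stream between the operators without landing on disk
     outside the database. */
  APPLY $INSERT TO OPERATOR ($LOAD)
  SELECT * FROM OPERATOR ($EXPORT);
);
```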
We are not creating a table from the extracted data. We are concerned with only the first few columns of the data, and those are fixed. We have our target table defined per these fixed columns, and we are using DataStage to load the data; DataStage helps us ignore the varying number of unwanted columns in the data. I have to see if my users will be flexible enough to submit the SQLs with a fixed list of columns for a design change with CREATE/INSERT. Thank you so much for your input.
One more thought on the .RETLIMIT
Add the limit as a parameter in your SQL (or in the table where you store the SQL).
All queries get a fairly low default value.
For those queries that need a higher limit, the user can request this along with the request for additional space.
It prevents surprises, as you have experienced.
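One way to sketch this, with a hypothetical column and table name: store a per-query limit alongside each stored SQL, defaulted deliberately low, and have the driver process read it and emit it as a .RETLIMIT line ahead of the query in the generated BTEQ script:

```sql
-- Hypothetical limit column on the table that holds the user SQL;
-- the low default forces users to ask for a higher limit explicitly.
ALTER TABLE user_queries ADD max_rows INTEGER DEFAULT 100000 NOT NULL;
```

The driver would then write `.RETLIMIT 100000` (or whatever value is stored for that query) into the script before the user's SELECT.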