Use queue table to manage asynchronous process

General
N/A

Use queue table to manage asynchronous process

Earlier last year, there was a post on this forum asking loop function in BTEQ, this post will also show a trick how to "mimic" a loop using BTEQ.

Recently, I encountered a particular issue in our environment. A process using parallel BTEQ scripts to process data and target to the same destination. Generally, there will be multiple insert/update/delete in each script targeting at one main target table. At first glance, there should be no issue, since RDBMS manage locks and will leave the session in "blocked" state and serve the request FIFO.

However, it will bring challenge for Workload management. This particular job falls into the main batch load work group, which has a SQL limit 14 during the batch window. If this application want to run 6 parallel bteqs, then at certain point, there could be 5 sessions in block state, while 1 is in active. For TDWM, block is equal to active, then it will leave 8 slots open for all other batch processes. From throughput perspective, this is a waste. Moreover, the target table is the largest table on the system (that's why use multiple bteq to process data to increase throughput), the insert/update/delete steps are usually 10-20 minutes each. Leaving the TDWM slots occupied by blocked session is NOT fair to other processes. One might say: allocate this job into its own workload. Well, it can quickly leads to too many WD to manage.

Therefore, a better solution is needed to give developer to control the process at a better granularity.

To design this process, I am trying to keep things simple and reusable. (dealing with existing production job, you do not want to make dramatic changes).

Idea is simple: whenever the job needs to do insert/update/delete, insert a row into a work queue table first, then check whether the top row is the one that inserted by the session. If not, then wait for a while and check agian, otherwise remove the row and move on to the next step.

Teradata offers queue table naturally fit this requirement. However, BTEQ only allows GOTO a label by moving forward in the script. This present some challenge which requires some trick in the coding. It should be fairly easy using perl/java/c#, but I am trying to deal with existing batch job, cannot introduce dramatic changes or complex new components. Here is the solution.

1. Create a queue table as following:

CREATE TABLE WORKQUEUE1, QUEUE (
QITS TIMESTAMP(6) DEFAULT CURRENT_TIMESTAMP(6) NOT NULL,
SESSID INTEGER,
COMMENT VARCHAR(30),
CHECK_COUNTER SMALLINT DEFAULT 0
) PRIMARY INDEX (SSID);

2. Main logic. File name wait.bteq
.REMARK "Sanity check.. does my session exist in the queue?"
SELECT * FROM WORKQUEUE1 WHERE SESSID = SESSION;
.IF ACTIVITYCOUNT = 0 THEN .GOTO NOSESSIONERROR

.REMARK "Sanity check is done. move to check top row"
LOCK WORKQUEUE1 FOR READ
SELECT SESSID
FROM ( SELECT TOP 1 * FROM WORKQUEUE1 ORDER BY QITS ASC)
WHERE SESSID = SESSION;
--- Please make sure you have ORDER BY QITS ASC , otherwise, the top 1 * return is not deterministic
--- I know it sounds like a bug, I just discovered that too. Will need to confirm with teradata
-- if you do not want to use top 1, you can do SELECT * FROM WORKQUEUE1 WHERE QITS = (SELECT MIN(QITS) FROM WORKQUEUE1)
--- I tested both, and both works stable
.IF ACTIVITYCOUNT = 0 THEN .GOTO WAITAGAIN

.REMARK "Looks like top row is me. Branch to done step"
.GOTO IAMDONE

.LABEL WAITAGAIN

-- Update the check counter ++, this feature prevent infinite loop
UPDATE WORKQUEUE1
SET CHECK_COUNTER = CHECK_COUNTER + 1
WHERE SESSID = SESSION;

SELECT SESSID FROM WORKQUEUE1 WHERE SESSID=SESSION AND CHECK-COUNTER > 5 ; --- I am only try to wait 5 cycles
.IF ACTIVITYCOUNT > 0 THEN .GOTO MAXTRYERROR

.REMARK "Enter wait stage 120 seconds"
.HANG 120
.RUN wait.bteq

.LABEL NOSESSIONERROR
.REMARK "i did not find my session in queue!! Error!"
.QUIT 15

.LABEL MAXTRYERROR
.REMARK "MAX TRY REACHED! ERROR! "

DELETE FROM WORKQUEUE WHERE SESSID = SESSION;

..QUIT 15

.LABEL IAMDONE
.REMARK "Select and consume the row"

.LABEL ENDSCRIPT
.REMARK "Wait ended, return to main caller"

3. Test sample : this will mimic a scenario that separated session block the other
3.1 open sql assistant or bteq
INSRET INTO WORKQUEUE1 (SESSID,COMMENT) VALUES (SESSION, 'ACTIVE SESSION');
3.2 Test bteq script
.RUN FILE mylogon

.set echoreq off

SELECT SESSION;
.REMARK "PREPARING ENTERING THE BLOCK OF INSERT/UPDATE/DELETE TARGET TABLE"
INSERT INTO WORKQUEUE1(SESSID, COMMENT) VALUES (SESSION,'BLOCKED SESSION');
.RUN FILE wait.bteq
.REMARK 'Now I am at the top of the queue, do the real job"
SELECT * FROM DBC.DBCINFO;
.REMARK "NOW THE REAL JOB IS DONE, CONSUME THE TOP ROW!"
SELECT AND CONSUME TOP 1 * FROM WORKQUEUE1;
.quit
1 REPLY
N/A

Re: Use queue table to manage asynchronous process

correction.. in the wait.bteq
.LABEL iamdone
.REMARK "SELECT AND CONSUME THE ROW AFTER PROCESS COMPLETE"
-- dong nothing here, just a place holder
.LABEL ENDSCRIPT
This endscript label exists if you choose to not to error out in the max try. You can jump here. Also, notice that if you do not want to error out on max try, your mainscript need some change, you cannot use select and consume top 1 * there, more likely, you want to use delete from where sessid = session; that might defeat the purpose of using queue table anyway. Regular table should work, queue table present some benefits since the records are "cached" in PE .
All operations are based on sessid which is the primary index of the table. therefore, if your TDWM is setup like what was suggested in best practice, these sqls are going to the workload single amp group.