All AMP active for Updates based on primary index

Database
Teradata Employee

Re: All AMP active for Updates based on primary index

If you could rewrite this as a Multi-Statement Request, they would all be row locks (because it is qualified on the PI), and they would all run concurrently.  However I would hope the optimizer could translate such an in-list request (within reason) to the same sort of thing with the same results.

 

A MSR is a single string of DML statements separated by semi-colons, viz.:

UPDATE pingusb.bs_WGPrintQueue SET Status = 'Pending' where WorkorderNumber = 3653925; UPDATE pingusb.bs_WGPrintQueue SET Status = 'Pending' where WorkorderNumber = 365 ; UPDATE pingusb.bs_WGPrintQueue SET Status = 'Pending' where WorkorderNumber = 3518981 ; UPDATE pingusb.bs_WGPrintQueue SET Status = 'Pending' where WorkorderNumber = 3655198 ; UPDATE pingusb.bs_WGPrintQueue SET Status = 'Pending' where WorkorderNumber = 3654818 ; UPDATE pingusb.bs_WGPrintQueue SET Status = 'Pending' where WorkorderNumber = 3654173 ; UPDATE pingusb.bs_WGPrintQueue SET Status = 'Pending' where WorkorderNumber = 3655831 ;

Enthusiast

Re: All AMP active for Updates based on primary index

Exactly!  This is what we will probably end up doing but more work on the developers end.  I just want to rule out using an inlist if it will always issue a table lock even though the explain shows otherwise.  I'm hoping Todd or Dieter can verify this for me from a Teradata optimizer perspective! :)

 

Joe

Teradata Employee

Re: All AMP active for Updates based on primary index

What does the actual request to the database look like (e.g. captured in DBQL) when there are multiple WorkorderNumber values?

Does the application bind a parameter array, dynamically adjust the number of parameter markers in the SQL text, or what?

 

The database is sometimes able to do additional query rewrites or optimizations when you provide literal values, versus a "similar" query with parameters. While you may think of it as being the "same" SQL, the optimizer doesn't see it that way. Perhaps the application could dynamically construct the query text as a string with literal values in place of parameter markers.

 

Also, you can get a more representative EXPLAIN for a parameterized query via BTEQ with Teradata USING syntax - something like this:

EXPLAIN USING ( WPID VARCHAR(255), BN VARCHAR(255), 
WN1 VARCHAR(255), WN2 VARCHAR(255))
Update WGPrintQueue Set Status = 'Pending' Where WgProcessId =  :WPID AND BatchNumber =  :BN AND WorkorderNumber in ( :WN1, :WN2 )
;

You should change the data types in the USING clause to reflect the types being bound to the parameters.

 

Enthusiast

Re: All AMP active for Updates based on primary index

Thanks Fred!

 

A similar query captured directly out of DBQL is:

 

160 Update WGPrintQueue Set Status = 'Pending' Where WgProcessId = ? AND BatchNumber = ? And WorkorderNumber in ( ? , ? , ? );

 

The number of WorkorderNumbers will obviously vary by statement.  This one is processing three.  Notice the NumOfActiveAmps value is 160 (all amps) to the left of the update statement.

 

The Explain Using is captured below.  It is also showing a table lock on WGPrintQueue.  This is different than the individual rowhash locks that are displayed when running an explain without the 'Using' in SQLA.  Does the unicode to latin translation make any difference?

 

EXPLAIN USING ( WPID VARCHAR(15), BN INTEGER, WN1 INTEGER, WN2 INTEGER)
Update pingus.WGPrintQueue
Set Status = 'Pending'
Where WgProcessId = :WPID
AND BatchNumber = :BN
AND WorkorderNumber in ( :WN1, :WN2 ) ;
EXPLAIN USING ( WPID VARCHAR(15), BN INTEGER, WN1 INTEGER, WN2 INTEGER)
Update pingus.WGPrintQueue
Set Status = 'Pending'
Where WgProcessId = :WPID
AND BatchNumber = :BN
AND WorkorderNumber in ( :WN1, :WN2 ) ;

*** Help information returned. 18 rows.
*** Total elapsed time was 1 second.

Explanation
---------------------------------------------------------------------------
1) First, we lock a distinct pingusb."pseudo table" for write on a
RowHash to prevent global deadlock for pingusb.bs_WGPrintQueue.
2) Next, we lock pingusb.bs_WGPrintQueue in view pingus.WGPrintQueue
for write.
3) We do an all-AMPs UPDATE from pingusb.bs_WGPrintQueue in view
pingus.WGPrintQueue by way of an all-rows scan with a condition of
("(pingusb.bs_WGPrintQueue in view pingus.WGPrintQueue.WGProcessId
= (TRANSLATE((:WPID )USING UNICODE_TO_LATIN))) AND
((pingusb.bs_WGPrintQueue in view pingus.WGPrintQueue.BatchNumber
= :BN) AND ((pingusb.bs_WGPrintQueue in view
pingus.WGPrintQueue.WorkOrderNumber = :WN1) OR
(pingusb.bs_WGPrintQueue.WorkOrderNumber = :WN2 )))"). The size
is estimated with low confidence to be 2 rows. The estimated time
for this step is 0.05 seconds.
4) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> No rows are returned to the user as the result of statement 1.
The total estimated time is 0.05 seconds.

 

====================================================================

Am I correct in my understanding that, in this case, value substitution will produce a table lock while simply executing the statement, as is, in SQL Assistant will produce individual rowhash locks?  If so, would you mind explaining as this is confusing to me.  If an inlist of PI rowhash values will always produce a table lock, why does the explain in SQLA show rowhash locks?  Is it not ultimately the same statement being parsed?  Not sure why passing parameter values should make any difference...

 

Thank you for your time!

 

Joe

Highlighted
Teradata Employee

Re: All AMP active for Updates based on primary index

No, it's actually NOT quite the same query. At parse time, one has literal values that the optimizer can use to make additional decisions, the other has placeholders for the parameters during parsing and the actual values are plugged in at execution time. That can make a difference.

 

Yes, it would be nice if the optimizer recognized that the parameters would allow PI operations instead of a table scan with OR.

 

You can declare WPID VARCHAR(15) CHARACTER SET LATIN to eliminate the TRANSLATE no, you can't.

But I don't think the TRANSLATE makes any difference.

Enthusiast

Re: All AMP active for Updates based on primary index

Thanks Fred!

 

The parser is not able to determine if a rowhash can be granted at parsing time even though the only placeholders are for primary index values alone?  Since the optimizer does not know then a table lock is applied?  Conversely, when I supply the values in SQL the optimizer knows during parsing to place a rowhash lock.  Are those fair statements?

 

In heavy transactional environments, it sounds like paramaterized primary index updates should be avoided at all costs if a table lock is to be avoided.  We will go with the multi-statement option.  This explains why the SQLA explain plan explains what it does.  Not even going to try to explain that sentence!  :)  This will surely be a hot topic of conversation at our next development meeting.

 

Thanks again!

 

Joe

 

Joe

Teradata Employee

Re: All AMP active for Updates based on primary index

If your parameterized request has an equality constraint on a single PI (rather than IN-list), you will get the RowHash lock.

 

Enthusiast

Re: All AMP active for Updates based on primary index

Thank you ... yes.. I have seen that in DBQL.  Looks like the multi-statement update is the way to go.

 

Joe

Junior Contributor

Re: All AMP active for Updates based on primary index

Explain shows 

Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.

 

IMHO when the Update is submitted with multiple values in the IN-list the optimizer knows exactly which AMPs are involved, but when it's a prepared statement this is unknown. For a single value it doesn't matter because there's no extra END TRANS step.

 

So don't use Prepared SQL, just run it directly using Dynamic SQL.