Differences between Teradata optimizer and actual SQL performance

Database
Enthusiast

Differences between Teradata optimizer and actual SQL performance

We have a situation where we have a query that the explain plan says should take .56 seconds. The actual query performance ranges from 40 seconds to a couple of minutes. Can anyone make a suggestion as to why there is such a big difference between what the optimizer thinks and what is actually going on? We have collected statistics, etc. We are on TD 12.

Query and explain plan below.

Thanks!

SELECT DISTINCT
PRSN_GLOBAL_V.EPR_GRP_D.EPR_NBR,
PRSN_GLOBAL_V.EPR_GRP_D.EPR_NME
FROM
PRSN_GLOBAL_V.EPR_GRP_D,
PRSN_GLOBAL_V.AGY_CSR_FST_SRV_D
WHERE
( PRSN_GLOBAL_V.AGY_CSR_FST_SRV_D.EPR_PTY_KEY=PRSN_GLOBAL_V.EPR_GRP_D.EPR_PTY_KEY )
ORDER BY
2

1) First, we lock PRSN_MAT_T.EPR_GRP_D in view
PRSN_GLOBAL_V.EPR_GRP_D for access, and we lock
PRSN_MAT_T.AGY_CSR_FST_SRV_D in view
PRSN_GLOBAL_V.AGY_CSR_FST_SRV_D for access.
2) Next, we do an all-AMPs JOIN step from PRSN_MAT_T.EPR_GRP_D in
view PRSN_GLOBAL_V.EPR_GRP_D by way of a RowHash match scan with
no residual conditions, which is joined to
PRSN_MAT_T.AGY_CSR_FST_SRV_D in view
PRSN_GLOBAL_V.AGY_CSR_FST_SRV_D by way of a RowHash match scan
with no residual conditions. PRSN_MAT_T.EPR_GRP_D and
PRSN_MAT_T.AGY_CSR_FST_SRV_D are joined using a merge join, with a
join condition of ("PRSN_MAT_T.AGY_CSR_FST_SRV_D.EPR_PTY_KEY =
PRSN_MAT_T.EPR_GRP_D.EPR_PTY_KEY"). The result goes into Spool 3
(group_amps), which is redistributed by the hash code of (
PRSN_MAT_T.EPR_GRP_D.EPR_NBR, PRSN_MAT_T.EPR_GRP_D.EPR_NME) to all
AMPs. Then we do a SORT to order Spool 3 by the sort key in spool
field1 eliminating duplicate rows. The size of Spool 3 is
estimated with low confidence to be 20,513 rows (4,512,860 bytes).
The estimated time for this step is 0.56 seconds.
3) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 3 are sent back to the user as the result of
statement 1. The total estimated time is 0.56 seconds.
1 REPLY
Enthusiast

Re: Differences between Teradata optimizer and actual SQL performance

Elapsed time or user experienced time is affected by other factors such as TDWM/TASM, network latency, and other activity on the system. I've seen queries fluctuate dramatically from one run to the next based on these factors. A more accurate and consistent way to measure query performance is to look at the CPU utilization of that query. This can be found in DBQL data.

My understanding is that the CPU time in the DBQL tables is supposed to more closely reflect the estimated time in the explain plan, although it's not always a 100% match.