Incremental Planning and Execution


Traditionally, query optimizers depend on information available at optimization time, such as statistics, cost parameters, predicate values, and resource availability, to perform query transformations and optimization. The final plan for a request, referred to as a static plan, is chosen by computing the cost of each possible plan variation and selecting the least costly one. During this process, the optimizer assumes that all the information is accurate and generates the plan for the entire request (a request can consist of multiple statements/queries). However, this assumption may not always hold, particularly for complex queries.

Incremental planning and execution (IPE) is an adaptive query optimization technique built into the Teradata optimizer itself. The intent of IPE is to improve the performance of complex queries. The general idea of this framework is as follows:

  1. The request is fragmented into smaller pieces referred to as request fragments. The request fragments are planned and executed incrementally.
  2. The statistical summary information and/or actual results from the execution of the intermediate request fragments are provided as feedback to the planning of the remainder of the request.

The plan generated by IPE is referred to as a dynamic plan. A dynamic plan can result in a better overall plan and therefore better out-of-the-box system performance. IPE can provide feedback to the optimizer, such as statistical summary information from the execution of a request fragment that produces an intermediate spool table (see the supported cases later in the write-up), which helps the cost-based optimizer get the best possible plan. It can also provide feedback of the actual results for certain classes of intermediate fragments (see the supported cases later in the write-up). The optimizer uses the feedback from results generated earlier in the plan, enabling many powerful Teradata optimizations such as partition elimination and sparse join index qualification, as well as query rewrites such as deriving new predicates through transitive closure, predicate simplification, and unsatisfiability checking.

Starting with Teradata 14.10, IPE provides results feedback for the following cases:

  1. Noncorrelated scalar subqueries (subqueries connected via a scalar operator such as EQUALITY, LESS THAN, or GREATER THAN that return a single row with a single field).
  2. Tables with single-row access via a unique primary index (UPI) or unique secondary index (USI).

Starting with Teradata 15.10, IPE is enhanced to provide statistics feedback for the following cases:

  1. Remote tables (tables from Hadoop, Aster, Teradata, etc.) to optimize QueryGrid workloads.
  2. Table Operators (not enabled by default; can be enabled using DBSControl).

IPE is targeted at complex queries. Queries whose estimated execution time exceeds one minute are considered for IPE. One exception to this rule is queries referencing remote tables: IPE is always triggered for such queries whenever the optimizer determines that statistics feedback from the remote tables is useful for query optimization.

For all requests, a static plan is generated first. Then a determination is made as to whether to execute the static plan or to use IPE and execute a dynamic plan. Information gathered while generating the static plan is used as part of this determination.

EXPLAIN Modifier

The EXPLAIN modifier is enhanced to generate dynamic explains. The following describes the enhanced EXPLAIN modifier.

[STATIC] EXPLAIN <request>: By default, you get a STATIC EXPLAIN. However, this plan may not be the one that is executed. The explain header states the IPE status of the request, as shown in the examples below. Only the most common headers are shown here; those not shown are self-explanatory.

Request Status and Explain Header

  • Not eligible for IPE: No header displayed in the static explain.
  • Eligible for IPE: "This request is eligible for incremental planning and execution (IPE). The following is the static plan for the request."
  • Eligible for IPE but does not meet cost thresholds: "This request is eligible for incremental planning and execution (IPE) but does not meet cost thresholds. The following is the static plan for the request."

DYNAMIC EXPLAIN <request>: Displays a dynamic plan if the request is IPE eligible; otherwise, the static plan. In order for the optimizer to generate a dynamic explain, all fragments but the last must be executed on the AMPs just as they would be if the query itself were running. Because a dynamic explain has to execute the intermediate plan fragments, it consumes system resources (CPU/IO) that an ordinary explain would not, and it also places the required locks. For this reason, dynamic explains should be used primarily for troubleshooting query performance issues, not as a default for all explains. Column values obtained from the feedback of intermediate results are redacted in the dynamic explain for security reasons.

For both a static and dynamic explain, the actual plan when a request is executed may differ due to changes in the information available to the optimizer. The next two sections discuss capturing the actual plan used during execution.

DBQL

DBQL is enhanced to capture dynamic plans. DBQL EXPLAIN captures the dynamic plan if IPE is used; otherwise, the static plan. For dynamic plans, the default setting redacts intermediate result values. DBQLogTbl.NumFragments indicates the number of plan fragments; this field is NULL if a static plan is executed. Reviewing the NumFragments field tells you whether IPE was used for a given query, as well as how frequently it is used within a given application or workload. DBQLogStepTbl.FragmentNum indicates the plan fragment number to which a step belongs.
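As a rough illustration (assuming standard DBQL query logging is enabled and the DBC.DBQLogTbl and DBC.DBQLogStepTbl objects are accessible; exact column availability can vary by release), queries along the following lines summarize how often IPE was used and show which plan fragment each step of a particular query belongs to:

/* Hypothetical sketch: summarize static vs. dynamic (IPE) plans captured by DBQL */
SELECT CASE WHEN NumFragments IS NULL THEN 'Static plan'
            ELSE 'Dynamic plan (IPE)' END AS PlanType,
       COUNT(*)          AS RequestCount,
       AVG(NumFragments) AS AvgFragments
FROM DBC.DBQLogTbl
GROUP BY 1;

/* Hypothetical sketch: map the steps of one logged query to its plan fragments */
SELECT QueryID, StepLev1Num, FragmentNum
FROM DBC.DBQLogStepTbl
WHERE QueryID = 123456789012345678  /* hypothetical QueryID taken from DBQLogTbl */
ORDER BY StepLev1Num;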

PM/API, Open API, QryState, and Viewpoint

PM/API, Open API, QryState, and Viewpoint return the dynamic plan generated up to the point when the request is made (a partial plan if an intermediate plan fragment is executing, or a complete plan if the last fragment is being executed).

  • Dynamic plan indicator returned.
  • Partial plan indicator returned.

The query state tracked by QrySessn and PM/API can alternate between the parser and the AMPs while an IPE request is executing.

  • From Parser -> Active -> Parser …

TASM with IPE

TASM applies rules (workload filters, throttles, and classification criteria) to the static plan. TASM exceptions are applied during execution of the dynamic plan. Plan fragments are executed using the workload definition (WD) determined for the static plan. Workload exceptions are based on accumulated runtime metrics collected during execution of the plan fragments. Rules based on specific steps should be avoided or considered carefully, since the plan actually executed under IPE may differ.

In addition, from Teradata 15.0, TASM offers an IPE classification option that allows queries being executed as IPE queries to be identified to TASM. This gives the administrator the option of managing such queries differently from non-IPE queries.

USAGE

This feature is automatic. In other words, the optimizer automatically determines the applicability and eligibility of dynamic plans. If a query is applicable for IPE but not eligible due to cost thresholds (by default, only queries whose estimated execution time exceeds 1 minute are eligible for IPE), you can use a query band to control the feature as shown below:

For system determined eligibility which is the default:

SET QUERY_BAND = 'DynamicPlan=SYSTEM;' FOR SESSION;

To bypass cost thresholds when IPE is applicable for the query:

SET QUERY_BAND = 'DynamicPlan=SYSTEMX;' FOR SESSION;

To turn off IPE in the current session for all subsequent requests:

SET QUERY_BAND = 'DynamicPlan=OFF;' FOR SESSION;
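To verify which setting is in effect, the session's query band can be inspected. A small check, assuming the standard GetQueryBand function (the '=S>' prefix denotes a session-level band):

SELECT GetQueryBand();
/* Example result after the override above: '=S> DynamicPlan=SYSTEMX;' */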

The following examples demonstrate these optimizations. Note that these examples are simplified to keep the explain plans short while still demonstrating the IPE capabilities. Also, these explain plans have been forced to use IPE, as they are short-running queries that do not meet the default cost thresholds.

Example 1: This example uses a scalar subquery, which is generally used to retrieve a single value for a predicate when the actual value is not known until the subquery is executed.

Query:

SELECT * FROM t100k_a WHERE i1 < (SELECT MAX(i2) FROM t5_a);

Request Fragments:

1: SELECT MAX(i2) INTO v1 FROM t5_a;

2: SELECT * FROM t100k_a WHERE i1 < v1;

Static Explain:

EXPLAIN SELECT * FROM t100k_a WHERE i1<(SELECT MAX(i2) FROM t5_a);

Explanation

--------------------------------------------------------------------------

 This request is eligible for incremental planning and execution (IPE).

 The following is the static plan for the request.

  1) First, we lock OPTBASE_DB1.t5_a for read on a reserved RowHash to

     prevent global deadlock.

  2) Next, we lock OPTBASE_DB1.t100k_a for read on a reserved RowHash

     in all partitions to prevent global deadlock.

  3) We lock OPTBASE_DB1.t5_a for read, and we lock OPTBASE_DB1.t100k_a

     for read.

  4) We do an all-AMPs SUM step to aggregate from OPTBASE_DB1.t5_a by

     way of a traversal of index # 4 without accessing the base table

     with no residual conditions.  Aggregate Intermediate Results are

     computed globally, then placed in Spool 4.  The size of Spool 4 is

     estimated with high confidence to be 1 row (19 bytes).  The

     estimated time for this step is 0.11 seconds.

  5) We do an all-AMPs RETRIEVE step from Spool 4 (Last Use) by way of

     an all-rows scan into Spool 2 (all_amps), which is built locally

     on the AMPs.  The size of Spool 2 is estimated with high

     confidence to be 1 row (17 bytes).  The estimated time for this

     step is 0.03 seconds.

  6) We do an all-AMPs DISPATCHER RETRIEVE step from Spool 2 (Last Use)

     by way of an all-rows scan and send the rows back to the

     Dispatcher.  The size is estimated with high confidence to be 1

     row.  The estimated time for this step is 0.03 seconds.

  7) We do an all-AMPs RETRIEVE step from OPTBASE_DB1.t100k_a by way of

     an all-rows scan with a condition of ("OPTBASE_DB1.t100k_a.I1 <

     :%SSQ25") into Spool 1 (group_amps), which is built locally on the

     AMPs.  The size of Spool 1 is estimated with no confidence to be

     33,334 rows (23,433,802 bytes).  The estimated time for this step

     is 3.09 seconds.

  8) Finally, we send out an END TRANSACTION step to all AMPs involved

     in processing the request.

  -> The contents of Spool 1 are sent back to the user as the result of

     statement 1.  The total estimated time is 3.25 seconds.

Dynamic Explain:

DYNAMIC EXPLAIN SELECT * FROM t100k_a WHERE i1< (SELECT MAX(i2) FROM t5_a);

Explanation

--------------------------------------------------------------------------

 The following is the dynamic plan for the request.

  1) First, we lock OPTBASE_DB1.t5_a for read on a reserved RowHash to

     prevent global deadlock.

  2) Next, we lock OPTBASE_DB1.t100k_a for read on a reserved RowHash

     in all partitions to prevent global deadlock.

  3) We lock OPTBASE_DB1.t5_a for read, and we lock OPTBASE_DB1.t100k_a

     for read.

  4) We do an all-AMPs SUM step to aggregate from OPTBASE_DB1.t5_a by

     way of a traversal of index # 4 without accessing the base table

     with no residual conditions.  Aggregate Intermediate Results are

     computed globally, then placed in Spool 4.  The size of Spool 4 is

     estimated with high confidence to be 1 row (19 bytes).  The

     estimated time for this step is 0.11 seconds.

  5) We do an all-AMPs RETRIEVE step from Spool 4 (Last Use) by way of

     an all-rows scan into Spool 2 (all_amps), which is built locally

     on the AMPs.  The size of Spool 2 is estimated with high

     confidence to be 1 row (17 bytes).  The estimated time for this

     step is 0.03 seconds.

  6) We do an all-AMPs FEEDBACK RETRIEVE step from Spool 2 (Last Use).

     The estimated time for this step is 0.03 seconds.  The actual size

     of Spool 2 (Last Use) is 1 row (512 bytes).

  7) We send an END PLAN FRAGMENT step for plan fragment 1.

  8) We do an all-AMPs RETRIEVE step from a single partition of

     OPTBASE_DB1.t100k_a with a condition of ("OPTBASE_DB1.t100k_a.I1

     <= :*") into Spool 1 (group_amps), which is built locally on the

     AMPs.  The size of Spool 1 is estimated with high confidence to be

     4 rows (2,812 bytes).  The estimated time for this step is 0.05

     seconds.

  9) Finally, we send out an END TRANSACTION step to all AMPs involved

     in processing the request.

  -> The contents of Spool 1 are sent back to the user as the result of

     statement 1.

Notes:

In the dynamic plan, the scalar subquery (SSQ) is executed in the first fragment and its result is plugged into the main query. With this feedback, the retrieve step from the large table t100k_a benefits from partition elimination and also has a good estimate with high confidence.

Example 2: This example has a single-row relation (a table with a guaranteed zero- or single-row output for its single-table criteria/filters), which is typically used to retrieve a row from a calendar table or a control table based on an equality predicate on a UPI or USI.

Query:

  SELECT fact.*

  FROM t100k_a fact, ctrl_tbl ctrl

  WHERE fact.i1 between ctrl.i3 and ctrl.i2

   AND ctrl.i1 = 4 /*Condition on UPI of ctrl table*/

Request Fragments:

1: SELECT i3 INTO v1, i2 INTO v2 FROM ctrl_tbl ctrl

     WHERE ctrl.i1 = 4 /*Condition on UPI of ctrl table*/

2: SELECT fact.*  FROM t100k_a fact

     WHERE i1 BETWEEN v1 and v2;

Static Explain:

EXPLAIN

SELECT  fact.* FROM t100k_a fact, ctrl_tbl ctrl

WHERE fact.i1 between ctrl.i3 and ctrl.i2

    AND ctrl.i1 = 4 /*Condition on UPI of ctrl table*/;

Explanation

-------------------------------------------------------------------------

 This request is eligible for incremental planning and execution (IPE).

 The following is the static plan for the request.

  1) First, we lock OPTBASE_DB1.fact for read on a reserved RowHash in

     all partitions to prevent global deadlock.

  2) Next, we lock OPTBASE_DB1.fact for read.

  3) We do a single-AMP RETRIEVE step from OPTBASE_DB1.ctrl by way of

     the unique primary index "OPTBASE_DB1.ctrl.I1 = 4" with no

     residual conditions into Spool 2 (all_amps) (compressed columns

     allowed), which is duplicated on all AMPs.  The size of Spool 2 is

     estimated with high confidence to be 4 rows (84 bytes).  The

     estimated time for this step is 0.05 seconds.

  4) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of an

     all-rows scan, which is joined to OPTBASE_DB1.fact by way of an

     all-rows scan with no residual conditions.  Spool 2 and

     OPTBASE_DB1.fact are joined using a product join, with a join

     condition of ("(OPTBASE_DB1.fact.I1 <= I3) AND

     (OPTBASE_DB1.fact.I1 >= I2)").  The result goes into Spool 1

     (group_amps), which is built locally on the AMPs.  The size of

     Spool 1 is estimated with low confidence to be 1 row (703 bytes).

     The estimated time for this step is 2.06 seconds.

  5) Finally, we send out an END TRANSACTION step to all AMPs involved

     in processing the request.

  -> The contents of Spool 1 are sent back to the user as the result of

     statement 1.  The total estimated time is 2.11 seconds.

Dynamic Explain:

DYNAMIC EXPLAIN

SELECT  fact.* FROM t100k_a fact, ctrl_tbl ctrl

WHERE fact.i1 between ctrl.i3 and ctrl.i2

    AND ctrl.i1 = 4;

Explanation

-------------------------------------------------------------------------

 The following is the dynamic plan for the request.

  1) First, we lock OPTBASE_DB1.fact for read on a reserved RowHash in

     all partitions to prevent global deadlock.

  2) Next, we lock OPTBASE_DB1.fact for read.

  3) We do a single-AMP RETRIEVE step from OPTBASE_DB1.ctrl by way of

     the unique primary index "OPTBASE_DB1.ctrl.I1 = 4" with no

     residual conditions into Spool 3 (one-amp), which is built locally

     on that AMP.  The size of Spool 3 is estimated with high

     confidence to be 1 row (21 bytes).  The estimated time for this

     step is 0.02 seconds.

  4) We do a single-AMP FEEDBACK RETRIEVE step from Spool 3 (Last Use).

     The estimated time for this step is 0.03 seconds.  The actual size

     of Spool 3 (Last Use) is 1 row (512 bytes).

  5) We send an END PLAN FRAGMENT step for plan fragment 1.

  6) We do a single-AMP RETRIEVE step from a single partition of

     OPTBASE_DB1.fact by way of the unique primary index

     "OPTBASE_DB1.fact.I1 = 3" extracting row ids only with no residual

     conditions into Spool 2 (group_amps), which is built locally on

     that AMP.  The size of Spool 2 is estimated with high confidence

     to be 1 row.  The estimated time for this step is 0.01 seconds.

  7) We do a single-AMP RETRIEVE step from a single partition of

     OPTBASE_DB1.fact by way of the unique primary index

     "OPTBASE_DB1.fact.I1 = 4" extracting row ids only with no residual

     conditions into Spool 2 (group_amps), which is built locally on

     that AMP.  The size of Spool 2 is estimated with high confidence

     to be 2 rows.  The estimated time for this step is 0.01 seconds.

  8) We do a group-AMP SORT to order Spool 2 (group_amps) by row id

     eliminating duplicate rows.  The estimated time for this step is

     0.00 seconds.

  9) We do a group-AMP RETRIEVE step from OPTBASE_DB1.fact by way of

     row ids from Spool 2 (Last Use) with no residual conditions into

     Spool 1 (group_amps), which is built locally on that AMP.  The

     size of Spool 1 is estimated with high confidence to be 2 rows (

     1,406 bytes).  The estimated time for this step is 0.05 seconds.

 10) Finally, we send out an END TRANSACTION step to all AMPs involved

     in processing the request.

  -> The contents of Spool 1 are sent back to the user as the result of

     statement 1.

Notes: In the dynamic plan, the "ctrl" table with single-row access via a UPI is recognized, the corresponding row is retrieved and plugged into the main query, and the join with this table is eliminated. With this feedback, the retrieve step from the large table t100k_a is changed from an all-rows scan to a single-AMP, PI-based access path and also has a good estimate with high confidence.

Example 3: This example demonstrates IPE capabilities with the combination of statistics feedback from a remote table and results feedback from two scalar subqueries. The following query returns the sum of the sales for each customer in the first quarter of 1998. 

Query:

SELECT c_name, SUM(o_totalprice) 

FROM tpcd.ordertbl@TDRemoteServer,tpcd.customer

WHERE o_orderdate BETWEEN

    (SELECT MIN(calendar_date)

     FROM Sys_Calendar.CALENDAR

     WHERE quarter_of_year = 1 AND year_of_calendar = 1998)

AND (SELECT MAX(calendar_date)

     FROM Sys_Calendar.CALENDAR

        WHERE quarter_of_year = 1 AND year_of_calendar = 1998)

AND c_custkey = o_custkey

GROUP BY 1;   

Request Fragments:

1: SELECT MIN(calendar_date) INTO v1

   FROM Sys_Calendar.CALENDAR

   WHERE quarter_of_year = 1 AND year_of_calendar = 1998;

   SELECT MAX(calendar_date) INTO v2

   FROM Sys_Calendar.CALENDAR

   WHERE quarter_of_year = 1 AND year_of_calendar = 1998;

2: SELECT o_custkey, o_totalprice

      INTO Spool1 /*With dynamic stats collected on o_custkey*/

   FROM tpcd.ordertbl@TDRemoteServer

   WHERE o_orderdate BETWEEN v1 and v2;

3: SELECT c_name, SUM(Spool1.o_totalprice)

   FROM Spool1, tpcd.customer

   WHERE c_custkey = o_custkey

   GROUP BY 1;

Static Explain:

EXPLAIN

SELECT c_name, sum(o_totalprice) 

FROM tpcd.ordertbl@TDRemoteServer, tpcd.CUSTOMER

WHERE o_orderdate BETWEEN

    (SELECT MIN(calendar_date)

     FROM Sys_Calendar.CALENDAR

     WHERE quarter_of_year = 1 AND year_of_calendar = 1998)

AND (SELECT MAX(calendar_date)

     FROM Sys_Calendar.CALENDAR

        WHERE quarter_of_year = 1 AND year_of_calendar = 1998)

AND C_CUSTKEY = O_CUSTKEY

GROUP BY 1;         

Explanation

--------------------------------------------------------------------------

 This request is eligible for incremental planning and execution (IPE).

 The following is the static plan for the request.

  1) First, we lock tpcd.CUSTOMER for read on a reserved RowHash to

     prevent global deadlock.

  2) Next, we lock SYS_CALENDAR.CALDATES for read on a reserved RowHash

     to prevent global deadlock.

  3) We lock tpcd.CUSTOMER for read, and we lock SYS_CALENDAR.CALDATES

     for read.

  4) We do an all-AMPs SUM step to aggregate from SYS_CALENDAR.CALDATES

     by way of an all-rows scan with a condition of (

     "((TD_SYSFNLIB.YearNumber_Of_Calendar (

     SYS_CALENDAR.CALDATES.cdate, 'TERADATA'(VARCHAR(128), CHARACTER

     SET LATIN, NOT CASESPECIFIC)))= 1998) AND

     ((TD_SYSFNLIB.QuarterNumber_Of_Year (SYS_CALENDAR.CALDATES.cdate,

     'TERADATA'(VARCHAR(128), CHARACTER SET LATIN, NOT CASESPECIFIC)))=

     1)").  Aggregate Intermediate Results are computed globally, then

     placed in Spool 7.  The size of Spool 7 is estimated with high

     confidence to be 1 row (19 bytes).  The estimated time for this

     step is 0.16 seconds.

  5) We do an all-AMPs RETRIEVE step from Spool 7 (Last Use) by way of

     an all-rows scan into Spool 3 (all_amps), which is built locally

     on the AMPs.  The size of Spool 3 is estimated with high

     confidence to be 1 row (17 bytes).  The estimated time for this

     step is 0.03 seconds.

  6) We do an all-AMPs DISPATCHER RETRIEVE step from Spool 3 (Last Use)

     by way of an all-rows scan and send the rows back to the

     Dispatcher.  The size is estimated with high confidence to be 1

     row.  The estimated time for this step is 0.03 seconds.

  7) We do an all-AMPs SUM step to aggregate from SYS_CALENDAR.CALDATES

     by way of an all-rows scan with a condition of (

     "((TD_SYSFNLIB.YearNumber_Of_Calendar (

     SYS_CALENDAR.CALDATES.cdate, 'TERADATA'(VARCHAR(128), CHARACTER

     SET LATIN, NOT CASESPECIFIC)))= 1998) AND

     ((TD_SYSFNLIB.QuarterNumber_Of_Year (SYS_CALENDAR.CALDATES.cdate,

     'TERADATA'(VARCHAR(128), CHARACTER SET LATIN, NOT CASESPECIFIC)))=

     1)").  Aggregate Intermediate Results are computed globally, then

     placed in Spool 10.  The size of Spool 10 is estimated with high

     confidence to be 1 row (19 bytes).  The estimated time for this

     step is 0.16 seconds.

  8) We do an all-AMPs RETRIEVE step from Spool 10 (Last Use) by way of

     an all-rows scan into Spool 4 (all_amps), which is built locally

     on the AMPs.  The size of Spool 4 is estimated with high

     confidence to be 1 row (17 bytes).  The estimated time for this

     step is 0.03 seconds.

  9) We do an all-AMPs DISPATCHER RETRIEVE step from Spool 4 (Last Use)

     by way of an all-rows scan and send the rows back to the

     Dispatcher.  The size is estimated with high confidence to be 1

     row.  The estimated time for this step is 0.03 seconds.

 10) We do an all-AMPs RETRIEVE step executing table operator

     SYSLIB.load_from_td with a condition of ("(NOT (ordertbl.O_CUSTKEY

     IS NULL )) AND ((ordertbl.O_ORDERDATE <= :%SSQ26) AND

     (ordertbl.O_ORDERDATE >= :%SSQ25 ))").

     < BEGIN EXPLAIN FOR REMOTE QUERY -->

     1) First, we lock tpcd.ordertbl for read on a reserved RowHash in

     all partitions to prevent global deadlock. 2) Next, we lock

     tpcd.ordertbl for read. 3) We do an all-AMPs RETRIEVE step from

     tpcd.ordertbl by way of an all-rows scan with no residual

     conditions into Spool 1 (group_amps), which is built locally on

     the AMPs. The input table will not be cached in memory, but it is

     eligible for synchronized scanning. The size of Spool 1 is

     estimated with high confidence to be 6,000,000 rows (222,000,000

     bytes). The estimated time for this step is 52.97 seconds.

     4) Finally, we send out an END TRANSACTION step to all AMPs

     involved in processing the request. -> The contents of Spool 1 are

     sent back to the user as the result of statement 1. The total

     estimated time is 52.97 seconds.

     <-- END EXPLAIN FOR REMOTE QUERY >

     The size of Spool 1 is estimated with low confidence to be 4 rows

     (148 bytes).  The estimated time for this step is 0.06 seconds.

 11) We do an all-AMPs RETRIEVE step from Spool 1 (Last Use) by way of

     an all-rows scan with a condition of ("(ordertbl.O_ORDERDATE >=

     :%SSQ25) AND ((ordertbl.O_ORDERDATE <= :%SSQ26) AND (NOT

     (ordertbl.O_CUSTKEY IS NULL )))") into Spool 12 (all_amps), which

     is redistributed by hash code to all AMPs to all AMPs.  Then we do

     a SORT to order Spool 12 by row hash.  The size of Spool 12 is

     estimated with low confidence to be 4 rows (100 bytes).  The

     estimated time for this step is 0.03 seconds.

 12) We do an all-AMPs JOIN step from tpcd.CUSTOMER by way of a RowHash

     match scan with no residual conditions, which is joined to Spool

     12 (Last Use) by way of a RowHash match scan.  tpcd.CUSTOMER and

     Spool 12 are joined using a merge join, with a join condition of (

     "tpcd.CUSTOMER.C_CUSTKEY = O_CUSTKEY").  The result goes into

     Spool 5 (all_amps), which is built locally on the AMPs.  The size

     of Spool 5 is estimated with index join confidence to be 4 rows (

     132 bytes).  The estimated time for this step is 0.11 seconds.

 13) We do an all-AMPs SUM step to aggregate from Spool 5 (Last Use) by

     way of an all-rows scan, and the grouping identifier in field 2.

     Aggregate Intermediate Results are computed globally, then placed

     in Spool 14.  The size of Spool 14 is estimated with low

     confidence to be 4 rows (164 bytes).  The estimated time for this

     step is 0.11 seconds.

 14) We do an all-AMPs RETRIEVE step from Spool 14 (Last Use) by way of

     an all-rows scan into Spool 2 (group_amps), which is built locally

     on the AMPs.  The size of Spool 2 is estimated with low confidence

     to be 4 rows (252 bytes).  The estimated time for this step is

     0.08 seconds.

 15) Finally, we send out an END TRANSACTION step to all AMPs involved

     in processing the request.

  -> The contents of Spool 2 are sent back to the user as the result of

     statement 1.  The total estimated time is 0.84 seconds.

Dynamic Explain:

DYNAMIC EXPLAIN

SELECT c_name, sum(o_totalprice) 

FROM tpcd.ordertbl@TDRemoteServer, tpcd.CUSTOMER

WHERE o_orderdate BETWEEN

    (SELECT MIN(calendar_date)

     FROM Sys_Calendar.CALENDAR

     WHERE quarter_of_year = 1 AND year_of_calendar = 1998)

AND (SELECT MAX(calendar_date)

     FROM Sys_Calendar.CALENDAR

        WHERE quarter_of_year = 1 AND year_of_calendar = 1998)

AND C_CUSTKEY = O_CUSTKEY

GROUP BY 1;   

Explanation

-------------------------------------------------------------------------

 The following is the dynamic plan for the request.

  1) First, we lock tpcd.CUSTOMER for read on a reserved RowHash to

     prevent global deadlock.

  2) Next, we lock SYS_CALENDAR.CALDATES for read on a reserved RowHash

     to prevent global deadlock.

  3) We lock tpcd.CUSTOMER for read, and we lock SYS_CALENDAR.CALDATES

     for read.

  4) We do an all-AMPs SUM step to aggregate from SYS_CALENDAR.CALDATES

     by way of an all-rows scan with a condition of (

     "((TD_SYSFNLIB.YearNumber_Of_Calendar (

     SYS_CALENDAR.CALDATES.cdate, 'TERADATA'(VARCHAR(128), CHARACTER

     SET LATIN, NOT CASESPECIFIC)))= 1998) AND

     ((TD_SYSFNLIB.QuarterNumber_Of_Year (SYS_CALENDAR.CALDATES.cdate,

     'TERADATA'(VARCHAR(128), CHARACTER SET LATIN, NOT CASESPECIFIC)))=

     1)").  Aggregate Intermediate Results are computed globally, then

     placed in Spool 7.  The size of Spool 7 is estimated with high

     confidence to be 1 row (19 bytes).  The estimated time for this

     step is 0.16 seconds.

  5) We do an all-AMPs RETRIEVE step from Spool 7 (Last Use) by way of

     an all-rows scan into Spool 3 (all_amps), which is built locally

     on the AMPs.  The size of Spool 3 is estimated with high

     confidence to be 1 row (17 bytes).  The estimated time for this

     step is 0.03 seconds.

  6) We do an all-AMPs FEEDBACK RETRIEVE step from Spool 3 (Last Use).

     The estimated time for this step is 0.03 seconds.  The actual size

     of Spool 3 (Last Use) is 1 row (512 bytes).

  7) We do an all-AMPs SUM step to aggregate from SYS_CALENDAR.CALDATES

     by way of an all-rows scan with a condition of (

     "((TD_SYSFNLIB.YearNumber_Of_Calendar (

     SYS_CALENDAR.CALDATES.cdate, 'TERADATA'(VARCHAR(128), CHARACTER

     SET LATIN, NOT CASESPECIFIC)))= 1998) AND

     ((TD_SYSFNLIB.QuarterNumber_Of_Year (SYS_CALENDAR.CALDATES.cdate,

     'TERADATA'(VARCHAR(128), CHARACTER SET LATIN, NOT CASESPECIFIC)))=

     1)").  Aggregate Intermediate Results are computed globally, then

     placed in Spool 10.  The size of Spool 10 is estimated with high

     confidence to be 1 row (19 bytes).  The estimated time for this

     step is 0.16 seconds.

  8) We do an all-AMPs RETRIEVE step from Spool 10 (Last Use) by way of

     an all-rows scan into Spool 4 (all_amps), which is built locally

     on the AMPs.  The size of Spool 4 is estimated with high

     confidence to be 1 row (17 bytes).  The estimated time for this

     step is 0.03 seconds.

  9) We do an all-AMPs FEEDBACK RETRIEVE step from Spool 4 (Last Use).

     The estimated time for this step is 0.03 seconds.  The actual size

     of Spool 4 (Last Use) is 1 row (512 bytes).

 10) We send an END PLAN FRAGMENT step for plan fragment 1.

 11) We do an all-AMPs RETRIEVE step executing table operator

     SYSLIB.load_from_td with a condition of ("(NOT (ordertbl.O_CUSTKEY

     IS NULL )) AND ((ordertbl.O_ORDERDATE >= :*) AND

     (ordertbl.O_ORDERDATE <= :*))").  Spool 1 sample (50%) statistics

     go into Spool 16.  The size of Spool 1 is estimated with low

     confidence to be 4 rows (148 bytes).  The estimated time for this

     step is 0.06 seconds.

 12) We do an all-AMPs FEEDBACK RETRIEVE step from Spool 16 (Last Use)

     of Spool 1 statistics.  The estimated time for this step is 0.03

     seconds.  The actual size of Spool 1 is 225,337 rows (12,618,872

     bytes).

 13) We send an END PLAN FRAGMENT step for plan fragment 2.

 14) We execute the following steps in parallel.

      1) We do an all-AMPs RETRIEVE step from tpcd.CUSTOMER by way of

         an all-rows scan with no residual conditions into Spool 12

         (all_amps) fanned out into 2 hash join partitions, which is

         built locally on the AMPs.  The size of Spool 12 is estimated

         with high confidence to be 600,000 rows (16,200,000 bytes).

         The estimated time for this step is 6.81 seconds.

      2) We do an all-AMPs RETRIEVE step from Spool 1 (Last Use) by way

         of an all-rows scan with a condition of (

         "((ordertbl.O_ORDERDATE >= :*) AND (ordertbl.O_ORDERDATE <=

         :*)) AND (NOT (ordertbl.O_CUSTKEY IS NULL ))") into Spool 17

         (all_amps) fanned out into 2 hash join partitions, which is

         redistributed by hash code to all AMPs to all AMPs.  The size

         of Spool 17 is estimated with high confidence to be 225,337

         rows (5,633,425 bytes).  The estimated time for this step is

         2.67 seconds.

 15) We do an all-AMPs JOIN step from Spool 12 (Last Use) by way of an

     all-rows scan, which is joined to Spool 17 (Last Use) by way of an

     all-rows scan.  Spool 12 and Spool 17 are joined using a hash join

     of 2 partitions, with a join condition of ("C_CUSTKEY = O_CUSTKEY").

     The result goes into Spool 5 (all_amps), which is built locally on

     the AMPs.  The size of Spool 5 is estimated with low confidence to

     be 225,337 rows (7,436,121 bytes).  The estimated time for this

     step is 1.92 seconds.

 16) We do an all-AMPs SUM step to aggregate from Spool 5 (Last Use) by

     way of an all-rows scan, and the grouping identifier in field 2.

     Aggregate Intermediate Results are computed globally, then placed

     in Spool 14.  The size of Spool 14 is estimated with low

     confidence to be 225,337 rows (9,238,817 bytes).  The estimated

     time for this step is 2.03 seconds.

 17) We do an all-AMPs RETRIEVE step from Spool 14 (Last Use) by way of

     an all-rows scan into Spool 2 (group_amps), which is built locally

     on the AMPs.  The size of Spool 2 is estimated with low confidence

     to be 225,337 rows (14,196,231 bytes).  The estimated time for

     this step is 1.05 seconds.

 18) Finally, we send out an END TRANSACTION step to all AMPs involved

     in processing the request.

  -> The contents of Spool 2 are sent back to the user as the result of

     statement 1.

Notes: In the first fragment, both scalar subqueries are executed and their results are fed back to the optimizer as min and max values. This translates the BETWEEN criteria on the remote table ordertbl@TDRemoteServer into two date constants. In the second fragment, the remote table is read by pushing the BETWEEN criteria to the remote system. While the rows are being returned to the local system, dynamic statistics such as the row count and summary statistics on column o_custkey are gathered automatically and fed back to the optimizer. This dynamic feedback corrected the row count estimate of the remote table from 4 rows to 225,337 rows, and the non-optimal merge join of the static plan is replaced with a 2-partition hash join.

Example 4: This example demonstrates IPE capabilities with statistics feedback from a remote table on a Hadoop server. The following query returns the number of clicks for each item by joining the "items" table to the remote "clicks_history" table stored on the Hadoop server.

Query:

SELECT i_name, count(*)

FROM items,

  (SELECT item_ID FROM FOREIGN TABLE

    (SELECT * FROM qgrd3000g.clicks_history

     WHERE dt_click_date

     BETWEEN DATE '2013-01-01' AND DATE '2014-06-30')@tdsqlhhive AS HDT)


  AS HDT2

WHERE item_ID = i_item_ID

GROUP BY i_name;    

Request Fragments:

1: SELECT item_ID INTO Spool4

   FROM qgrd3000g.clicks_history@tdsqlhhive

   WHERE dt_click_date BETWEEN DATE '2013-01-01' AND DATE '2014-06-30';

2: SELECT i_name, count(*)

   FROM items, Spool4 /*With dynamic stats collected on item_ID */

   WHERE item_ID = i_item_ID

   GROUP BY i_name;

Static Explain:

EXPLAIN

SELECT i_name, count(*)

FROM items,

  (SELECT item_ID FROM FOREIGN TABLE

    (SELECT * FROM qgrd3000g.clicks_history

     WHERE dt_click_date

     BETWEEN DATE '2013-01-01' AND DATE '2014-06-30')@tdsqlhhive AS HDT)

  AS HDT2

WHERE item_ID = i_item_ID

GROUP BY i_name;    

Explanation

--------------------------------------------------------------------------

  This request is eligible for incremental planning and execution (IPE).

  The following is the static plan for the request.     

  1) First, we lock QGRD3000G.items for read on a reserved RowHash to

     prevent global deadlock.

  2) Next, we lock QGRD3000G.items for read.

  3) We do an all-AMPs RETRIEVE step executing table operator

     SYSLIB.load_from_hcatalog_hdp2_3_0 with a condition of ("(1=1)").

     The size of Spool 2 is estimated with low confidence to be 160

     rows (3,426,240 bytes).  The estimated time for this step is 0.02

     seconds.

  4) We do an all-AMPs RETRIEVE step from Spool 2 (Last Use) by way of

     an all-rows scan into Spool 3 (used to materialize view, derived

     table, table function or table operator TblOpInputSpool)

     (all_amps), which is redistributed by hash code to all AMPs to all

     AMPs.  The size of Spool 3 is estimated with low confidence to be

     160 rows (3,426,240 bytes).  The estimated time for this step is

     0.02 seconds.

  5) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of

     an all-rows scan executing table operator

     SYSLIB.load_from_hcatalog_hdp2_3_0 with a condition of ("NOT

     (HDT.ITEM_ID IS NULL)") into Spool 4 (used to materialize view,

     derived table, table function or table operator HDT) (all_amps),

     which is built locally on the AMPs.  The size of Spool 4 is

     estimated with low confidence to be 160 rows (5,760 bytes).  The

     estimated time for this step is 0.02 seconds.

  6) We do an all-AMPs RETRIEVE step from Spool 4 (Last Use) by way of

     an all-rows scan with a condition of ("NOT (HDT.ITEM_ID IS NULL)")

     into Spool 8 (all_amps), which is duplicated on all AMPs.  The

     size of Spool 8 is estimated with low confidence to be 25,600 rows

     (742,400 bytes).  The estimated time for this step is 0.02 seconds.

  7) We do an all-AMPs JOIN step from Spool 8 (Last Use) by way of an

     all-rows scan, which is joined to QGRD3000G.items by way of an

     all-rows scan with no residual conditions.  Spool 8 and

     QGRD3000G.items are joined using a dynamic hash join, with a join

     condition of ("(ITEM_ID (FLOAT, FORMAT '-9.99999999999999E-999'))=

     (QGRD3000G.items.I_ITEM_ID)").  The result goes into Spool 7

     (all_amps), which is built locally on the AMPs.  The size of Spool

     7 is estimated with index join confidence to be 160 rows (3,840

     bytes).  The estimated time for this step is 0.04 seconds.

  8) We do an all-AMPs SUM step to aggregate from Spool 7 (Last Use) by

     way of an all-rows scan, and the grouping identifier in field 2.

     Aggregate Intermediate Results are computed globally, then placed

     in Spool 9.  The size of Spool 9 is estimated with no confidence

     to be 120 rows (4,680 bytes).  The estimated time for this step is

     0.04 seconds.

  9) We do an all-AMPs RETRIEVE step from Spool 9 (Last Use) by way of

     an all-rows scan into Spool 5 (group_amps), which is built locally

     on the AMPs.  The size of Spool 5 is estimated with no confidence

     to be 120 rows (4,920 bytes).  The estimated time for this step is

     0.02 seconds.

 10) Finally, we send out an END TRANSACTION step to all AMPs involved

     in processing the request.

  -> The contents of Spool 5 are sent back to the user as the result of

     statement 1.  The total estimated time is 0.19 seconds.

Dynamic Explain:

DYNAMIC EXPLAIN

SELECT i_name, count(*)

FROM items,

  (SELECT item_ID FROM FOREIGN TABLE

    (SELECT * FROM qgrd3000g.clicks_history

     WHERE dt_click_date

     BETWEEN DATE '2013-01-01' AND DATE '2014-06-30')@tdsqlhhive AS HDT)


  AS HDT2

WHERE item_ID = i_item_ID

GROUP BY i_name;    

Explanation

-------------------------------------------------------------------------

 The following is the dynamic plan for the request.

  1) First, we lock QGRD3000G.items for read on a reserved RowHash to

     prevent global deadlock.

  2) Next, we lock QGRD3000G.items for read.

  3) We do an all-AMPs RETRIEVE step executing table operator

     SYSLIB.load_from_hcatalog_hdp2_3_0 with a condition of ("(1=1)").

     The size of Spool 2 is estimated with low confidence to be 160

     rows (3,426,240 bytes).  The estimated time for this step is 0.02

     seconds.

  4) We do an all-AMPs RETRIEVE step from Spool 2 (Last Use) by way of

     an all-rows scan into Spool 3 (used to materialize view, derived

     table, table function or table operator TblOpInputSpool)

     (all_amps), which is redistributed by hash code to all AMPs to all

     AMPs.  The size of Spool 3 is estimated with low confidence to be

     160 rows (3,426,240 bytes).  The estimated time for this step is

     0.02 seconds.

  5) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of

     an all-rows scan executing table operator

     SYSLIB.load_from_hcatalog_hdp2_3_0 with a condition of ("NOT

     (HDT.ITEM_ID IS NULL)") into Spool 4 (all_amps), which is built

     locally on the AMPs.  Spool 4 statistics go into Spool 5.  The

     size of Spool 4 is estimated with low confidence to be 160 rows (

     5,760 bytes).  The estimated time for this step is 0.02 seconds.

  6) We do an all-AMPs FEEDBACK RETRIEVE step from Spool 5 (Last Use)

     of Spool 4 statistics.  The size is estimated with high confidence

     to be 1 row.  The estimated time for this step is 0.01 seconds.

     The actual size of Spool 4 is 3,885,962,505 rows (256,473,525,330

     bytes).

  7) We send an END PLAN FRAGMENT step for plan fragment 1.

  8) We do an all-AMPs RETRIEVE step from QGRD3000G.items by way of an

     all-rows scan with no residual conditions into Spool 10 (all_amps),

     which is duplicated on all AMPs.  The size of Spool 10 is

     estimated with high confidence to be 1,020,480 rows (27,552,960

     bytes).  The estimated time for this step is 0.04 seconds.

  9) We do an all-AMPs JOIN step from Spool 10 (Last Use) by way of an

     all-rows scan, which is joined to Spool 4 (Last Use) by way of an

     all-rows scan with a condition of ("NOT (HDT.ITEM_ID IS NULL)").

     Spool 10 and Spool 4 are joined using a dynamic hash join, with a

     join condition of ("(ITEM_ID (FLOAT, FORMAT

     '-9.99999999999999E-999'))= (I_ITEM_ID)").  The result goes into

     Spool 9 (all_amps), which is built locally on the AMPs.  The size

     of Spool 9 is estimated with low confidence to be 3,885,962,505

     rows (93,263,100,120 bytes).  The estimated time for this step is

     1 minute and 49 seconds.

 10) We do an all-AMPs SUM step to aggregate from Spool 9 (Last Use) by

     way of an all-rows scan, and the grouping identifier in field 2.

     Aggregate Intermediate Results are computed globally, then placed

     in Spool 11.  The size of Spool 11 is estimated with low confidence

     to be 6,378 rows (216,852 bytes).  The estimated

     time for this step is 21 minutes and 23 seconds.

 11) We do an all-AMPs RETRIEVE step from Spool 11 (Last Use) by way of

     an all-rows scan into Spool 7 (group_amps), which is built locally

     on the AMPs.  The size of Spool 7 is estimated with low confidence

     to be 6,378 rows (261,498 bytes).  The estimated

     time for this step is 1 minute and 26 seconds.

 12) Finally, we send out an END TRANSACTION step to all AMPs involved

     in processing the request.

  -> The contents of Spool 7 are sent back to the user as the result of

     statement 1.

Notes: The static plan assumed that the remote table returns 160 rows and duplicated this table to do the join with the "items" table. The dynamic plan with statistics feedback corrected the row count estimate to 3,885,962,505 rows and optimized the plan by keeping this table local and duplicating the "items" table for the hash join.

Terminology

The following terminology is used when discussing IPE.

  • Request Fragment: Portion of a request for which a plan is generated and executed.
  • Plan Fragment: Series of one or more steps generated for a request fragment.
  • Static Plan: Plan generated for a request based on static information such as collected statistics, AMP samples, etc.
  • Dynamic Plan: Plan (consisting of plan fragments) generated incrementally using dynamic feedback of results or statistical information from intermediate spools.
  • IPE – Incremental Planning and Execution: Framework to produce and execute dynamic plans.

Recommendations and Considerations

In general, IPE applicability and eligibility are automatic and no user intervention is required. In some scenarios, queries can be made IPE eligible with small changes to the SQL, as shown in the examples below.

  1. If you have an IN or NOT IN subquery that is guaranteed to return no more than one row, re-code it as a scalar subquery. This helps the optimizer detect the single-row subquery and makes the query eligible for IPE.

For example, a query such as SELECT * FROM fact WHERE x1 IN (SELECT MAX(x2) FROM calendar_tbl) can be re-coded as SELECT * FROM fact WHERE x1 = (SELECT MAX(x2) FROM calendar_tbl), since a subquery with aggregation and no GROUP BY is guaranteed to return a single row.
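Shown side by side as a minimal sketch (using the same hypothetical fact and calendar_tbl tables as in the sentence above):

/* Before: IN subquery; the optimizer does not treat this as a single-row scalar subquery */
SELECT * FROM fact WHERE x1 IN (SELECT MAX(x2) FROM calendar_tbl);

/* After: scalar subquery; aggregation without GROUP BY guarantees a single row,
   which makes the request eligible for IPE results feedback */
SELECT * FROM fact WHERE x1 = (SELECT MAX(x2) FROM calendar_tbl);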

  2. If the FROM clause has a table that is guaranteed to return a maximum of one row (for example, a calendar table or a row-level security control table), provide a single-row access path with its UPI or USI. This helps the optimizer detect the single-row access and makes the query eligible for IPE.

For example, a single-row control table ctrl_tbl(x1 INT) can be re-coded as ctrl_tbl(key INT, x1 INT) UNIQUE PRIMARY INDEX(key) with a wrapper view ctrl_tbl_v1 AS SELECT * FROM ctrl_tbl WHERE key = <key_val>. This way, whenever the main query uses ctrl_tbl_v1, it always has single-row access, which makes it eligible for results feedback.
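A sketch of that pattern under the same assumptions (the column name key_col and the literal key value 1 are hypothetical stand-ins for the "key" column and <key_val> above):

/* Re-coded control table: the UPI provides a guaranteed single-row access path */
CREATE TABLE ctrl_tbl (key_col INTEGER, x1 INTEGER)
UNIQUE PRIMARY INDEX (key_col);

/* Wrapper view pins the key so every reference gets single-row UPI access */
REPLACE VIEW ctrl_tbl_v1 AS
SELECT * FROM ctrl_tbl WHERE key_col = 1;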

  3. Dynamic statistics feedback for remote tables is collected while the rows are being imported from the remote table. It includes the row count as well as summary statistics such as the number of nulls, the approximate number of distinct values, and the high-mode frequency for the columns of interest. The overhead of collecting these statistics is minimal, as they are gathered while the rows are being written to the spool after single-table filters are applied, without any additional read or aggregation steps.
  4. For repeated requests (parameterized or non-parameterized) that qualify for IPE, the optimizer uses a dynamic plan for the first occurrence (marked as a specific plan) and a generic plan for the second occurrence, then compares the execution times of the two. The winning plan is used from the third occurrence onward. If a dynamic plan is needed every time, the query band can be used to force this choice. Note, however, that dynamic plans are not cacheable, so they require parsing and optimization every time they are submitted.
  5. Queries qualify for IPE only when the cost threshold (estimated execution cost greater than 1 minute) and the parse time threshold (estimated parse time less than 10% of the estimated execution time) are met. While these default settings work for the majority of scenarios, you can bypass the thresholds or turn off IPE using the query bands mentioned above for special scenarios or for experimentation.
1 Comment
QueryBand                NumFragments  Elapsed time     EstProcTime  AMPCPUTime  ParserCPUTime  TotalIOCount  SpoolUsage
=S> DynamicPlan=SYSTEMX  2             0:00:00.620000   0.036        0.844       0.092          2,750         1,024
=S> DynamicPlan=SYSTEM   ?             0:00:03.280000   5.158        56.14       0.084          122,725       1,024

Hi Rama

We are on a TD 15.0 system with 10 nodes (6700), and we are just trying out the IPE settings. We have seen from a few tests that the cost threshold seems a bit restrictive when it comes to applying IPE. In the example above, the first query uses the IPE queryband override and runs with a dynamic plan; the second has the default setting, which doesn't use IPE, and as a result a dynamic plan is not used. The query is excluded from IPE because the EstProcTime is less than 60 seconds. Even though the query without IPE takes only 5 seconds, it uses a significant amount of CPU and IO, so a query that takes 60 seconds could use a large amount of CPU and IO with virtually no significant change to parser time. I can confirm that when the query ran against a very large table, IPE was used by default without the queryband override. I was wondering if you can explain why the threshold is set to 60 seconds?

Cheers

Steven