very fast query + very fast query = very slow query ???

Enthusiast

I am trying to replicate the following query, which takes under half a second, in a more dynamic way to expand its usage:

/* hard-coded Inventory Codes */
/* execution time: 0.5 second */
select
r.*

from
our_dw.sales_detail r

where
r.InventoryCode IN ('15KIE01', '15KID00', '15KID01', '15KIB01', '15KIA00', '15KIB02', '15KIC00', '15KIA01', '15KIB00', '15KIC01', '15KIE00')

Fortunately, the query that returns those InventoryCodes itself takes just 0.1 second to run: 

/* query to get inventory codes from description */
/* execution time: 0.1 second */
select InventoryCode from our_dw.inventory_summary where ItemDescription like 'Widget B%'

Therefore, I figured that combining a 0.5-second query with a 0.1-second query would still give a very fast query. Unfortunately, the following takes 42 seconds:

/* why can't any of the following queries even come close to combining the two above??? */
/* execution time: 41.9 seconds */
select
r.*

from
our_dw.sales_detail r

where
r.InventoryCode in (select InventoryCode from our_dw.inventory_summary where ItemDescription like 'Widget B%')

and every other combination (JOINs, CTEs) takes even longer:

/* execution time: 149.5 seconds */
select
r.*

from
our_dw.sales_detail r inner join
our_dw.inventory_summary s on r.InventoryCode = s.InventoryCode

where
s.ItemDescription like 'Widget B%'

;

/* execution time: 160.9 seconds */
select
r.*

from
our_dw.sales_detail r inner join
(select InventoryCode from our_dw.inventory_summary where ItemDescription like 'Widget B%') s on r.InventoryCode = s.InventoryCode

;

/* execution time: 195.9 seconds */
with the_codes as (select InventoryCode from our_dw.inventory_summary where ItemDescription like 'Widget B%')

select
r.*

from
our_dw.sales_detail r inner join
the_codes c on r.InventoryCode = c.InventoryCode

;

/* execution time: 71.7 seconds */
with the_codes as (select InventoryCode from our_dw.inventory_summary where ItemDescription like 'Widget B%')

select
r.*

from
our_dw.sales_detail r

where
r.InventoryCode in (select InventoryCode from the_codes)

;

Finally, with a temp table, I was able to get the performance I expected, but only if we avoid a JOIN:

create volatile table temp_codes as (
select InventoryCode from our_dw.inventory_summary where ItemDescription like 'Widget B%'
) WITH DATA ON COMMIT PRESERVE ROWS;
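One thing that may also help the join variants (a sketch, assuming your user is allowed to collect statistics on its own volatile tables, which Teradata supports on this release): collect stats on the code list after loading it, so the optimizer knows exactly how small it is before planning the join.

```sql
/* hedged sketch: give the optimizer an exact row count for the code list */
COLLECT STATISTICS COLUMN (InventoryCode) ON temp_codes;
```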

/* execution time: 151.8 seconds */
select
r.*

from
our_dw.sales_detail r inner join
temp_codes c on r.InventoryCode = c.InventoryCode

;

/* execution time: 0.5 seconds AHA! */
/* why isn't Teradata smart enough to run ALL the above queries like this ?? */
select
r.*

from
our_dw.sales_detail r

where
r.InventoryCode in (select InventoryCode from temp_codes)

Every one of these queries returns the identical dataset. Why isn't Teradata smart enough to first get the list of InventoryCodes, then execute the rest of the query as if it had the hard-coded list to begin with (i.e. just like it does in the original sample)? For the database to take so long, it must be doing table scans and lookups that aren't necessary. Can I provide "hints" to help Teradata understand what is going on and to speed things up?

We are on v14.10.05.04 and using JDBC driver v15.10.00.09 (I get similar results via SQL Assistant also).

Thank you.

14 REPLIES
Senior Apprentice

Re: very fast query + very fast query = very slow query ???

Can you add some more details, (P)PI & statistics of both tables plus some Explains?

Enthusiast

Re: very fast query + very fast query = very slow query ???

I am not the DBA, just the primary user, so I don't have access to detailed stats. But "inventory_summary" has 101,000 rows and "sales_detail" has 30.5 million. Both of the above are views, not direct tables. InventoryCode is a varchar(15).

I can see in the Explains that in the 1st query (with hard-coded InventoryCodes), Teradata goes straight to "sales_detail" and very quickly pulls data back one-by-one for each InventoryCode (hence the 19 steps). In the 2nd query above (just obtaining the InventoryCodes, returning in 0.1 seconds), Teradata locks and reads from inventory_summary, figures out that it only needs to look in a single partition and a single InventoryTypeID, and again is very fast. But in the 3rd query above (42 seconds), Teradata doesn't appear to look at that view at all (inventory_summary is never locked for read) -- in fact it seems to work backwards, as Step 4 Part 1 has it retrieving from "sales_detail" before it even knows which InventoryCodes we will eventually be looking for. Is there some way to tell Teradata to slow down and do the first two steps one at a time, so things turn out much quicker?

1st query above:

 1) First, we lock a distinct edw_tables."pseudo table" for read on a
RowHash to prevent global deadlock for edw_tables.r.
2) Next, we lock a distinct edw_tables."pseudo table" for read on a
RowHash to prevent global deadlock for edw_tables.m.
3) We lock edw_tables.r in view our_dw.sales_detail for read, and we
lock edw_tables.m in view our_dw.sales_detail for read.
4) We do an all-AMPs RETRIEVE step from edw_tables.r in view
our_dw.sales_detail by way of index # 4 "edw_tables.r in view
our_dw.sales_detail.InventoryCode = '15KID00'" extracting row ids
only with no residual conditions into Spool 4 (all_amps), which is
built locally on the AMPs. The size of Spool 4 is estimated with
high confidence to be 196 rows. The estimated time for this step
is 0.01 seconds.
5) We do an all-AMPs RETRIEVE step from edw_tables.r in view
our_dw.sales_detail by way of index # 4 "edw_tables.r in view
our_dw.sales_detail.InventoryCode = '15KIE01'" extracting row ids
only with no residual conditions into Spool 4 (all_amps), which is
built locally on the AMPs. The size of Spool 4 is estimated with
high confidence to be 392 rows. The estimated time for this step
is 0.01 seconds.
6) We do an all-AMPs RETRIEVE step from edw_tables.r in view
our_dw.sales_detail by way of index # 4 "edw_tables.r in view
our_dw.sales_detail.InventoryCode = '15KID01'" extracting row ids
only with no residual conditions into Spool 4 (all_amps), which is
built locally on the AMPs. The size of Spool 4 is estimated with
high confidence to be 588 rows. The estimated time for this step
is 0.01 seconds.
7) We do an all-AMPs RETRIEVE step from edw_tables.r in view
our_dw.sales_detail by way of index # 4 "edw_tables.r in view
our_dw.sales_detail.InventoryCode = '15KIB01'" extracting row ids
only with no residual conditions into Spool 4 (all_amps), which is
built locally on the AMPs. The size of Spool 4 is estimated with
high confidence to be 784 rows. The estimated time for this step
is 0.01 seconds.
8) We do an all-AMPs RETRIEVE step from edw_tables.r in view
our_dw.sales_detail by way of index # 4 "edw_tables.r in view
our_dw.sales_detail.InventoryCode = '15KIA00'" extracting row ids
only with no residual conditions into Spool 4 (all_amps), which is
built locally on the AMPs. The size of Spool 4 is estimated with
high confidence to be 980 rows. The estimated time for this step
is 0.01 seconds.
9) We do an all-AMPs RETRIEVE step from edw_tables.r in view
our_dw.sales_detail by way of index # 4 "edw_tables.r in view
our_dw.sales_detail.InventoryCode = '15KIB02'" extracting row ids
only with no residual conditions into Spool 4 (all_amps), which is
built locally on the AMPs. The size of Spool 4 is estimated with
high confidence to be 1,176 rows. The estimated time for this
step is 0.01 seconds.
10) We do an all-AMPs RETRIEVE step from edw_tables.r in view
our_dw.sales_detail by way of index # 4 "edw_tables.r in view
our_dw.sales_detail.InventoryCode = '15KIC00'" extracting row ids
only with no residual conditions into Spool 4 (all_amps), which is
built locally on the AMPs. The size of Spool 4 is estimated with
high confidence to be 1,372 rows. The estimated time for this
step is 0.01 seconds.
11) We do an all-AMPs RETRIEVE step from edw_tables.r in view
our_dw.sales_detail by way of index # 4 "edw_tables.r in view
our_dw.sales_detail.InventoryCode = '15KIA01'" extracting row ids
only with no residual conditions into Spool 4 (all_amps), which is
built locally on the AMPs. The size of Spool 4 is estimated with
high confidence to be 1,568 rows. The estimated time for this
step is 0.01 seconds.
12) We do an all-AMPs RETRIEVE step from edw_tables.r in view
our_dw.sales_detail by way of index # 4 "edw_tables.r in view
our_dw.sales_detail.InventoryCode = '15KIB00'" extracting row ids
only with no residual conditions into Spool 4 (all_amps), which is
built locally on the AMPs. The size of Spool 4 is estimated with
high confidence to be 1,764 rows. The estimated time for this
step is 0.01 seconds.
13) We do an all-AMPs RETRIEVE step from edw_tables.r in view
our_dw.sales_detail by way of index # 4 "edw_tables.r in view
our_dw.sales_detail.InventoryCode = '15KIC01'" extracting row ids
only with no residual conditions into Spool 4 (all_amps), which is
built locally on the AMPs. The size of Spool 4 is estimated with
high confidence to be 1,960 rows. The estimated time for this
step is 0.01 seconds.
14) We do an all-AMPs RETRIEVE step from edw_tables.r in view
our_dw.sales_detail by way of index # 4 "edw_tables.r in view
our_dw.sales_detail.InventoryCode = '15KIE00'" extracting row ids
only with no residual conditions into Spool 4 (all_amps), which is
built locally on the AMPs. Then we do a SORT to order Spool 4 by
row id eliminating duplicate rows. The size of Spool 4 is
estimated with high confidence to be 2,156 rows. The estimated
time for this step is 0.01 seconds.
15) We do an all-AMPs RETRIEVE step from edw_tables.r in view
our_dw.sales_detail by way of row ids from Spool 4 (Last Use)
with a residual condition of ("((edw_tables.r in view
our_dw.sales_detail.InventoryTypeID = 14) OR (edw_tables.r in view
our_dw.sales_detail.InventoryTypeID = 1 )) AND (((edw_tables.r in
view our_dw.sales_detail.InventoryCode )= '15KID00') OR
(((edw_tables.r in view our_dw.sales_detail.InventoryCode )=
'15KIE01') OR (((edw_tables.r in view
our_dw.sales_detail.InventoryCode )= '15KID01') OR (((edw_tables.r in
view our_dw.sales_detail.InventoryCode )= '15KIB01') OR
(((edw_tables.r in view our_dw.sales_detail.InventoryCode )=
'15KIA00') OR (((edw_tables.r in view
our_dw.sales_detail.InventoryCode )= '15KIB02') OR (((edw_tables.r in
view our_dw.sales_detail.InventoryCode )= '15KIC00') OR
(((edw_tables.r in view our_dw.sales_detail.InventoryCode )=
'15KIA01') OR (((edw_tables.r in view
our_dw.sales_detail.InventoryCode )= '15KIB00') OR (((edw_tables.r in
view our_dw.sales_detail.InventoryCode )= '15KIC01') OR
((edw_tables.r in view our_dw.sales_detail.InventoryCode )=
'15KIE00')))))))))))") into Spool 5 (all_amps), which is
redistributed by the rowkey of (edw_tables.r.InventoryCode) to all AMPs.
Then we do a SORT to partition Spool 5 by rowkey. The size of
Spool 5 is estimated with low confidence to be 1,895 rows (
109,910 bytes). The estimated time for this step is 1.16 seconds.
16) We do an all-AMPs JOIN step from Spool 5 (Last Use) by way of a
RowHash match scan, which is joined to a single partition of
edw_tables.m in view our_dw.sales_detail by way of a RowHash
match scan with a condition of ("(edw_tables.m in view
our_dw.sales_detail.InventoryTypeID = 14) OR (edw_tables.m in view
our_dw.sales_detail.InventoryTypeID = 1)"). Spool 5 and edw_tables.m
are left outer joined using a rowkey-based merge join, with a join
condition of ("(InventoryTypeID = edw_tables.m.InventoryTypeID) AND
(InventoryCode = edw_tables.m.InventoryCode)"). The result goes into Spool
3 (all_amps), which is built locally on the AMPs. The size of
Spool 3 is estimated with low confidence to be 1,895 rows (
147,810 bytes). The estimated time for this step is 0.01 seconds.
17) We do an all-AMPs SUM step to aggregate from Spool 3 (Last Use) by
way of an all-rows scan , grouping by field1 (
edw_tables.r.CustomerID ,edw_tables.r.AccountNumber
,edw_tables.r.InventoryTypeID ,edw_tables.r.StoreID
,edw_tables.r.SalesDate ,edw_tables.r.InventoryCode
,edw_tables.m.InventoryDescription). Aggregate Intermediate Results are
computed globally, then placed in Spool 8. The size of Spool 8 is
estimated with low confidence to be 1,895 rows (348,680 bytes).
The estimated time for this step is 0.03 seconds.
18) We do an all-AMPs RETRIEVE step from Spool 8 (Last Use) by way of
an all-rows scan into Spool 1 (group_amps), which is built locally
on the AMPs. The size of Spool 1 is estimated with low confidence
to be 1,895 rows (223,610 bytes). The estimated time for this
step is 0.01 seconds.
19) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1.
No rows are returned to the user as the result of statement 2.
The total estimated time is 1.29 seconds.

2nd query above:

  1) First, we lock a distinct edw_tables."pseudo table" for read on a
RowHash to prevent global deadlock for edw_tables.sales_master.
2) Next, we lock edw_tables.sales_master in view our_dw.inventory_summary
for read.
3) We do an all-AMPs RETRIEVE step from a single partition of
edw_tables.sales_master in view our_dw.inventory_summary with a
condition of ("edw_tables.sales_master in view
our_dw.inventory_summary.InventoryTypeID = 1") with a residual condition
of ("(edw_tables.sales_master in view
our_dw.inventory_summary.InventoryDescription LIKE 'Widget B%') AND
(edw_tables.sales_master in view our_dw.inventory_summary.InventoryTypeID =
1)") into Spool 1 (group_amps), which is built locally on the AMPs.
The size of Spool 1 is estimated with no confidence to be 91,068
rows (2,549,904 bytes). The estimated time for this step is 0.04
seconds.
4) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1.
No rows are returned to the user as the result of statement 2.
The total estimated time is 0.04 seconds.

3rd query above:

  1) First, we lock a distinct edw_tables."pseudo table" for read on a
RowHash to prevent global deadlock for edw_tables.r.
2) Next, we lock a distinct edw_tables."pseudo table" for read on a
RowHash to prevent global deadlock for edw_tables.sales_master.
3) We lock edw_tables.r in view our_dw.sales_detail for read, and we
lock edw_tables.sales_master for read.
4) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from 2 partitions of
edw_tables.r in view our_dw.sales_detail with a condition of
("(edw_tables.r in view our_dw.sales_detail.InventoryTypeID = 14)
OR (edw_tables.r in view our_dw.sales_detail.InventoryTypeID = 1)")
into Spool 4 (all_amps) fanned out into 4 hash join
partitions, which is built locally on the AMPs. The input
table will not be cached in memory, but it is eligible for
synchronized scanning. The size of Spool 4 is estimated with
low confidence to be 48,374,730 rows (2,805,734,340 bytes).
The estimated time for this step is 41.26 seconds.
2) We do an all-AMPs RETRIEVE step from a single partition of
edw_tables.sales_master with a condition of (
"edw_tables.sales_master.InventoryTypeID = 1") with a residual
condition of ("(edw_tables.sales_master.InventoryDescription LIKE
'Widget B%') AND (edw_tables.sales_master.InventoryTypeID = 1)")
into Spool 6 (all_amps), which is built locally on the AMPs.
Then we do a SORT to order Spool 6 by the sort key in spool
field1 (edw_tables.sales_master.InventoryCode) eliminating
duplicate rows. The size of Spool 6 is estimated with no
confidence to be 91,068 rows (3,278,448 bytes). The
estimated time for this step is 0.21 seconds.
5) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from Spool 6 (Last Use) by
way of an all-rows scan into Spool 5 (all_amps) fanned out
into 4 hash join partitions, which is duplicated on all AMPs.
The size of Spool 5 is estimated with no confidence to be
1,092,816 rows (39,341,376 bytes).
2) We do an all-AMPs RETRIEVE step from a single partition of
edw_tables.m in view our_dw.sales_detail with a condition of
("(edw_tables.m in view our_dw.sales_detail.InventoryTypeID = 14)
OR (edw_tables.m in view our_dw.sales_detail.InventoryTypeID = 1)")
into Spool 7 (all_amps), which is redistributed by the hash
code of (edw_tables.m.InventoryTypeID, edw_tables.m.InventoryCode) to all
AMPs. The size of Spool 7 is estimated with low confidence
to be 101,186 rows (4,249,812 bytes). The estimated time for
this step is 0.11 seconds.
6) We do an all-AMPs JOIN step from Spool 4 (Last Use) by way of an
all-rows scan, which is joined to Spool 5 (Last Use) by way of an
all-rows scan. Spool 4 and Spool 5 are joined using a inclusion
hash join of 4 partitions, with a join condition of ("InventoryCode =
InventoryCode"). The result goes into Spool 8 (all_amps), which is
redistributed by the hash code of (edw_tables.r.InventoryTypeID,
edw_tables.r.InventoryCode) to all AMPs. The size of Spool 8 is
estimated with no confidence to be 46,103,186 rows (2,673,984,788
bytes). The estimated time for this step is 48.87 seconds.
7) We do an all-AMPs JOIN step from Spool 7 (Last Use) by way of an
all-rows scan, which is joined to Spool 8 (Last Use) by way of an
all-rows scan. Spool 7 and Spool 8 are right outer joined using a
single partition hash join, with condition(s) used for
non-matching on right table ("NOT (InventoryCode IS NULL)"), with a
join condition of ("(InventoryTypeID = InventoryTypeID) AND (InventoryCode =
InventoryCode)"). The result goes into Spool 3 (all_amps), which is
built locally on the AMPs. The size of Spool 3 is estimated with
no confidence to be 46,103,187 rows (3,596,048,586 bytes). The
estimated time for this step is 31.12 seconds.
8) We do an all-AMPs SUM step to aggregate from Spool 3 (Last Use) by
way of an all-rows scan , grouping by field1 (
edw_tables.r.CustomerID ,edw_tables.r.AccountNumber
,edw_tables.r.InventoryTypeID ,edw_tables.r.StoreID
,edw_tables.r.SalesDate ,edw_tables.r.InventoryCode
,edw_tables.m.InventoryDescription). Aggregate Intermediate Results are
computed globally, then placed in Spool 11. The aggregate spool
file will not be cached in memory. The size of Spool 11 is
estimated with no confidence to be 34,577,391 rows (6,362,239,944
bytes). The estimated time for this step is 4 minutes and 39
seconds.
9) We do an all-AMPs RETRIEVE step from Spool 11 (Last Use) by way of
an all-rows scan into Spool 1 (group_amps), which is built locally
on the AMPs. The size of Spool 1 is estimated with no confidence
to be 34,577,391 rows (4,080,132,138 bytes). The estimated time
for this step is 25.61 seconds.
10) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1.
No rows are returned to the user as the result of statement 2.
Teradata Employee

Re: very fast query + very fast query = very slow query ???

If you look closely at the step that implements edw_tables.sales_master.InventoryDescription LIKE 'Widget B%', you will see an estimate of "with no confidence to be 91,068 rows". With an estimate that large, the optimizer builds a plan that is very "safe," in that it will work for any size of return set without causing really bad performance if the estimate turns out to be too low. Unfortunately, the safe plan misses the fast plan when the estimate is a large overestimate.

Are there statistics on InventoryDescription? That would give the optimizer some chance of estimating the number of rows selected by the LIKE condition. However, even that is not a guarantee since LIKE is very hard to estimate properly.

When you place the results in the temp table, we get to count the resulting rows before making the rest of the plan so we can use the fast plan when we know only a few rows have been selected.

In addition to collecting stats on InventoryDescription, can you provide the explain plan when you use the temp table?
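For reference, collecting those statistics would look something like this (the base-table name edw_tables.sales_master is taken from your Explain output, so adjust it to your actual schema; stats on InventoryCode help the IN/JOIN estimates as well):

```sql
/* assumption: edw_tables.sales_master is the base table behind inventory_summary */
COLLECT STATISTICS COLUMN (InventoryDescription) ON edw_tables.sales_master;
COLLECT STATISTICS COLUMN (InventoryCode) ON edw_tables.sales_master;
```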

Enthusiast

Re: very fast query + very fast query = very slow query ???

Thanks Todd, that is interesting. Note that the step you mention (step 4 part 2) is actually the same as query 2, step 3, and that takes a more efficient path (0.04 seconds vs 0.21 seconds). I can live with a few extra milliseconds but it is interesting that one plan is 5x faster.

What it looks like to me is that Teradata is getting fancy with its concept of "parallelism," using the opportunity to retrieve from the final, huge table (sales_detail) "in parallel" with the "LIKE 'Widget B%'" subquery (i.e. Step 4 parts 1 and 2). This may work conceptually but is horrible in reality -- a 0.21-second query and a 41-second query are not, practically speaking, processing "in parallel": when one part is finished, the other still has 99+% of the way to go.

Finally, all the steps after #5 just seem like an attempt to fix an execution path gone awry... steps 6 and 7 require JOINs and step 8 does an aggregate sum, all with longer and longer execution-time estimates (in reality they are overstated -- the entire query takes 40-60 seconds -- but that is still way too long).

Below is the execution plan after creating a temp table (and using the IN clause). You're correct that in this case, Teradata recognizes that the row counts will be much smaller.

1) First, we lock a distinct big_edw."pseudo table" for read on a
RowHash to prevent global deadlock for big_edw.r.
2) Next, we lock a distinct big_edw."pseudo table" for read on a
RowHash to prevent global deadlock for big_edw.m.
3) We lock big_edw.r in view our_dw.sales_detail for read, and we
lock big_edw.m in view our_dw.sales_detail for read.
4) We do an all-AMPs RETRIEVE step from JSMITH.temp_codes by way
of an all-rows scan with no residual conditions into Spool 5
(all_amps), which is built locally on the AMPs. Then we do a SORT
to order Spool 5 by the sort key in spool field1 (
JSMITH.temp_codes.InventoryCode) eliminating duplicate rows. The
size of Spool 5 is estimated with high confidence to be 12 rows (
432 bytes). The estimated time for this step is 0.01 seconds.
5) We do an all-AMPs RETRIEVE step from Spool 5 (Last Use) by way of
an all-rows scan into Spool 4 (all_amps), which is duplicated on
all AMPs. Then we do a SORT to order Spool 4 by the hash code of
(JSMITH.temp_codes.InventoryCode). The size of Spool 4 is
estimated with high confidence to be 144 rows (5,184 bytes).
6) We do an all-AMPs JOIN step from Spool 4 (Last Use) by way of an
all-rows scan, which is joined to big_edw.r in view
our_dw.sales_detail by way of a traversal of index # 4 without
accessing the base table extracting row ids only. Spool 4 and
big_edw.r are joined using a nested join, with a join condition
of ("big_edw.r.InventoryCode = InventoryCode"). The input table
big_edw.r will not be cached in memory. The result goes into
Spool 6 (all_amps), which is built locally on the AMPs. Then we
do a SORT to order Spool 6 by field Id 1. The size of Spool 6 is
estimated with low confidence to be 6,075 rows (182,250 bytes).
The estimated time for this step is 0.07 seconds.
7) We do an all-AMPs JOIN step from Spool 6 (Last Use) by way of an
all-rows scan, which is joined to 2 partitions of big_edw.r in
view our_dw.sales_detail with a condition of ("(big_edw.r in
view our_dw.sales_detail.InventoryTypeID = 14) OR (big_edw.r in
view our_dw.sales_detail.InventoryTypeID = 1)"). Spool 6 and
big_edw.r are joined using a row id join, with a join condition
of ("big_edw.r.InventoryCode = InventoryCode"). The input table
big_edw.r will not be cached in memory. The result goes into
Spool 7 (all_amps), which is redistributed by the rowkey of (
big_edw.r.InventoryCode) to all AMPs. Then we do a SORT to partition
Spool 7 by rowkey. The size of Spool 7 is estimated with low
confidence to be 6,075 rows (352,350 bytes). The estimated time
for this step is 3.20 seconds.
8) We do an all-AMPs JOIN step from Spool 7 (Last Use) by way of a
RowHash match scan, which is joined to a single partition of
big_edw.m in view our_dw.sales_detail by way of a RowHash
match scan with a condition of ("(big_edw.m in view
our_dw.sales_detail.InventoryTypeID = 14) OR (big_edw.m in view
our_dw.sales_detail.InventoryTypeID = 1)"). Spool 7 and big_edw.m
are left outer joined using a rowkey-based merge join, with
condition(s) used for non-matching on left table ("NOT (InventoryCode
IS NULL)"), with a join condition of ("(InventoryTypeID =
big_edw.m.InventoryTypeID) AND (InventoryCode = big_edw.m.InventoryCode)").
The result goes into Spool 3 (all_amps), which is built locally on
the AMPs. The size of Spool 3 is estimated with low confidence to
be 6,075 rows (473,850 bytes). The estimated time for this step
is 0.02 seconds.
9) We do an all-AMPs SUM step to aggregate from Spool 3 (Last Use) by
way of an all-rows scan , grouping by field1 (
big_edw.r.CustomerID ,big_edw.r.AccountNumber
,big_edw.r.InventoryTypeID ,big_edw.r.StoreID
,big_edw.r.SalesDate ,big_edw.r.InventoryCode
,big_edw.m.InventoryDescription). Aggregate Intermediate Results are
computed globally, then placed in Spool 10. The size of Spool 10
is estimated with low confidence to be 6,075 rows (1,117,800
bytes). The estimated time for this step is 0.03 seconds.
10) We do an all-AMPs RETRIEVE step from Spool 10 (Last Use) by way of
an all-rows scan into Spool 1 (group_amps), which is built locally
on the AMPs. The size of Spool 1 is estimated with low confidence
to be 6,075 rows (716,850 bytes). The estimated time for this
step is 0.02 seconds.
11) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1.
No rows are returned to the user as the result of statement 2.
Enthusiast

Re: very fast query + very fast query = very slow query ???

Can you try the query below:

select
r.*

from
our_dw.sales_detail r

where
r.InventoryCode in (select InventoryCode from our_dw.inventory_summary where ItemDescription like 'Widget B%' group by 1);
Enthusiast

Re: very fast query + very fast query = very slow query ???

talk2soumya, adding "group by 1" had no effect on the execution plan; in reality it added a few seconds to the execution time (52 seconds vs. 46).

I do appreciate everyone's input on this.

Enthusiast

Re: very fast query + very fast query = very slow query ???

Please provide the table structures and statistics information for both tables.

Also, the explain plans for the 2 SQLs below:

/* execution time: 151.8 seconds */
select
r.*

from
our_dw.sales_detail r inner join
temp_codes c on r.InventoryCode = c.InventoryCode

;

/* execution time: 0.5 seconds AHA! */
/* why isn't Teradata smart enough to run ALL the above queries like this ?? */
select
r.*

from
our_dw.sales_detail r

where
r.InventoryCode in (select InventoryCode from temp_codes)

Enthusiast

Re: very fast query + very fast query = very slow query ???

kirthi, the execution plan for the version with the temp table is posted in my "Thanks Todd" post. (Admins: an essential part of a forum is the ability to uniquely identify and link to specific posts!) I will try to post the other tomorrow.

All, what commands should I be running to get the statistics you're requesting?

Enthusiast

Re: very fast query + very fast query = very slow query ???

Please provide the output of the below:

SHOW STATISTICS ON <table_name>;

HELP STATISTICS <table_name>;

SHOW TABLE <table_name>;