Teradata Optimization needed on query.

Analytics
Enthusiast

Teradata Optimization needed on query.

Hi All,

Below is the explain plan for the query which needs tobe optimized. Sorry could not attach it due to some security problems. Do let me know which part of the explain needs to be modified to optimize the query.
Explanation
1) First, we lock aedwprd1.membership_fact_ for access, we lock
aedwprd1.src_sys_dim_ for access, we lock
aedwprd1.prod_var_ernng_cmpnt_dim_ for access, we lock
aedwprd1.mbrshp_covg_mth_strt_dim_ for access, we lock
aedwprd1.mbrshp_acct_mth_dim_ for access, we lock
aedwprd1.firm_dim_ for access, and we lock aedwprd1.chrtfld_dim_
for access.
2) Next, we do an all-AMPs RETRIEVE step from
aedwprd1.mbrshp_covg_mth_strt_dim_ by way of an all-rows scan with
" a condition of (""(aedwprd1.mbrshp_covg_mth_strt_dim_.yr_mth_nbr <="
200810.) AND (aedwprd1.mbrshp_covg_mth_strt_dim_.yr_mth_nbr >=
" 200801.)"") into Spool 4 (all_amps) (compressed columns allowed),"
which is duplicated on all AMPs. Then we do a SORT to order Spool
4 by row hash. The size of Spool 4 is estimated with low
confidence to be 2,310 rows. The estimated time for this step is
0.00 seconds.
3) We execute the following steps in parallel.
1) We do an all-AMPs JOIN step from Spool 4 (Last Use) by way of
an all-rows scan, which is joined to
aedwprd1.membership_fact_ by way of a traversal of index # 12
without accessing the base table extracting row ids only.
Spool 4 and aedwprd1.membership_fact_ are joined using a
" nested join, with a join condition of (""dt_dt ="
" aedwprd1.membership_fact_.covg_mth_strt_dt""). The input"
table aedwprd1.membership_fact_ will not be cached in memory.
The result goes into Spool 5 (all_amps), which is built
locally on the AMPs. Then we do a SORT to order Spool 5 by
field Id 1. The size of Spool 5 is estimated with low
confidence to be 7,149,465 rows. The estimated time for this
step is 2.59 seconds.
2) We do an all-AMPs RETRIEVE step from aedwprd1.src_sys_dim_ by
way of an all-rows scan with no residual conditions into
Spool 6 (all_amps) (compressed columns allowed), which is
duplicated on all AMPs. The size of Spool 6 is estimated
with high confidence to be 4,830 rows. The estimated time
for this step is 0.01 seconds.
4) We execute the following steps in parallel.
1) We do an all-AMPs JOIN step from Spool 5 (Last Use) by way of
an all-rows scan, which is joined to
aedwprd1.membership_fact_ by way of an all-rows scan with no
residual conditions. Spool 5 and aedwprd1.membership_fact_
are joined using a row id join, with a join condition of (
" ""(1=1)""). The input table aedwprd1.membership_fact_ will not"
be cached in memory. The result goes into Spool 7 (all_amps)
(compressed columns allowed), which is built locally on the
AMPs. The size of Spool 7 is estimated with low confidence
to be 7,149,465 rows. The estimated time for this step is
2.79 seconds.
2) We do an all-AMPs RETRIEVE step from
aedwprd1.mbrshp_acct_mth_dim_ by way of an all-rows scan with
no residual conditions into Spool 8 (all_amps) (compressed
columns allowed), which is duplicated on all AMPs. The size
of Spool 8 is estimated with high confidence to be 149,730
rows. The estimated time for this step is 0.10 seconds.
3) We do an all-AMPs RETRIEVE step from
aedwprd1.prod_var_ernng_cmpnt_dim_ by way of an all-rows scan
with no residual conditions into Spool 9 (all_amps)
(compressed columns allowed), which is duplicated on all AMPs.
Then we do a SORT to order Spool 9 by row hash. The size of
Spool 9 is estimated with high confidence to be 12,754,560
rows. The estimated time for this step is 1.47 seconds.
5) We do an all-AMPs JOIN step from Spool 6 (Last Use) by way of an
all-rows scan, which is joined to Spool 7 (Last Use) by way of an
all-rows scan. Spool 6 and Spool 7 are joined using a product
" join, with a join condition of (""src_sys_key = src_sys_key""). The"
result goes into Spool 10 (all_amps) (compressed columns allowed),
which is built locally on the AMPs. Then we do a SORT to order
Spool 10 by row hash. The size of Spool 10 is estimated with low
confidence to be 7,149,465 rows. The estimated time for this step
is 1.45 seconds.
6) We execute the following steps in parallel.
1) We do an all-AMPs JOIN step from Spool 9 (Last Use) by way of
a RowHash match scan, which is joined to Spool 10 (Last Use)
by way of a RowHash match scan. Spool 9 and Spool 10 are
joined using a merge join, with a join condition of (
" ""prod_var_ec_key = prod_var_ec_key""). The result goes into"
Spool 11 (all_amps) (compressed columns allowed), which is
redistributed by hash code to all AMPs. Then we do a SORT to
order Spool 11 by row hash. The size of Spool 11 is
estimated with low confidence to be 7,149,465 rows. The
estimated time for this step is 3.78 seconds.
2) We do an all-AMPs JOIN step from Spool 8 (Last Use) by way of
an all-rows scan, which is joined to aedwprd1.chrtfld_dim_ by
way of an all-rows scan with a condition of (
" ""((aedwprd1.chrtfld_dim_.cf_covg_cd = '100') OR"
((aedwprd1.chrtfld_dim_.cf_covg_cd = '130') OR
(aedwprd1.chrtfld_dim_.cf_covg_cd = '140'))) AND
(((aedwprd1.chrtfld_dim_.cf_fndg_arr_cd = '10') OR
(aedwprd1.chrtfld_dim_.cf_fndg_arr_cd = '50')) AND
" (aedwprd1.chrtfld_dim_.cf_bus_unit_cd = '01800'))""). Spool 8"
and aedwprd1.chrtfld_dim_ are joined using a product join,
" with a join condition of (""(1=1)""). The result goes into"
Spool 12 (all_amps) (compressed columns allowed), which is
redistributed by hash code to all AMPs. Then we do a SORT to
order Spool 12 by row hash. The size of Spool 12 is
estimated with high confidence to be 10,072,551 rows. The
estimated time for this step is 1.46 seconds.
7) We do an all-AMPs JOIN step from aedwprd1.firm_dim_ by way of a
RowHash match scan with no residual conditions, which is joined to
Spool 11 (Last Use) by way of a RowHash match scan.
aedwprd1.firm_dim_ and Spool 11 are joined using a merge join,
" with a join condition of (""aedwprd1.firm_dim_.firm_key = firm_key""). "
The result goes into Spool 13 (all_amps) (compressed columns
allowed), which is redistributed by hash code to all AMPs. Then
we do a SORT to order Spool 13 by row hash. The size of Spool 13
is estimated with low confidence to be 7,149,465 rows. The
estimated time for this step is 1.47 seconds.
8) We do an all-AMPs JOIN step from Spool 12 (Last Use) by way of a
RowHash match scan, which is joined to Spool 13 (Last Use) by way
of a RowHash match scan. Spool 12 and Spool 13 are joined using a
" merge join, with a join condition of (""(cf_key = cf_key) AND"
" (dt_dt = acct_mth_dt)""). The result goes into Spool 3 (all_amps)"
(compressed columns allowed), which is built locally on the AMPs.
The size of Spool 3 is estimated with low confidence to be
7,149,465 rows. The estimated time for this step is 0.60 seconds.
9) We do an all-AMPs SUM step to aggregate from Spool 3 (Last Use) by
way of an all-rows scan, and the grouping identifier in field 1.
Aggregate Intermediate Results are computed globally, then placed
in Spool 14. The size of Spool 14 is estimated with no confidence
to be 5,362,099 rows. The estimated time for this step is 4.75
seconds.
10) We do an all-AMPs RETRIEVE step from Spool 14 (Last Use) by way of
an all-rows scan into Spool 1 (group_amps), which is built locally
on the AMPs. The size of Spool 1 is estimated with no confidence
to be 5,362,099 rows. The estimated time for this step is 0.37
seconds.
11) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 17.80 seconds.
4 REPLIES
Enthusiast

Re: Teradata Optimization needed on query.

One way to avoid a product join is to supply a connecting term between the tables where the operator of the term is =. (These terms are called BIND TERMS.)
Sometimes product join is not unfavorable too.
Check for any Between clauses in the query.
Above all have the statistics refreshed.

Enthusiast

Re: Teradata Optimization needed on query.

Thanks for your reply.
We have collected stats on all the columns of the table.
Yes we have a between clause in the condition. Can you suggest what could be done to optimize the query?
Also could you elaborate on BIND TERMS.
Below is the from clause along with the conditions :
FROM
aedwprod.SERV_ACCT_MTH_DIM,
aedwprod.SERV_INCUR_DT_DIM,
aedwprod.SERV_CF_DIM,
aedwprod.FIRM_DIM,
aedwprod.SERV_COC_DIM,
aedwprod.SERV_TPA_DIM,
aedwprod.SERV_PROD_VAR_EC_DIM,
aedwprod.SERV_FACT,
aedwprod.SERV_REVNU_DIM,
aedwprod.SERV_CLM_DIM,
IBNR_STAGE.STG_PLACE_SERVICE PLCOFSER
WHERE
( aedwprod.SERV_COC_DIM.COC_KEY=aedwprod.SERV_FACT.COC_KEY )
AND ( aedwprod.SERV_CLM_DIM.CLM_KEY=aedwprod.SERV_FACT.CLM_KEY )
AND ( aedwprod.SERV_INCUR_DT_DIM.DT_DT =aedwprod.SERV_FACT.INCUR_DT )
AND ( aedwprod.SERV_REVNU_DIM.REVNU_KEY=aedwprod.SERV_FACT.REVNU_KEY )
AND ( aedwprod.SERV_TPA_DIM.TPA_KEY=aedwprod.SERV_FACT.TPA_KEY )
AND ( aedwprod.SERV_CF_DIM.CF_KEY=aedwprod.SERV_FACT.CF_KEY )
AND ( aedwprod.SERV_FACT.PROD_VAR_EC_KEY=aedwprod.SERV_PROD_VAR_EC_DIM.PROD_VAR_EC_KEY )
AND ( aedwprod.FIRM_DIM.FIRM_KEY=aedwprod.SERV_FACT.FIRM_KEY )
AND
aedwprod.SERV_FACT.ACCT_MTH_DT BETWEEN 1081001
AND 1081001
AND aedwprod.SERV_CF_DIM.CF_JURIS_CD IN ('CT000', 'NJ000', 'NY000')
AND aedwprod.SERV_FACT.ZERO_BAL_IND = 'N'
AND PLCOFSER.PLACE_SERV_DESC =
CASE
WHEN aedwprod.SERV_COC_DIM.RPT_TYP_C = 'UNKNOWN' THEN ''
WHEN aedwprod.SERV_COC_DIM.RPT_TYP_C = '*' THEN ''
ELSE
aedwprod.SERV_COC_DIM.RPT_TYP_C
END
Enthusiast

Re: Teradata Optimization needed on query.

Other then stat collection U can do these things to optimize u'r query.

1) make value ordered SI on SERV_FACT.ACCT_MTH_DT.

2) Before applying or implementating point 2 analyze the load strategy for the tables.

for all join condition try to implement JI for known and frequent join condition.

3) Also analyze your query for multi level PPI.

try these solutions and lets see what will happen.
Enthusiast

Re: Teradata Optimization needed on query.

You are not mentioning any Join condition between tables and this is causing a Product Join. Make all the conditions in WHERE a part of Join. The Optimiser is making a choice by itself for the better path and it is a PROD JOIN.