ABNORMAL BEHAVIOR OF QUERY IN TD14.10 after upgrade from TD14

Database
Enthusiast

ABNORMAL BEHAVIOR OF QUERY IN TD14.10 after upgrade from TD14

Hi Forum,

We have upgraded our system from TD14 to TD14.10 (2days back) and facing abnormal behavior of one query.

The query when executed from SQLA, Bteq , TDStudio provided the result in less than 10 seconds.  Whereas, the same query running from front end application (Websphere app) runs for more than 1.5hrs and consumes lots of system resources.

We do not have any control on front end application as it’s a GUI and the client suspects the issue on upgrade.

The only change we have when we done when runing in SQLA or TDStudio is we passed the parameters to the query (some values or null values), whereas the front end app takes the values dynamically.

The query is stuck at 1 point (Product join) and continues to be hanging and the system gets jammed when the app fires more no. of similar queries.

Your help on this would be highly appreciated.

FYI:

Database: TD 14.10

Client JDBC version: 14

Web Sphere app: 8.5.0.2

1). Stats refreshed after upgrade.  2). Stats Dropped and re collected.  3). System restarted to clear cache.  4). Web servers restarted.  In the runtime explain plan of web server fired query, there is huge difference between actual rows and estimated rows

EST. TIME  00:00:03.006

ACTUAL TIME

EST. ROWS

1

ACTUAL ROWS

http://130.143.38.15/SessionsPortlet/images/active.png ACTIVE

We do an All-AMPs JOIN step from Spool 1970 (Last Use) by way of an all-rows scan, which is joined to table CNTCT_MCHNSM. Spool 1970 and table CNTCT_MCHNSM are joined using a product join . The result goes into Spool 1971, which is redistributed by hash code to all AMPs.

   SELECT businesspa0_.BSNSS_PRTNR_ID AS col_0_0_,

    businesspa0_.ACCNT_TYP AS col_1_0_,

    businesspa0_.NM_1 AS col_2_0_,

    businesspa0_.NM_2 AS col_3_0_,

    businesspa0_.NM_3 AS col_4_0_,

    businesspa0_.NM_4 AS col_5_0_,

    businesspa0_.INTRNTNL_NM_1 AS col_6_0_,

    businesspa0_.INTRNTNL_NM_2 AS col_7_0_,

    businesspa0_.INTRNTNL_NM_3 AS col_8_0_,

    businesspa0_.INTRNTNL_NM_4 AS col_9_0_,

    businesspa0_.VT_NMBR AS col_10_0_,

    businesspa0_.LFCYCL AS col_11_0_,

    businesspa0_.CRTD_BY_IN_SRC AS col_12_0_,

    businesspa0_.LST_MDFD_BY_IN_SRC AS col_13_0_,

    businesspa0_.VRSN AS col_14_0_,

    language6_.CD_CL AS col_15_0_,

    contactmec1_.CNTCT_MCHNSM_ID AS col_16_0_,

    contactmec1_.ADDRSS_LN_1 AS col_17_0_,

    contactmec1_.ADDRSS_LN_2 AS col_18_0_,

    contactmec1_.ADDRSS_LN_3 AS col_19_0_,

    contactmec1_.DSTRCT AS col_20_0_,

    contactmec1_.ADDRSS_LN_4 AS col_21_0_,

    contactmec1_.ADDRSS_LN_5 AS col_22_0_,

    contactmec1_.CTY AS col_23_0_,

    contactmec1_.PSTL_CD AS col_24_0_,

    country5_.CD_CL AS col_25_0_,

    country5_.GGRPHC_AR_ID AS col_26_0_,

    contactmec1_.TLPHN_NMBR AS col_27_0_,

    contactmec1_.EML_ADDRSS AS col_28_0_,

    contactmec1_.P_O_BX_NMBR AS col_29_0_,

    contactmec1_.P_O_BX_PSTL_CD AS col_30_0_,

    contactmec1_.P_O_BX_CTY AS col_31_0_,

    contactmec1_.HS_NMBR AS col_32_0_,

    contactmec1_.HS_NMBR_SPPLMNT AS col_33_0_,

    contactmec1_.BLDNG_CD AS col_34_0_,

    contactmec1_.FLR AS col_35_0_,

    contactmec1_.RM AS col_36_0_,

    language6_.VRSN AS col_37_0_,

    contactmec1_.VRSN AS col_38_0_,

    country5_1_.VRSN AS col_39_0_,

    trilliumex3_.WNDW_KY_1 AS col_40_0_,

    trilliumex3_.WNDW_KY_2 AS col_41_0_,

    trilliumex3_.WNDW_KY_3 AS col_42_0_,

    trilliumex3_.WNDW_KY_4 AS col_43_0_,

    trilliumex3_.WNDW_KY_5 AS col_44_0_,

    trilliumex3_.WNDW_KY_6 AS col_45_0_,

    trilliumex3_.DT_QLTY_CD AS col_46_0_,

    trilliumex3_.PRSR_TNNG_CD AS col_47_0_,

    trilliumex3_.CLNSNG_STTS AS col_48_0_,

    trilliumex3_.MTCH_STRT_NM AS col_49_0_,

    trilliumex3_.MTCH_HS_NMBR AS col_50_0_,

    trilliumex3_.MTCH_STRT_TYP AS col_51_0_,

    trilliumex3_.MTCH_BSNSS_NM AS col_52_0_,

    trilliumex3_.MTCH_BLDNG_NM_1 AS col_53_0_,

    trilliumex3_.MTCH_BLDNG_TYP_1 AS col_54_0_,

    trilliumex3_.MTCH_BLDNG_NM_2 AS col_55_0_,

    trilliumex3_.MTCH_BLDNG_TYP_2 AS col_56_0_,

    trilliumex3_.MTCH_APRTMNT_NMBR AS col_57_0_,

    usagetype4_.TYP AS col_58_0_,

    countrysub2_.NM AS col_59_0_,

    countrysub2_.CNTRY_SBDVSN_CD AS col_60_0_,

    trilliumex3_.TRLLM_EXTNSN_ID AS col_61_0_,

    trilliumex3_.VRSN AS col_62_0_,

    usagetype4_.TYP_ID AS col_63_0_,

    usagetype4_.VRSN AS col_64_0_,

    countrysub2_.GGRPHC_AR_ID AS col_65_0_,

    countrysub2_1_.VRSN AS col_66_0_

    FROM D_BUSINESS_PARTNER.V_BSNSS_PRTNR businesspa0_ LEFT OUTER JOIN D_BUSINESS_PARTNER.V_CNTCT_MCHNSM contactmec1_ ON businesspa0_.BSNSS_PRTNR_ID=contactmec1_.BSNSS_PRTNR AND contactmec1_.ACTV <> 'N' LEFT OUTER JOIN D_REFERENCE_DATA.V_CNTRY_SBDVSN countrysub2_ ON contactmec1_.CNTRY_SBDVSN=countrysub2_.GGRPHC_AR_ID LEFT OUTER JOIN D_REFERENCE_DATA.V_GGRPHC_AR countrysub2_1_ ON countrysub2_.GGRPHC_AR_ID=countrysub2_1_.GGRPHC_AR_ID LEFT OUTER JOIN D_BUSINESS_PARTNER.V_TRLLM_EXTNSN trilliumex3_ ON contactmec1_.CNTCT_MCHNSM_ID=trilliumex3_.CNTCT_MCHNSM_FK_0 AND trilliumex3_.ACTV <> 'N' LEFT OUTER JOIN D_BUSINESS_PARTNER.V_USG_TYP usagetype4_ ON contactmec1_.USG_TYP=usagetype4_.TYP_ID LEFT OUTER JOIN D_REFERENCE_DATA.V_CNTRY country5_ ON businesspa0_.CNTRY=country5_.GGRPHC_AR_ID LEFT OUTER JOIN D_REFERENCE_DATA.V_GGRPHC_AR country5_1_ ON country5_.GGRPHC_AR_ID=country5_1_.GGRPHC_AR_ID LEFT OUTER JOIN D_REFERENCE_DATA.V_LNGG language6_ ON businesspa0_.LNGG=language6_.CD_CL

    WHERE businesspa0_.ACTV <> 'N'

        AND  (businesspa0_.BSNSS_PRTNR_ID NOT IN  ('402138135'))

        AND  (businesspa0_.LFCYCL='draft')

        AND  usagetype4_.TYP='1'

        AND  (trilliumex3_.WNDW_KY_1 = '' OR trilliumex3_.WNDW_KY_2='' OR trilliumex3_.WNDW_KY_3='' OR trilliumex3_.WNDW_KY_4='' OR trilliumex3_.WNDW_KY_5='' OR trilliumex3_.WNDW_KY_6='');

Runtime Explain plan:

Session

347775

                                 

Request

5

                                 

Collection Time

37:34.7

                                 

Step Number

Confidence

Est. Rows

Actual Rows

Est. Time

Actual Time

Step Text

                       
                                     

1

0

0

48

0

0

First, lock T_BUSINESS_PARTNER.USG_TYP for access, we lock T_BUSINESS_PARTNER.TRLLM_EXTNSN for access, we lock T_BUSINESS_PARTNER.CNTCT_MCHNSM for access, we lock T_BUSINESS_PARTNER.BSNSS_PRTNR for access and we lock T_REFERENCE_DATA.CNTRY for access.

2

2

1

520695

2.027329

1.13

Next, we do an All-AMPs RETRIEVE step from T_BUSINESS_PARTNER.BSNSS_PRTNR by way of an all-rows scan into Spool 1968, which is redistributed by hash code to all AMPs.

3

2

1

520695

0.020052

1.01

We do an All-AMPs JOIN step from Spool 1968 (Last Use) by way of an all-rows scan, which is joined to table CNTRY. Spool 1968 and table CNTRY are left outer joined using a merge join . The result goes into Spool 1969, which is redistributed by hash code to all AMPs.

4

2

48

2.50E+07

0.034313

22.2

We a two-AMP JOIN step from Spool 1969 (Last Use) by way of an all-rows scan, which is joined to table GGRPHC_AR. Spool 1969 and table GGRPHC_AR are left outer joined using a merge join . The result goes into Spool 1970, which is duplicated on all AMPs.

5

2

1

524314

3.006738

1496.26

We do an All-AMPs JOIN step from Spool 1970 (Last Use) by way of an all-rows scan, which is joined to table CNTCT_MCHNSM. Spool 1970 and table CNTCT_MCHNSM are joined using a product join . The result goes into Spool 1971, which is redistributed by hash code to all AMPs.

6

2

1

482032

0.020025

1.1

We do an All-AMPs JOIN step from T_BUSINESS_PARTNER.USG_TYP by way of an all-rows scan, which is joined to Spool 1971. table USG_TYP and Spool 1971 are joined using a merge join . The result goes into Spool 1972, which is redistributed by hash code to all AMPs.

7

2

2

 

0.020152

0

We a two-AMP JOIN step from Spool 1972 (Last Use) by way of an all-rows scan, which is joined to table LNGG. Spool 1972 and table LNGG are left outer joined using a merge join . The result goes into Spool 1973, which is redistributed by hash code to all AMPs.

8

2

3

 

0.020237

 

We a two-AMP JOIN step from Spool 1973 (Last Use) by way of an all-rows scan, which is joined to table CNTRY_SBDVSN. Spool 1973 and table CNTRY_SBDVSN are left outer joined using a merge join . The result goes into Spool 1974, which is redistributed by hash code to all AMPs.

9

2

144

 

0.034383

 

We a two-AMP JOIN step from Spool 1974 (Last Use) by way of an all-rows scan, which is joined to table GGRPHC_AR. Spool 1974 and table GGRPHC_AR are left outer joined using a merge join . The result goes into Spool 1975, which is duplicated on all AMPs.

10

2

3

 

3.316798

 

We do an All-AMPs JOIN step from Spool 1975 (Last Use) by way of an all-rows scan, which is joined to table TRLLM_EXTNSN. Spool 1975 and table TRLLM_EXTNSN are joined using a dynamic hash join . The result goes into Spool 1967, which is built locally on the AMPs.

                                     
                                     
                                     

Thank you,

Sravan.

3 REPLIES
Senior Apprentice

Re: ABNORMAL BEHAVIOR OF QUERY IN TD14.10 after upgrade from TD14

Hi Sravan,

it's hard to read because the runtime explain table is skrewed up, but:

the optimizer assumes that this condition

    WHERE businesspa0_.ACTV <> 'N'
AND (businesspa0_.BSNSS_PRTNR_ID NOT IN ('402138135'))
AND (businesspa0_.LFCYCL='draft')

returns 1 row (or zero  rows-> estimated 1) instead of 520695, that's the root problem.

This explan is from the Websphere client?

Is this SQL always performing bad now or just in a few cases?

Assuming it's actually using Prepared SQL (and the SQL is from SQLA), did you check qryLogV.CacheFlag if it's using a generic (T or G) or a specific plan (S or A)? The optimizer decides to use generic or specific during the first few runs of the prepared statement, maybe this decision changed in 14.10 (due to unknown reasons) and it's doing a generic instead of the previous specific plan. This might be bad if the parameters actually result in hugely different number of rows returned. 

Btw, which stats exist for T_BUSINESS_PARTNER.BSNSS_PRTNR?

Hopefully you already contacted Teradata support.

Enthusiast

Re: ABNORMAL BEHAVIOR OF QUERY IN TD14.10 after upgrade from TD14

Hi Dieter,

Thanks a lot and sorry for late response.

Yes, when I have contacted TERADATA CS, we got to know that, its a bug with optimizer.

The optimizer doesn’t have the logic to handle the case with a parameter in “NOT IN” condition, and it incorrectly treats it as a Unique Index case, hence 1 row estimation and high confidence. DR 171447 spawned to fix the estimates for parameterized queries.

Thank you,

Sravan Kumar Bodla.

Re: ABNORMAL BEHAVIOR OF QUERY IN TD14.10 after upgrade from TD14

Hello Sravan, how Were you able to rule out this bug? Was this bug fix now applied to TD 14.10. We are planing to upgrade our system to 14.10, So curious to know the fixes :). We have a lot of queries hitting the database from business people with "NOT IN"  caluse.  Any info from you would be helpful to me.