Database

turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

06-13-2011
11:41 AM

06-13-2011
11:41 AM

Hi Dieter!

I have shown the qry and the explain plan. I collected stats on LOV_DX column(SEC_TSK) and

PRNTACTcolumn(CRT_TM). After this the explain shows high confidence wherever these columns are used, however the query runs for a few minutes longer. Should we consider this an imprvmt in performnce?

I have also notice that there are some NUSI on which the stats have not been collected? Will collecting stats on these help?

Also I have read that reccomended stats may not always be right? So which ones out of the reccomended stats should be useD? Is there any other way to tune this?

Thanks a lot !

explain

SELECT X.c1 AS c1 , X.c2 AS c2 , X.c3 AS c3 , X.c4 AS c4 ,

X.c5 AS c5 , X.c6 AS c6 , X.c7 AS c7 , X.c8 AS c8 , X.c9 AS c9

FROM (

SELECT X.c1 AS c1 , X.c2 AS c2 , X.c3 AS c3 , X.c4 AS c4 , X.c5 AS c5 , X.c6 AS c6 , X.c7 AS c7 , X.c8 AS c8 ,

SUM ( X.c9 ) OVER ( partition BY X.c8 , X.c2 , X.c4 ) AS c9

FROM (

SELECT A."PER_NAME_MONTH" AS c1 , A."DAY_DT" AS c2 ,

CASE

WHEN A."DAY" = 'Monday' THEN '1. Monday'

WHEN A."DAY" = 'Tuesday' THEN '2. Tuesday'

WHEN A."DAY" = 'Wednesday' THEN '3. Wednesday'

WHEN A."DAY" = 'Thursday' THEN '4. Thursday'

WHEN A."DAY" = 'Friday' THEN '5. Friday'

WHEN A."DAY" = 'Saturday' THEN '6. Saturday'

WHEN A."DAY" = 'Sunday' THEN '7. Sunday'

ELSE A."DAY"

END AS c3 ,

CASE

WHEN E."CRT_TM" < 6000000 THEN 'a. 12am-6am'

WHEN E."CRT_TM" BETWEEN 6000000 AND 7000000 THEN 'b. 6am-7am'

WHEN E."CRT_TM" BETWEEN 7000001 AND 8000000 THEN 'c. 7am-8am'

WHEN E."CRT_TM" BETWEEN 8000001 AND 9000000 THEN 'd. 8am-9am'

WHEN E."CRT_TM" BETWEEN 9000001 AND 10000000 THEN 'e. 9am-10am'

WHEN E."CRT_TM" BETWEEN 10000001 AND 11000000 THEN 'f. 10am-11am'

WHEN E."CRT_TM" BETWEEN 11000001 AND 12000000 THEN 'g. 11am-12pm'

WHEN E."CRT_TM" BETWEEN 12000001 AND 13000000 THEN 'h. 12pm-1pm'

WHEN E."CRT_TM" BETWEEN 13000001 AND 14000000 THEN 'i. 1pm-2pm'

WHEN E."CRT_TM" BETWEEN 14000001 AND 15000000 THEN 'j. 2pm-3pm'

WHEN E."CRT_TM" BETWEEN 15000001 AND 16000000 THEN 'k. 3pm-4pm'

WHEN E."CRT_TM" >= 16000001 THEN 'l. After 4pm'

ELSE ''

END AS c4 ,

COUNT ( E."X_TSKID" ) AS c5 , D."ATR_03" AS c6 ,

D."ATR_04" AS c7 , D."SEC_TSK" AS c8 ,

COUNT ( E."ROW" ) AS c9

FROM "DAY_D" A , "LOV_D" B , "LOV_D" C ,

"LOV_DX" D , "PRNTACT" E

WHERE ( B."ROW" = D."ROW"

AND B."ROW" = E."ACT_TYPE"

AND A."ROW" = E."ACTUAL_START"

AND C."NAME" = D."ATR_01"

AND B."TYPE" = 'PI_TSK'

AND C."TYPE" = 'PI_DPT'

AND C."ROW" = E."X_DPT"

AND D."SEC_TSK" = 'Single Sum'

AND C."LONG_DESC" <> 'INC OC'

AND C."LONG_DESC" <> 'OUT OC'

AND A."DAY_DT" >= tImEsTaMp'2011-01-01 00:00:00'

AND E."CRT_TM" <= 16000000 )

GROUP BY c2 , c1 , c6 , c7, c8, c3 , c4

) X

QUALIFY RONUMBER ( ) OVER ( PARTITION BY X.c1 , X.c2 , X.c4 ,

X.c6 , X.c7 , X.c8

ORDER BY c1 , c2 , c4 , c6 ,c7 , c8 ) = 1 ) X

ORDER BY 6

1) First, we lock a distinct TF001."pseudo table" for read on a

RowHash to prevent global deadlock for TF001.E.

2) Next, we lock a distinct TF001."pseudo table" for read on a

RowHash to prevent global deadlock for TF001.D.

3) We lock a distinct TF001."pseudo table" for read on a RowHash to

prevent global deadlock for TF001.C.

4) We lock a distinct TF001."pseudo table" for read on a RowHash to

prevent global deadlock for TF001.A.

5) We lock TF001.E for read, we lock TF001.D for read, we

lock TF001.C for read, and we lock TF001.A for read.

6) We execute the following steps in parallel.

1) We do an all-AMPs RETRIEVE step from TF001.D by way of

an all-rows scan with a condition of (

"(TF001.D.SEC_TSK = 'Single Sum') AND ((NOT

(TF001.D.ATR_01 IS NULL )) AND (NOT

(TF001.D.ROW IS NULL )))") into Spool 5 (all_amps)

(compressed columns allowed), which is built locally on the

AMPs. The size of Spool 5 is estimated with high confidence

to be 1,295 rows. The estimated time for this step is 0.00

seconds.

2) We do an all-AMPs RETRIEVE step from TF001.C by way of

index # 8 ""TYPE" = 'PI_DPT'" with a residual condition of (

"(TF001.C.LONG_DESC <> 'OUT OCC') AND

((TF001.C.LONG_DESC <> 'INC OCC') AND

(TF001.C."TYPE" = 'PI_DPT'))") into Spool 6 (all_amps)

(compressed columns allowed), which is duplicated on all AMPs.

The size of Spool 6 is estimated with high confidence to be

31,500 rows. The estimated time for this step is 0.04

seconds.

3) We do an all-AMPs RETRIEVE step from TF001.B by way of

an all-rows scan with a condition of ("TF001.B."TYPE" =

'PI_TSK'") into Spool 7 (all_amps) (compressed columns

allowed), which is redistributed by hash code to all AMPs.

The size of Spool 7 is estimated with high confidence to be

2,436 rows. The estimated time for this step is 0.02 seconds.

7) We do an all-AMPs JOIN step from Spool 5 (Last Use) by way of an

all-rows scan, which is joined to Spool 6 (Last Use) by way of an

all-rows scan. Spool 5 and Spool 6 are joined using a single

partition hash join, with a join condition of ("NAME = ATR_01").

The result goes into Spool 8 (all_amps) (compressed columns

allowed), which is redistributed by hash code to all AMPs. The

size of Spool 8 is estimated with low confidence to be 1,283 rows.

The estimated time for this step is 0.01 seconds.

8) We do an all-AMPs JOIN step from Spool 7 (Last Use) by way of an

all-rows scan, which is joined to Spool 8 (Last Use) by way of an

all-rows scan. Spool 7 and Spool 8 are joined using a single

partition hash join, with a join condition of ("ROW = ROW").

The result goes into Spool 9 (all_amps) (compressed columns

allowed), which is built locally on the AMPs. The size of Spool 9

is estimated with low confidence to be 1,283 rows. The estimated

time for this step is 0.01 seconds.

9) We do an all-AMPs RETRIEVE step from TF001.E by way of an

all-rows scan with a condition of ("(TF001.E.CRT_TM

<= 16000000.) AND (NOT (TF001.E.X_DPT IS NULL ))") into

Spool 10 (all_amps) (compressed columns allowed), which is

redistributed by hash code to all AMPs. The input table will not

be cached in memory, but it is eligible for synchronized scanning.

The size of Spool 10 is estimated with high confidence to be

40,589,897 rows. The estimated time for this step is 17.10

seconds.

10) We do an all-AMPs RETRIEVE step from TF001.A by way of an

all-rows scan with a condition of ("TF001.A.DAY_DT >=

TIMESTAMP '2011-01-01 00:00:00'") into Spool 11 (all_amps)

(compressed columns allowed) fanned out into 2 hash join

partitions, which is duplicated on all AMPs. The size of Spool 11

is estimated with high confidence to be 2,219,400 rows. The

estimated time for this step is 0.27 seconds.

11) We do an all-AMPs JOIN step from Spool 9 (Last Use) by way of an

all-rows scan, which is joined to Spool 10 (Last Use) by way of an

all-rows scan. Spool 9 and Spool 10 are joined using a product

join, with a join condition of ("(ACT_TYPE = ROW) AND

((ROW = ACT_TYPE) AND (ROW = X_DPT ))"). The

result goes into Spool 12 (all_amps) (compressed columns allowed),

which is built locally on the AMPs into 2 hash join partitions.

The size of Spool 12 is estimated with low confidence to be

8,297,776 rows. The estimated time for this step is 3.75 seconds.

12) We do an all-AMPs JOIN step from Spool 11 (Last Use) by way of an

all-rows scan, which is joined to Spool 12 (Last Use) by way of an

all-rows scan. Spool 11 and Spool 12 are joined using a hash join

of 2 partitions, with a join condition of ("ROW =

ACTUAL_START"). The result goes into Spool 4 (all_amps)

(compressed columns allowed), which is built locally on the AMPs.

The size of Spool 4 is estimated with low confidence to be

8,297,776 rows. The estimated time for this step is 0.80 seconds.

13) We do an all-AMPs SUM step to aggregate from Spool 4 (Last Use) by

way of an all-rows scan, and the grouping identifier in field 1.

Aggregate Intermediate Results are computed globally, then placed

in Spool 13. The size of Spool 13 is estimated with low

confidence to be 6,223,332 rows. The estimated time for this step

is 22.39 seconds.

14) We do an all-AMPs RETRIEVE step from Spool 13 (Last Use) by way of

an all-rows scan into Spool 1 (all_amps) (compressed columns

allowed), which is built locally on the AMPs. The size of Spool 1

is estimated with low confidence to be 6,223,332 rows. The

estimated time for this step is 0.86 seconds.

15) We do an all-AMPs STAT FUNCTION step from Spool 1 (Last Use) by

way of an all-rows scan into Spool 18 (Last Use), which is assumed

to be redistributed by value to all AMPs. The result rows are put

into Spool 16 (all_amps) (compressed columns allowed), which is

built locally on the AMPs.

16) We do an all-AMPs STAT FUNCTION step from Spool 16 (Last Use) by

way of an all-rows scan into Spool 21 (Last Use), which is assumed

to be redistributed by value to all AMPs. The result rows are put

into Spool 20 (all_amps) (compressed columns allowed), which is

built locally on the AMPs.

17) We do an all-AMPs RETRIEVE step from Spool 20 (Last Use) by way of

an all-rows scan with a condition of ("Field_11 = 1") into Spool 2

(all_amps) (compressed columns allowed), which is built locally on

the AMPs. The size of Spool 2 is estimated with no confidence to

be 6,223,332 rows. The estimated time for this step is 0.75

seconds.

18) We do an all-AMPs RETRIEVE step from Spool 2 (Last Use) by way of

an all-rows scan into Spool 24 (group_amps), which is built

locally on the AMPs. Then we do a SORT to order Spool 24 by the

sort key in spool fielX. The size of Spool 24 is estimated with

no confidence to be 6,223,332 rows. The estimated time for this

step is 0.75 seconds.

19) Finally, we send out an END TRANSACTION step to all AMPs involved

in processing the request.

-> The contents of Spool 24 are sent back to the user as the result

of statement 1.

BEGIN RECOMMENDED STATS ->

20) "COLLECT STATISTICS TF001.PRNTACT COLUMN

(ACT_TYPE ,

ACTUAL_START ,X_DPT ,CRT_TM)".

(HighConf)

21) "COLLECT STATISTICS TF001.PRNTACT COLUMN

(ACTUAL_START ,

X_DPT)". (HighConf)

22) "COLLECT STATISTICS TF001.PRNTACT COLUMN

(ACTUAL_START ,

X_DPT ,CRT_TM)". (HighConf)

23) "COLLECT STATISTICS TF001.PRNTACT COLUMN

(X_DPT ,

ACT_TYPE ,ACTUAL_START)". (HighConf)

24) "COLLECT STATISTICS TF001.PRNTACT COLUMN

(ACT_TYPE ,

CRT_TM)". (LowConf)

25) "COLLECT STATISTICS TF001.PRNTACT COLUMN

(ACT_TYPE ,

X_DPT ,CRT_TM)". (HighConf)

26) "COLLECT STATISTICS TF001.PRNTACT COLUMN

(ACT_TYPE ,

ACTUAL_START ,CRT_TM)". (HighConf)

27) "COLLECT STATISTICS TF001.PRNTACT COLUMN

(ACTUAL_START ,

CRT_TM)". (HighConf)

28) "COLLECT STATISTICS TF001.LOV_DX COLUMN (SEC_TSK

,

ROW)". (HighConf)

29) "COLLECT STATISTICS TF001.LOV_DX COLUMN (ATR_01

,SEC_TSK ,

ATR_03 ,ATR_04)". (HighConf)

30) "COLLECT STATISTICS TF001.LOV_DX COLUMN (SEC_TSK

,

ATR_03 ,ATR_04)". (HighConf)

31) "COLLECT STATISTICS TF001.LOV_DX COLUMN (ATR_04)".

(LowConf)

32) "COLLECT STATISTICS TF001.LOV_DX COLUMN (ROW

,SEC_TSK ,

ATR_03 ,ATR_04)". (HighConf)

33) "COLLECT STATISTICS TF001.LOV_DX COLUMN (SEC_TSK

,

ATR_01)". (HighConf)

34) "COLLECT STATISTICS TF001.LOV_DX COLUMN (ROW ,ATR_04)".

(LowConf)

35) "COLLECT STATISTICS TF001.LOV_DX COLUMN (ROW ,ATR_01

,

SEC_TSK ,ATR_03 ,ATR_04)". (HighConf)

36) "COLLECT STATISTICS TF001.LOV_D COLUMN (NAME)". (HighConf)

<- END RECOMMENDED STATS

LOV_DX -no stats on nusi

INDEX LOV_DX_NUSI1 ( OWNR )

INDEX LOV_DX_NUSI2 ( TER_TSK );

show table LOV_D

UNIQUE PRIMARY INDEX LOV_D_UPI ( "TYPE" ,VAL )

INDEX LOV_D_NUSI2 ( ROW )

INDEX LOV_D_NUSI1 ( "TYPE" );

show table PRNTACT no stats on nusi

UNIQUE PRIMARY INDEX ( ROW )

INDEX PRNT_ACT_NUSI1 ( ACTL_END )

INDEX PRNT_ACT_NUSI2 ( ACT )

INDEX PRNT_ACT_NUSI3 (DPT )

INDEX PRNT_ACT_NUSI4 (DT_RCVD );

show table DAY_D

UNIQUE PRIMARY INDEX DAY_D_UPI ( ROW );

I have shown the qry and the explain plan. I collected stats on LOV_DX column(SEC_TSK) and

PRNTACTcolumn(CRT_TM). After this the explain shows high confidence wherever these columns are used, however the query runs for a few minutes longer. Should we consider this an imprvmt in performnce?

I have also notice that there are some NUSI on which the stats have not been collected? Will collecting stats on these help?

Also I have read that reccomended stats may not always be right? So which ones out of the reccomended stats should be useD? Is there any other way to tune this?

Thanks a lot !

explain

SELECT X.c1 AS c1 , X.c2 AS c2 , X.c3 AS c3 , X.c4 AS c4 ,

X.c5 AS c5 , X.c6 AS c6 , X.c7 AS c7 , X.c8 AS c8 , X.c9 AS c9

FROM (

SELECT X.c1 AS c1 , X.c2 AS c2 , X.c3 AS c3 , X.c4 AS c4 , X.c5 AS c5 , X.c6 AS c6 , X.c7 AS c7 , X.c8 AS c8 ,

SUM ( X.c9 ) OVER ( partition BY X.c8 , X.c2 , X.c4 ) AS c9

FROM (

SELECT A."PER_NAME_MONTH" AS c1 , A."DAY_DT" AS c2 ,

CASE

WHEN A."DAY" = 'Monday' THEN '1. Monday'

WHEN A."DAY" = 'Tuesday' THEN '2. Tuesday'

WHEN A."DAY" = 'Wednesday' THEN '3. Wednesday'

WHEN A."DAY" = 'Thursday' THEN '4. Thursday'

WHEN A."DAY" = 'Friday' THEN '5. Friday'

WHEN A."DAY" = 'Saturday' THEN '6. Saturday'

WHEN A."DAY" = 'Sunday' THEN '7. Sunday'

ELSE A."DAY"

END AS c3 ,

CASE

WHEN E."CRT_TM" < 6000000 THEN 'a. 12am-6am'

WHEN E."CRT_TM" BETWEEN 6000000 AND 7000000 THEN 'b. 6am-7am'

WHEN E."CRT_TM" BETWEEN 7000001 AND 8000000 THEN 'c. 7am-8am'

WHEN E."CRT_TM" BETWEEN 8000001 AND 9000000 THEN 'd. 8am-9am'

WHEN E."CRT_TM" BETWEEN 9000001 AND 10000000 THEN 'e. 9am-10am'

WHEN E."CRT_TM" BETWEEN 10000001 AND 11000000 THEN 'f. 10am-11am'

WHEN E."CRT_TM" BETWEEN 11000001 AND 12000000 THEN 'g. 11am-12pm'

WHEN E."CRT_TM" BETWEEN 12000001 AND 13000000 THEN 'h. 12pm-1pm'

WHEN E."CRT_TM" BETWEEN 13000001 AND 14000000 THEN 'i. 1pm-2pm'

WHEN E."CRT_TM" BETWEEN 14000001 AND 15000000 THEN 'j. 2pm-3pm'

WHEN E."CRT_TM" BETWEEN 15000001 AND 16000000 THEN 'k. 3pm-4pm'

WHEN E."CRT_TM" >= 16000001 THEN 'l. After 4pm'

ELSE ''

END AS c4 ,

COUNT ( E."X_TSKID" ) AS c5 , D."ATR_03" AS c6 ,

D."ATR_04" AS c7 , D."SEC_TSK" AS c8 ,

COUNT ( E."ROW" ) AS c9

FROM "DAY_D" A , "LOV_D" B , "LOV_D" C ,

"LOV_DX" D , "PRNTACT" E

WHERE ( B."ROW" = D."ROW"

AND B."ROW" = E."ACT_TYPE"

AND A."ROW" = E."ACTUAL_START"

AND C."NAME" = D."ATR_01"

AND B."TYPE" = 'PI_TSK'

AND C."TYPE" = 'PI_DPT'

AND C."ROW" = E."X_DPT"

AND D."SEC_TSK" = 'Single Sum'

AND C."LONG_DESC" <> 'INC OC'

AND C."LONG_DESC" <> 'OUT OC'

AND A."DAY_DT" >= tImEsTaMp'2011-01-01 00:00:00'

AND E."CRT_TM" <= 16000000 )

GROUP BY c2 , c1 , c6 , c7, c8, c3 , c4

) X

QUALIFY RONUMBER ( ) OVER ( PARTITION BY X.c1 , X.c2 , X.c4 ,

X.c6 , X.c7 , X.c8

ORDER BY c1 , c2 , c4 , c6 ,c7 , c8 ) = 1 ) X

ORDER BY 6

1) First, we lock a distinct TF001."pseudo table" for read on a

RowHash to prevent global deadlock for TF001.E.

2) Next, we lock a distinct TF001."pseudo table" for read on a

RowHash to prevent global deadlock for TF001.D.

3) We lock a distinct TF001."pseudo table" for read on a RowHash to

prevent global deadlock for TF001.C.

4) We lock a distinct TF001."pseudo table" for read on a RowHash to

prevent global deadlock for TF001.A.

5) We lock TF001.E for read, we lock TF001.D for read, we

lock TF001.C for read, and we lock TF001.A for read.

6) We execute the following steps in parallel.

1) We do an all-AMPs RETRIEVE step from TF001.D by way of

an all-rows scan with a condition of (

"(TF001.D.SEC_TSK = 'Single Sum') AND ((NOT

(TF001.D.ATR_01 IS NULL )) AND (NOT

(TF001.D.ROW IS NULL )))") into Spool 5 (all_amps)

(compressed columns allowed), which is built locally on the

AMPs. The size of Spool 5 is estimated with high confidence

to be 1,295 rows. The estimated time for this step is 0.00

seconds.

2) We do an all-AMPs RETRIEVE step from TF001.C by way of

index # 8 ""TYPE" = 'PI_DPT'" with a residual condition of (

"(TF001.C.LONG_DESC <> 'OUT OCC') AND

((TF001.C.LONG_DESC <> 'INC OCC') AND

(TF001.C."TYPE" = 'PI_DPT'))") into Spool 6 (all_amps)

(compressed columns allowed), which is duplicated on all AMPs.

The size of Spool 6 is estimated with high confidence to be

31,500 rows. The estimated time for this step is 0.04

seconds.

3) We do an all-AMPs RETRIEVE step from TF001.B by way of

an all-rows scan with a condition of ("TF001.B."TYPE" =

'PI_TSK'") into Spool 7 (all_amps) (compressed columns

allowed), which is redistributed by hash code to all AMPs.

The size of Spool 7 is estimated with high confidence to be

2,436 rows. The estimated time for this step is 0.02 seconds.

7) We do an all-AMPs JOIN step from Spool 5 (Last Use) by way of an

all-rows scan, which is joined to Spool 6 (Last Use) by way of an

all-rows scan. Spool 5 and Spool 6 are joined using a single

partition hash join, with a join condition of ("NAME = ATR_01").

The result goes into Spool 8 (all_amps) (compressed columns

allowed), which is redistributed by hash code to all AMPs. The

size of Spool 8 is estimated with low confidence to be 1,283 rows.

The estimated time for this step is 0.01 seconds.

8) We do an all-AMPs JOIN step from Spool 7 (Last Use) by way of an

all-rows scan, which is joined to Spool 8 (Last Use) by way of an

all-rows scan. Spool 7 and Spool 8 are joined using a single

partition hash join, with a join condition of ("ROW = ROW").

The result goes into Spool 9 (all_amps) (compressed columns

allowed), which is built locally on the AMPs. The size of Spool 9

is estimated with low confidence to be 1,283 rows. The estimated

time for this step is 0.01 seconds.

9) We do an all-AMPs RETRIEVE step from TF001.E by way of an

all-rows scan with a condition of ("(TF001.E.CRT_TM

<= 16000000.) AND (NOT (TF001.E.X_DPT IS NULL ))") into

Spool 10 (all_amps) (compressed columns allowed), which is

redistributed by hash code to all AMPs. The input table will not

be cached in memory, but it is eligible for synchronized scanning.

The size of Spool 10 is estimated with high confidence to be

40,589,897 rows. The estimated time for this step is 17.10

seconds.

10) We do an all-AMPs RETRIEVE step from TF001.A by way of an

all-rows scan with a condition of ("TF001.A.DAY_DT >=

TIMESTAMP '2011-01-01 00:00:00'") into Spool 11 (all_amps)

(compressed columns allowed) fanned out into 2 hash join

partitions, which is duplicated on all AMPs. The size of Spool 11

is estimated with high confidence to be 2,219,400 rows. The

estimated time for this step is 0.27 seconds.

11) We do an all-AMPs JOIN step from Spool 9 (Last Use) by way of an

all-rows scan, which is joined to Spool 10 (Last Use) by way of an

all-rows scan. Spool 9 and Spool 10 are joined using a product

join, with a join condition of ("(ACT_TYPE = ROW) AND

((ROW = ACT_TYPE) AND (ROW = X_DPT ))"). The

result goes into Spool 12 (all_amps) (compressed columns allowed),

which is built locally on the AMPs into 2 hash join partitions.

The size of Spool 12 is estimated with low confidence to be

8,297,776 rows. The estimated time for this step is 3.75 seconds.

12) We do an all-AMPs JOIN step from Spool 11 (Last Use) by way of an

all-rows scan, which is joined to Spool 12 (Last Use) by way of an

all-rows scan. Spool 11 and Spool 12 are joined using a hash join

of 2 partitions, with a join condition of ("ROW =

ACTUAL_START"). The result goes into Spool 4 (all_amps)

(compressed columns allowed), which is built locally on the AMPs.

The size of Spool 4 is estimated with low confidence to be

8,297,776 rows. The estimated time for this step is 0.80 seconds.

13) We do an all-AMPs SUM step to aggregate from Spool 4 (Last Use) by

way of an all-rows scan, and the grouping identifier in field 1.

Aggregate Intermediate Results are computed globally, then placed

in Spool 13. The size of Spool 13 is estimated with low

confidence to be 6,223,332 rows. The estimated time for this step

is 22.39 seconds.

14) We do an all-AMPs RETRIEVE step from Spool 13 (Last Use) by way of

an all-rows scan into Spool 1 (all_amps) (compressed columns

allowed), which is built locally on the AMPs. The size of Spool 1

is estimated with low confidence to be 6,223,332 rows. The

estimated time for this step is 0.86 seconds.

15) We do an all-AMPs STAT FUNCTION step from Spool 1 (Last Use) by

way of an all-rows scan into Spool 18 (Last Use), which is assumed

to be redistributed by value to all AMPs. The result rows are put

into Spool 16 (all_amps) (compressed columns allowed), which is

built locally on the AMPs.

16) We do an all-AMPs STAT FUNCTION step from Spool 16 (Last Use) by

way of an all-rows scan into Spool 21 (Last Use), which is assumed

to be redistributed by value to all AMPs. The result rows are put

into Spool 20 (all_amps) (compressed columns allowed), which is

built locally on the AMPs.

17) We do an all-AMPs RETRIEVE step from Spool 20 (Last Use) by way of

an all-rows scan with a condition of ("Field_11 = 1") into Spool 2

(all_amps) (compressed columns allowed), which is built locally on

the AMPs. The size of Spool 2 is estimated with no confidence to

be 6,223,332 rows. The estimated time for this step is 0.75

seconds.

18) We do an all-AMPs RETRIEVE step from Spool 2 (Last Use) by way of

an all-rows scan into Spool 24 (group_amps), which is built

locally on the AMPs. Then we do a SORT to order Spool 24 by the

sort key in spool fielX. The size of Spool 24 is estimated with

no confidence to be 6,223,332 rows. The estimated time for this

step is 0.75 seconds.

19) Finally, we send out an END TRANSACTION step to all AMPs involved

in processing the request.

-> The contents of Spool 24 are sent back to the user as the result

of statement 1.

BEGIN RECOMMENDED STATS ->

20) "COLLECT STATISTICS TF001.PRNTACT COLUMN

(ACT_TYPE ,

ACTUAL_START ,X_DPT ,CRT_TM)".

(HighConf)

21) "COLLECT STATISTICS TF001.PRNTACT COLUMN

(ACTUAL_START ,

X_DPT)". (HighConf)

22) "COLLECT STATISTICS TF001.PRNTACT COLUMN

(ACTUAL_START ,

X_DPT ,CRT_TM)". (HighConf)

23) "COLLECT STATISTICS TF001.PRNTACT COLUMN

(X_DPT ,

ACT_TYPE ,ACTUAL_START)". (HighConf)

24) "COLLECT STATISTICS TF001.PRNTACT COLUMN

(ACT_TYPE ,

CRT_TM)". (LowConf)

25) "COLLECT STATISTICS TF001.PRNTACT COLUMN

(ACT_TYPE ,

X_DPT ,CRT_TM)". (HighConf)

26) "COLLECT STATISTICS TF001.PRNTACT COLUMN

(ACT_TYPE ,

ACTUAL_START ,CRT_TM)". (HighConf)

27) "COLLECT STATISTICS TF001.PRNTACT COLUMN

(ACTUAL_START ,

CRT_TM)". (HighConf)

28) "COLLECT STATISTICS TF001.LOV_DX COLUMN (SEC_TSK

,

ROW)". (HighConf)

29) "COLLECT STATISTICS TF001.LOV_DX COLUMN (ATR_01

,SEC_TSK ,

ATR_03 ,ATR_04)". (HighConf)

30) "COLLECT STATISTICS TF001.LOV_DX COLUMN (SEC_TSK

,

ATR_03 ,ATR_04)". (HighConf)

31) "COLLECT STATISTICS TF001.LOV_DX COLUMN (ATR_04)".

(LowConf)

32) "COLLECT STATISTICS TF001.LOV_DX COLUMN (ROW

,SEC_TSK ,

ATR_03 ,ATR_04)". (HighConf)

33) "COLLECT STATISTICS TF001.LOV_DX COLUMN (SEC_TSK

,

ATR_01)". (HighConf)

34) "COLLECT STATISTICS TF001.LOV_DX COLUMN (ROW ,ATR_04)".

(LowConf)

35) "COLLECT STATISTICS TF001.LOV_DX COLUMN (ROW ,ATR_01

,

SEC_TSK ,ATR_03 ,ATR_04)". (HighConf)

36) "COLLECT STATISTICS TF001.LOV_D COLUMN (NAME)". (HighConf)

<- END RECOMMENDED STATS

LOV_DX -no stats on nusi

INDEX LOV_DX_NUSI1 ( OWNR )

INDEX LOV_DX_NUSI2 ( TER_TSK );

show table LOV_D

UNIQUE PRIMARY INDEX LOV_D_UPI ( "TYPE" ,VAL )

INDEX LOV_D_NUSI2 ( ROW )

INDEX LOV_D_NUSI1 ( "TYPE" );

show table PRNTACT no stats on nusi

UNIQUE PRIMARY INDEX ( ROW )

INDEX PRNT_ACT_NUSI1 ( ACTL_END )

INDEX PRNT_ACT_NUSI2 ( ACT )

INDEX PRNT_ACT_NUSI3 (DPT )

INDEX PRNT_ACT_NUSI4 (DT_RCVD );

show table DAY_D

UNIQUE PRIMARY INDEX DAY_D_UPI ( ROW );

3 REPLIES

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

06-13-2011
03:11 PM

06-13-2011
03:11 PM

Without stats the optimizer will use a NUSI only when access is based on equality, "NUSIcol = value".

So stats on NUSI are usually recommended, read Carrie's articles on stats, especially

http://developer.teradata.com/database/articles/statistics-collection-recommendations-for-teradata-12

A recommended stat might be usefull if the column is used a lot in other queries, too.

Regarding optimization, i don't think you need the QUALIFY as it's on the same columns used in the previous GROUP BY.

Dieter

So stats on NUSI are usually recommended, read Carrie's articles on stats, especially

http://developer.teradata.com/database/articles/statistics-collection-recommendations-for-teradata-12

A recommended stat might be usefull if the column is used a lot in other queries, too.

Regarding optimization, i don't think you need the QUALIFY as it's on the same columns used in the previous GROUP BY.

Dieter

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

06-14-2011
07:05 AM

06-14-2011
07:05 AM

Thank you so much for responding.

Regarding the Qualify, the columns in Group by are different than the Partition by . So will it give the same result if I remove Qualify?

Also, as far as I understand the qualify =1 will remove the dupes based on columns in partition, please correct me if I am wrong.

I also tried using the equality conditions in where clause as below but this just increased the CPU skew to 98% in PMON.

Doesn't filtering like this improve the perf?

FROM ( sel * from "DAY_D" where "DAY_DT" >= tImEsTaMp'2011-01-01 00:00:00' ) A ,

( sel * from "LOV_D" where "TYPE" = 'PI_TSK' ) B ,

(sel * from "LOV_D" where "LONG_DESC" <> 'C OC' and "LONG_DESC" <> 'T OC' and "TYPE" = 'PI_DPT' ) C ,

(sel * from "LOV_DX" where "SEC_TSK" = 'Single' ) D ,

( sel * from "PRNTACT" where "CRT_TM" <= 16000000 ) E

WHERE ( ....

Also, generally I collect stats on all the cols in WHERE ocndition. What would you suggest regarding collecting stats?

What are the things in explain that I should look for ..like no confidence?

Sorry, too many qsns but I have an experience taht you generally follow up till the query gets resolved...and this time I realy expect it as it will help me clear all my doubts.

Regarding the Qualify, the columns in Group by are different than the Partition by . So will it give the same result if I remove Qualify?

Also, as far as I understand the qualify =1 will remove the dupes based on columns in partition, please correct me if I am wrong.

I also tried using the equality conditions in where clause as below but this just increased the CPU skew to 98% in PMON.

Doesn't filtering like this improve the perf?

FROM ( sel * from "DAY_D" where "DAY_DT" >= tImEsTaMp'2011-01-01 00:00:00' ) A ,

( sel * from "LOV_D" where "TYPE" = 'PI_TSK' ) B ,

(sel * from "LOV_D" where "LONG_DESC" <> 'C OC' and "LONG_DESC" <> 'T OC' and "TYPE" = 'PI_DPT' ) C ,

(sel * from "LOV_DX" where "SEC_TSK" = 'Single' ) D ,

( sel * from "PRNTACT" where "CRT_TM" <= 16000000 ) E

WHERE ( ....

Also, generally I collect stats on all the cols in WHERE ocndition. What would you suggest regarding collecting stats?

What are the things in explain that I should look for ..like no confidence?

Sorry, too many qsns but I have an experience taht you generally follow up till the query gets resolved...and this time I realy expect it as it will help me clear all my doubts.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

06-14-2011
09:52 AM

06-14-2011
09:52 AM

The only additional column in the GROUP BY is c3, which returns the day of week.

A."DAY_DT" sounds like a date, i don't think there are different weekdays for any given date.

When there are no dups then why try to remove them?

The PARTITION and the ORDER BY in your ROW_NUMBER are exactly the same, if there are multiple rows per partition this will return one of them randomly. You could order by a single column instead of six columns and get the same answer.

You should check if the explain changed after using those Derived Tables.

As there's nothing complex like distinct/aggregation/OLAP in those DTs i would not expect a different plan, but you never know.

You can't collect stats on all those columns found in any WHERE-condition, those stats have to be recollected at a later time. So only collect on those columns used in different queries and only if they actually change the plan.

When you already use DIAGNOSTIC HELPSTATS you don't have to search for missing stats anymore, but you should check if an existing stat results in high (at least low) confidence. If not it's useless.

Dieter

A."DAY_DT" sounds like a date, i don't think there are different weekdays for any given date.

When there are no dups then why try to remove them?

The PARTITION and the ORDER BY in your ROW_NUMBER are exactly the same, if there are multiple rows per partition this will return one of them randomly. You could order by a single column instead of six columns and get the same answer.

You should check if the explain changed after using those Derived Tables.

As there's nothing complex like distinct/aggregation/OLAP in those DTs i would not expect a different plan, but you never know.

You can't collect stats on all those columns found in any WHERE-condition, those stats have to be recollected at a later time. So only collect on those columns used in different queries and only if they actually change the plan.

When you already use DIAGNOSTIC HELPSTATS you don't have to search for missing stats anymore, but you should check if an existing stat results in high (at least low) confidence. If not it's useless.

Dieter

Copyright © 2004-2015 Teradata Corporation. Your use of this Teradata website is governed by the Privacy Policy and the Terms of Use, including your rights to materials on this website, the rights you grant to your submissions to this website, and your responsibilities regarding your conduct on this website.

The Privacy Policy and Terms of Use for this Teradata website changed effective September 8, 2016.