Database
Enthusiast

What is meant by STAT FUNCTION in Teradata

Hello Experts,

I am new to Teradata. Kindly let me know what is meant by a STAT FUNCTION step in Teradata, and what it does.

In the Explain plan of query,

"We do an All-AMPs STAT FUNCTION step from Spool 1475 (Last Use) by way of an all-rows scan into Spool 1484, which is redistributed by hash code to all AMPs. The result rows are put into Spool 1482, which is redistributed by hash code to all AMPs."

What exactly is it doing in this step? My query is stuck at this step.

Urgent help required.

Thanks,

9 REPLIES
Enthusiast

Re: What is meant by STAT FUNCTION in Teradata

Hi,

Try using SAMPLE in place of TOP:

SELECT * FROM table SAMPLE 10;

Thanks

Nagendra

Senior Apprentice

Re: What is meant by STAT FUNCTION in Teradata

What's the remaining text of this step?

A STAT FUNCTION step might be a TOP or a windowed aggregate function like RANK or SUM OVER.
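For context, a hedged illustration (table and column names below are made up, not from the posted query): ordered analytic functions like these typically show up in Explain as an all-AMPs STAT FUNCTION step, because the rows must first be redistributed by the PARTITION BY columns before the function can be evaluated:

```sql
-- Hypothetical table/columns for illustration only.
-- Each OVER (...) clause below would normally appear as an
-- all-AMPs STAT FUNCTION step in the Explain output.
SELECT
    order_id,
    SUM(amount)  OVER (PARTITION BY customer_id) AS cust_total,
    ROW_NUMBER() OVER (PARTITION BY customer_id
                       ORDER BY order_dt DESC)   AS rn
FROM sales_orders;
```

If the PARTITION BY columns are badly skewed (many rows hashing to the same value), the redistribution lands most rows on a few AMPs, which is a common reason a STAT FUNCTION step appears "stuck".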

Enthusiast

Re: What is meant by STAT FUNCTION in Teradata

The query is stuck at step 7. It does not complete, and I am unable to get the output.

Kindly suggest how to get past the STAT FUNCTION step.

The query is....

select distinct sum(D1.c1) over (partition by D1.c2)  as c1,

D1.c2 as c2,

D1.c3 as c3,

D1.c4 as c4,

D1.c5 as c5,

D1.c6 as c6,

D1.c7 as c7,

D1.c8 as c8,

D1.c9 as c9,

sum(D1.c10) over (partition by D1.c11, D1.c2)  as c10,

D1.c11 as c11

from

(select count(distinct case D1.c16 when 1 then D1.c17 else NULL end ) as c1,

D1.c2 as c2,

count(distinct D1.c17) as c3,

sum(D1.c19) as c4,

sum(D1.c20) as c5,

sum(D1.c23) as c6,

sum(D1.c24) as c7,

D1.c8 as c8,

D1.c9 as c9,

count(distinct case D1.c25 when 1 then D1.c17 else NULL end ) as c10,

D1.c11 as c11

from

(select case  when cast(T62659."DAY_DT" as  DATE  ) > cast(DATE '2014-06-30' as  DATE  ) then cast(T62659."DAY_DT" as  DATE  ) else cast(DATE '2014-06-30' as  DATE  ) end  as c2,

T62659."DAY_OF_MONTH" as c8,

substring(T62659."MONTH_NAME"  from 1, 3) as c9,

case  when T85462."X_SR_CAT_TYPE_CD" in ('Accident', 'Break-down') then 'Accidented' else 'Normal' end  as c11,

T61292."ORDER_TYPE" as c12,

T61292."ORDER_DT" as c13,

T61292."X_CLS_DT" as c14,

T12426."ORDER_WID" as c15,

ROW_NUMBER() OVER (PARTITION BY case  when cast(T62659."DAY_DT" as  DATE  ) > cast(DATE '2014-06-30' as  DATE  ) then cast(T62659."DAY_DT" as  DATE  ) else cast(DATE '2014-06-30' as  DATE  ) end , case  when T61292."ORDER_TYPE" = 'Service Order' and T61292."ORDER_DT" <= DATE '2014-06-30' and (T61292."X_CLS_DT" > DATE '2014-06-30' or T61292."X_CLS_DT" is null) then T12426."ORDER_WID" end  ORDER BY case  when cast(T62659."DAY_DT" as  DATE  ) > cast(DATE '2014-06-30' as  DATE  ) then cast(T62659."DAY_DT" as  DATE  ) else cast(DATE '2014-06-30' as  DATE  ) end  DESC, case  when T61292."ORDER_TYPE" = 'Service Order' and T61292."ORDER_DT" <= DATE '2014-06-30' and (T61292."X_CLS_DT" > DATE '2014-06-30' or T61292."X_CLS_DT" is null) then T12426."ORDER_WID" end  DESC) as c16,

case  when T61292."ORDER_TYPE" = 'Service Order' and T61292."ORDER_DT" <= DATE '2014-06-30' and (T61292."X_CLS_DT" > DATE '2014-06-30' or T61292."X_CLS_DT" is null) then T12426."ORDER_WID" end  as c17,

T61292."STATUS_CD" as c18,

case  when T61292."ORDER_TYPE" = 'Service Order' and T61292."STATUS_CD" = 'Closed' then 1 else 0 end  as c19,

case  when T61292."ORDER_TYPE" = 'Service Order' and not T61292."STATUS_CD" in ('Cancel', 'Cancelled') then 1 else 0 end  as c20,

T61292."X_ATTRIB_27" as c21,

T61292."X_ORD_DT_CREATED" as c22,

case  when ( CAST((( cast(( T61292."X_ORD_DT_CREATED" + CAST( 330 AS INTERVAL MINUTE(4) ) ) as  TIMESTAMP  )  -  T61292."X_ATTRIB_27" ) DAY(4)) AS INTEGER) ) = 0 and T61292."ORDER_TYPE" = 'Service Order' and T61292."STATUS_CD" = 'Closed' then 1 else 0 end  as c23,

case  when T61292."ORDER_TYPE" = 'Service Order' and T61292."STATUS_CD" = 'Parts Unavailable' then 1 else 0 end  as c24,

ROW_NUMBER() OVER (PARTITION BY case  when T85462."X_SR_CAT_TYPE_CD" in ('Accident', 'Break-down') then 'Accidented' else 'Normal' end , case  when cast(T62659."DAY_DT" as  DATE  ) > cast(DATE '2014-06-30' as  DATE  ) then cast(T62659."DAY_DT" as  DATE  ) else cast(DATE '2014-06-30' as  DATE  ) end , case  when T61292."ORDER_TYPE" = 'Service Order' and T61292."ORDER_DT" <= DATE '2014-06-30' and (T61292."X_CLS_DT" > DATE '2014-06-30' or T61292."X_CLS_DT" is null) then T12426."ORDER_WID" end  ORDER BY case  when T85462."X_SR_CAT_TYPE_CD" in ('Accident', 'Break-down') then 'Accidented' else 'Normal' end  DESC, case  when cast(T62659."DAY_DT" as  DATE  ) > cast(DATE '2014-06-30' as  DATE  ) then cast(T62659."DAY_DT" as  DATE  ) else cast(DATE '2014-06-30' as  DATE  ) end  DESC, case  when T61292."ORDER_TYPE" = 'Service Order' and T61292."ORDER_DT" <= DATE '2014-06-30' and (T61292."X_CLS_DT" > DATE '2014-06-30' or T61292."X_CLS_DT" is null) then T12426."ORDER_WID" end  DESC) as c25

from

"W_DAY_D" T62659 /* Generic Date (W_DAY_D) */ ,

"W_ORDER_F" T12426,

"W_ORDER_D" T61292 /* Job Card (W_ORDER_D) */ ,

"WC_SRVREQ_EXTERNAL_D" T85462

where  ( T12426."ORDER_DT_WID" = T62659."ROW_WID" and T12426."ORDER_WID" = T61292."ROW_WID" and T12426."SR_WID" = T85462."ROW_WID" )

) D1

group by D1.c2, D1.c8, D1.c9, D1.c11

) D1

=================================================

Explain plan...

1) First, we lock PROD_OLAP.W_ORDER_D_PDM for access, we lock

PROD_OLAP.WC_SRVREQ_EXTERNAL_D for access, we lock

PROD_OLAP.W_ORDER_F_PDM for access, and we lock PROD_OLAP.W_DAY_D

for access.

2) Next, we execute the following steps in parallel.

1) We do an all-AMPs RETRIEVE step from

PROD_OLAP.WC_SRVREQ_EXTERNAL_D by way of an all-rows scan

with no residual conditions into Spool 4 (all_amps)

(compressed columns allowed) fanned out into 7 hash join

partitions, which is built locally on the AMPs.  The input

table will not be cached in memory, but it is eligible for

synchronized scanning.  The size of Spool 4 is estimated with

high confidence to be 56,229,300 rows (1,855,566,900 bytes).

The estimated time for this step is 1 minute and 5 seconds.

2) We do an all-AMPs RETRIEVE step from PROD_OLAP.W_ORDER_F_PDM

by way of an all-rows scan with no residual conditions into

Spool 5 (all_amps) (compressed columns allowed) fanned out

into 7 hash join partitions, which is redistributed by the

hash code of (PROD_OLAP.W_ORDER_F_PDM.SR_WID) to all AMPs.

The input table will not be cached in memory, but it is

eligible for synchronized scanning.  The size of Spool 5 is

estimated with high confidence to be 84,522,879 rows (

3,127,346,523 bytes).  The estimated time for this step is 6

minutes and 9 seconds.

3) We do an all-AMPs RETRIEVE step from PROD_OLAP.W_DAY_D by way

of an all-rows scan with no residual conditions into Spool 6

(all_amps) (compressed columns allowed), which is duplicated

on all AMPs.  The size of Spool 6 is estimated with high

confidence to be 2,732,112 rows (120,212,928 bytes).  The

estimated time for this step is 0.17 seconds.

3) We do an all-AMPs JOIN step from Spool 4 (Last Use) by way of an

all-rows scan, which is joined to Spool 5 (Last Use) by way of an

all-rows scan.  Spool 4 and Spool 5 are joined using a hash join

of 7 partitions, with a join condition of ("SR_WID = ROW_WID").

The result goes into Spool 7 (all_amps) (compressed columns

allowed), which is built locally on the AMPs.  The size of Spool 7

is estimated with low confidence to be 84,522,879 rows (

3,465,438,039 bytes).  The estimated time for this step is 20.17

seconds.

4) We do an all-AMPs RETRIEVE step from PROD_OLAP.W_ORDER_D_PDM by

way of an all-rows scan with no residual conditions into Spool 8

(all_amps) (compressed columns allowed) fanned out into 24 hash

join partitions, which is built locally on the AMPs.  The input

table will not be cached in memory, but it is eligible for

synchronized scanning.  The size of Spool 8 is estimated with high

confidence to be 84,484,273 rows (7,181,163,205 bytes).  The

estimated time for this step is 1 minute and 37 seconds.

5) We do an all-AMPs JOIN step from Spool 6 (Last Use) by way of an

all-rows scan, which is joined to Spool 7 (Last Use) by way of an

all-rows scan.  Spool 6 and Spool 7 are joined using a single

partition hash join, with a join condition of ("ORDER_DT_WID =

ROW_WID").  The result goes into Spool 9 (all_amps) (compressed

columns allowed), which is redistributed by the hash code of (

PROD_OLAP.W_ORDER_F_PDM.ORDER_WID) to all AMPs into 24 hash join

partitions.  The size of Spool 9 is estimated with low confidence

to be 84,522,879 rows (4,733,281,224 bytes).  The estimated time

for this step is 12.37 seconds.

6) We do an all-AMPs JOIN step from Spool 8 (Last Use) by way of an

all-rows scan, which is joined to Spool 9 (Last Use) by way of an

all-rows scan.  Spool 8 and Spool 9 are joined using a hash join

of 24 partitions, with a join condition of ("ORDER_WID = ROW_WID").

The result goes into Spool 3 (all_amps) (compressed columns

allowed), which is built locally on the AMPs.  The size of Spool 3

is estimated with low confidence to be 84,522,879 rows (

10,311,791,238 bytes).  The estimated time for this step is 14.99

seconds.

7) We do an all-AMPs STAT FUNCTION step from Spool 3 (Last Use) by

way of an all-rows scan into Spool 12 (Last Use), which is

redistributed by hash code to all AMPs.  The result rows are put

into Spool 10 (all_amps) (compressed columns allowed), which is

built locally on the AMPs.  The size is estimated with low

confidence to be 84,522,879 rows (6,508,261,683 bytes).

8) We do an all-AMPs STAT FUNCTION step from Spool 10 (Last Use) by

way of an all-rows scan into Spool 15 (Last Use), which is

redistributed by hash code to all AMPs.  The result rows are put

into Spool 14 (all_amps) (compressed columns allowed), which is

built locally on the AMPs.  The size is estimated with low

confidence to be 84,522,879 rows (10,565,359,875 bytes).

9) We do an all-AMPs RETRIEVE step from Spool 14 (Last Use) by way of

an all-rows scan into Spool 1 (used to materialize view, derived

table or table function D1) (all_amps) (compressed columns

allowed), which is built locally on the AMPs.  The size of Spool 1

is estimated with low confidence to be 84,522,879 rows (

4,817,804,103 bytes).  The estimated time for this step is 5.48

seconds.

10) We do an all-AMPs SUM step to aggregate from Spool 1 by way of an

all-rows scan , grouping by field1 ( D1.C2 ,D1.C8 ,D1.C9 ,D1.C11

,( CASE WHEN (D1.C25 = 1) THEN (( CASE WHEN ((ORDER_TYPE =

'Service Order') AND (((CAST((ORDER_DT) AS DATE))<= DATE

'2014-06-30') AND (((CAST((PROD_OLAP.W_ORDER_D_PDM.X_CLS_DT) AS

DATE))> DATE '2014-06-30') OR (PROD_OLAP.W_ORDER_D_PDM.X_CLS_DT IS

NULL )))) THEN (PROD_OLAP.W_ORDER_F_PDM.ORDER_WID) ELSE (NULL) END

)) ELSE (NULL) END)).  Aggregate Intermediate Results are computed

globally, then placed in Spool 24.  The size of Spool 24 is

estimated with low confidence to be 84,484,203 rows (

7,688,062,473 bytes).  The estimated time for this step is 54.89

seconds.

11) We do an all-AMPs SUM step to aggregate from Spool 24 (Last Use)

by way of an all-rows scan, and the grouping identifier in field 1.

Aggregate Intermediate Results are computed globally, then placed

in Spool 26.  The size of Spool 26 is estimated with low

confidence to be 37,946 rows (3,604,870 bytes).  The estimated

time for this step is 4.70 seconds.

12) We execute the following steps in parallel.

1) We do an all-AMPs RETRIEVE step from Spool 26 (Last Use) by

way of an all-rows scan into Spool 21 (all_amps) (compressed

columns allowed), which is built locally on the AMPs.  The

size of Spool 21 is estimated with low confidence to be 37,946

rows (2,162,922 bytes).  The estimated time for this step is

0.03 seconds.

2) We do an all-AMPs SUM step to aggregate from Spool 1 by way of

an all-rows scan, and the grouping identifier in field 1.

Aggregate Intermediate Results are computed globally, then

placed in Spool 30.  The size of Spool 30 is estimated with

low confidence to be 84,484,203 rows (7,350,125,661 bytes).

The estimated time for this step is 46.98 seconds.

13) We do an all-AMPs SUM step to aggregate from Spool 30 (Last Use)

by way of an all-rows scan, and the grouping identifier in field 1.

Aggregate Intermediate Results are computed globally, then placed

in Spool 32.  The size of Spool 32 is estimated with low

confidence to be 37,946 rows (2,997,734 bytes).  The estimated

time for this step is 4.03 seconds.

14) We execute the following steps in parallel.

1) We do an all-AMPs RETRIEVE step from Spool 32 (Last Use) by

way of an all-rows scan into Spool 20 (all_amps) (compressed

columns allowed), which is built locally on the AMPs.  The

size of Spool 20 is estimated with low confidence to be 37,946

rows (1,555,786 bytes).  The estimated time for this step is

0.03 seconds.

2) We do an all-AMPs SUM step to aggregate from Spool 1 (Last

Use) by way of an all-rows scan, and the grouping identifier

in field 1.  Aggregate Intermediate Results are computed

globally, then placed in Spool 36.  The size of Spool 36 is

estimated with low confidence to be 84,484,203 rows (

7,350,125,661 bytes).  The estimated time for this step is

46.98 seconds.

15) We do an all-AMPs SUM step to aggregate from Spool 36 (Last Use)

by way of an all-rows scan, and the grouping identifier in field 1.

Aggregate Intermediate Results are computed globally, then placed

in Spool 38.  The size of Spool 38 is estimated with low

confidence to be 37,946 rows (2,997,734 bytes).  The estimated

time for this step is 4.03 seconds.

16) We do an all-AMPs RETRIEVE step from Spool 38 (Last Use) by way of

an all-rows scan into Spool 19 (all_amps) (compressed columns

allowed), which is built locally on the AMPs.  The size of Spool

19 is estimated with low confidence to be 37,946 rows (1,555,786

bytes).  The estimated time for this step is 0.03 seconds.

17) We execute the following steps in parallel.

1) We do an all-AMPs RETRIEVE step from Spool 19 (Last Use) by

way of an all-rows scan into Spool 40 (all_amps) (compressed

columns allowed), which is redistributed by hash code to all

AMPs.  Then we do a SORT to order Spool 40 by row hash.  The

size of Spool 40 is estimated with low confidence to be 37,946

rows (1,252,218 bytes).  The estimated time for this step is

0.03 seconds.

2) We do an all-AMPs RETRIEVE step from Spool 20 (Last Use) by

way of an all-rows scan into Spool 41 (all_amps) (compressed

columns allowed), which is redistributed by hash code to all

AMPs.  Then we do a SORT to order Spool 41 by row hash.  The

size of Spool 41 is estimated with low confidence to be 37,946

rows (1,252,218 bytes).  The estimated time for this step is

0.03 seconds.

18) We execute the following steps in parallel.

1) We do an all-AMPs JOIN step from Spool 40 (Last Use) by way of

a RowHash match scan, which is joined to Spool 41 (Last Use)

by way of a RowHash match scan.  Spool 40 and Spool 41 are

joined using a merge join, with a join condition of ("(C11 =

C11) AND ((C9 = C9) AND ((C8 = C8) AND (C2 = C2 )))").  The

result goes into Spool 42 (all_amps) (compressed columns

allowed), which is built locally on the AMPs.  The size of

Spool 42 is estimated with low confidence to be 37,255 rows (

1,378,435 bytes).  The estimated time for this step is 0.04

seconds.

2) We do an all-AMPs RETRIEVE step from Spool 21 (Last Use) by

way of an all-rows scan into Spool 43 (all_amps) (compressed

columns allowed), which is redistributed by hash code to all

AMPs.  Then we do a SORT to order Spool 43 by row hash.  The

size of Spool 43 is estimated with low confidence to be 37,946

rows (1,859,354 bytes).  The estimated time for this step is

0.03 seconds.

19) We do an all-AMPs JOIN step from Spool 42 (Last Use) by way of a

RowHash match scan, which is joined to Spool 43 (Last Use) by way

of a RowHash match scan.  Spool 42 and Spool 43 are joined using a

merge join, with a join condition of ("(C11 = C11) AND ((C9 = C9)

AND ((C8 = C8) AND (C2 = C2 )))").  The result goes into Spool 2

(used to materialize view, derived table or table function D1)

(all_amps) (compressed columns allowed), which is built locally on

the AMPs.  The size of Spool 2 is estimated with low confidence to

be 37,255 rows (2,421,575 bytes).  The estimated time for this

step is 0.04 seconds.

20) We do an all-AMPs STAT FUNCTION step from Spool 2 (Last Use) by

way of an all-rows scan into Spool 48 (Last Use), which is

redistributed by hash code to all AMPs.  The result rows are put

into Spool 46 (all_amps) (compressed columns allowed), which is

built locally on the AMPs.  The size is estimated with low

confidence to be 37,255 rows (2,421,575 bytes).

21) We do an all-AMPs STAT FUNCTION step from Spool 46 (Last Use) by

way of an all-rows scan into Spool 51 (Last Use), which is

redistributed by hash code to all AMPs.  The result rows are put

into Spool 44 (group_amps), which is redistributed by hash code to

all AMPs.  Then we do a SORT to order Spool 44 by the sort key in

spool field1 eliminating duplicate rows.  The size is estimated

with low confidence to be 37,255 rows (5,662,760 bytes).

22) Finally, we send out an END TRANSACTION step to all AMPs involved

in processing the request.

-> The contents of Spool 44 are sent back to the user as the result

of statement 1.

Enthusiast

Re: What is meant by STAT FUNCTION in Teradata

Has this query run OK before?

I can't duplicate this query because of its size, but one thing to try is to remove the DISTINCT on the aggregates and replace it with a GROUP BY on the non-aggregate columns. Then run EXPLAIN again to see if the plan is better.
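A minimal sketch of that rewrite (placeholder names, not the columns from the original query): a DISTINCT over a windowed SUM can usually be expressed as a plain GROUP BY aggregate, which the optimizer handles with a SUM step instead of a STAT FUNCTION step:

```sql
-- Original pattern: windowed SUM, then DISTINCT to collapse duplicates.
-- This produces a STAT FUNCTION step plus duplicate elimination:
--   SELECT DISTINCT SUM(x) OVER (PARTITION BY grp) AS total, grp
--   FROM t;

-- Equivalent GROUP BY form, assuming only grp-level totals are needed:
SELECT grp,
       SUM(x) AS total
FROM t
GROUP BY grp;
```

This is only equivalent when every non-aggregate column in the select list is also a PARTITION BY / GROUP BY column, so check the column lists carefully before rewriting.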

Rglass

Senior Apprentice

Re: What is meant by STAT FUNCTION in Teradata

Step 7 is one of the ROW_NUMBERs; both are a bit strange, as they PARTITION BY and ORDER BY the same columns. You might simply change the ORDER BY to ORDER BY 1 instead.

And you might not need them at all, because they're only used in the outer COUNT(DISTINCT ...), where IMHO they add nothing. Try COUNT(DISTINCT D1.c17); this should return the same result.
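To sketch that suggestion using the aliases from the posted query (treat this as a hypothesis to verify against your data, not a guaranteed rewrite):

```sql
-- Before: flag the first row per (date, order) partition with
-- ROW_NUMBER() as c16, then count only the flagged rows:
--   count(distinct case D1.c16 when 1 then D1.c17 else NULL end) as c1

-- After: COUNT(DISTINCT ...) already removes duplicates, so the
-- ROW_NUMBER flag is likely redundant:
count(distinct D1.c17) as c1
```

If both ROW_NUMBER columns (c16 and c25) can be dropped this way, the two STAT FUNCTION steps in the inner derived table disappear from the plan entirely.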

Enthusiast

Re: What is meant by STAT FUNCTION in Teradata

Hello Dnoeth,

Do you think I am missing any stats? I checked, and all the stats are up to date.

Do I need to change the query, or can it be tuned without changing it?

Previously this query was running fine. Please help me tune this query.

Regards,

Senior Apprentice

Re: What is meant by STAT FUNCTION in Teradata

What do you mean by "Previously this query was running fine"?

Exactly the same query, similar data?

Can you check if DBQL step/explain information is available for the previous run?

Is the estimated number of rows close to the actual number?
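If DBQL step logging is enabled on your system, a query along these lines can compare estimated and actual row counts per step for the earlier run (assuming access to the DBC views; view and column availability can vary by release and site setup):

```sql
-- Requires BEGIN QUERY LOGGING ... WITH STEPINFO to have been active.
-- Replace the QueryID with the one from the previous run of the query.
SELECT  s.QueryID,
        s.StepLev1Num,
        s.StepName,
        s.EstRowCount,   -- optimizer's estimate for the step
        s.RowCount       -- actual rows produced by the step
FROM    DBC.QryLogStepsV s
WHERE   s.QueryID = 123456789
ORDER BY s.StepLev1Num;
```

A large gap between EstRowCount and RowCount on the steps feeding the STAT FUNCTION usually points at the stats or skew problem to chase.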

Enthusiast

Re: What is meant by STAT FUNCTION in Teradata

As per our developer team, this query is not new to the system; they have been using it for a long time.

The TD release is 13.10 and the total system size is 36 TB, of which 11 TB is in use.

Senior Apprentice

Re: What is meant by STAT FUNCTION in Teradata

Ask the developer team whether they have DBQL information or an old Explain for this query; then you can spot the differences and start investigating.