What is meant by STAT FUNCTION in Teradata

Enthusiast

What is meant by STAT FUNCTION in Teradata

Hello Experts,

I am new to Teradata. Kindly let me know what the STAT FUNCTION step in Teradata means, or what it does.

In the Explain plan of my query, I see:

"We do an All-AMPs STAT FUNCTION step from Spool 1475 (Last Use) by way of an all-rows scan into Spool 1484, which is redistributed by hash code to all AMPs. The result rows are put into Spool 1482, which is redistributed by hash code to all AMPs."

What exactly is it doing in this step? Due to this, my query is stuck at this step.

Urgent help required.

Thanks,

Teradata_User1

9 REPLIES
Enthusiast

Re: What is meant by STAT FUNCTION in Teradata

Hi,

Please check the link below:

http://teradatablog.wordpress.com/2011/09/15/stat-function-in-explain-plan/

Try using SAMPLE in place of TOP:

SELECT * FROM table SAMPLE 10;

Thanks

Nagendra

Junior Contributor

Re: What is meant by STAT FUNCTION in Teradata

What's the remaining text of this step?

A STAT FUNCTION step might be TOP or a windowed aggregate function like RANK or SUM OVER.
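
For example, a windowed aggregate along these lines (illustrative table and column names, not taken from the posted query) will typically show up as a STAT FUNCTION step in the EXPLAIN output:

SELECT order_id,
       customer_id,
       SUM(amount) OVER (PARTITION BY customer_id) AS customer_total
FROM sales_orders;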

Enthusiast

Re: What is meant by STAT FUNCTION in Teradata

The query is stuck at step 7. It does not finish executing, and I am unable to get the output.

Kindly suggest how to get past the STAT FUNCTION step.

The query is....

select distinct sum(D1.c1) over (partition by D1.c2)  as c1,

     D1.c2 as c2,

     D1.c3 as c3,

     D1.c4 as c4,

     D1.c5 as c5,

     D1.c6 as c6,

     D1.c7 as c7,

     D1.c8 as c8,

     D1.c9 as c9,

     sum(D1.c10) over (partition by D1.c11, D1.c2)  as c10,

     D1.c11 as c11

from 

     (select count(distinct case D1.c16 when 1 then D1.c17 else NULL end ) as c1,

               D1.c2 as c2,

               count(distinct D1.c17) as c3,

               sum(D1.c19) as c4,

               sum(D1.c20) as c5,

               sum(D1.c23) as c6,

               sum(D1.c24) as c7,

               D1.c8 as c8,

               D1.c9 as c9,

               count(distinct case D1.c25 when 1 then D1.c17 else NULL end ) as c10,

               D1.c11 as c11

          from 

               (select case  when cast(T62659."DAY_DT" as  DATE  ) > cast(DATE '2014-06-30' as  DATE  ) then cast(T62659."DAY_DT" as  DATE  ) else cast(DATE '2014-06-30' as  DATE  ) end  as c2,

                         T62659."DAY_OF_MONTH" as c8,

                         substring(T62659."MONTH_NAME"  from 1, 3) as c9,

                         case  when T85462."X_SR_CAT_TYPE_CD" in ('Accident', 'Break-down') then 'Accidented' else 'Normal' end  as c11,

                         T61292."ORDER_TYPE" as c12,

                         T61292."ORDER_DT" as c13,

                         T61292."X_CLS_DT" as c14,

                         T12426."ORDER_WID" as c15,

                         ROW_NUMBER() OVER (PARTITION BY case  when cast(T62659."DAY_DT" as  DATE  ) > cast(DATE '2014-06-30' as  DATE  ) then cast(T62659."DAY_DT" as  DATE  ) else cast(DATE '2014-06-30' as  DATE  ) end , case  when T61292."ORDER_TYPE" = 'Service Order' and T61292."ORDER_DT" <= DATE '2014-06-30' and (T61292."X_CLS_DT" > DATE '2014-06-30' or T61292."X_CLS_DT" is null) then T12426."ORDER_WID" end  ORDER BY case  when cast(T62659."DAY_DT" as  DATE  ) > cast(DATE '2014-06-30' as  DATE  ) then cast(T62659."DAY_DT" as  DATE  ) else cast(DATE '2014-06-30' as  DATE  ) end  DESC, case  when T61292."ORDER_TYPE" = 'Service Order' and T61292."ORDER_DT" <= DATE '2014-06-30' and (T61292."X_CLS_DT" > DATE '2014-06-30' or T61292."X_CLS_DT" is null) then T12426."ORDER_WID" end  DESC) as c16,

                         case  when T61292."ORDER_TYPE" = 'Service Order' and T61292."ORDER_DT" <= DATE '2014-06-30' and (T61292."X_CLS_DT" > DATE '2014-06-30' or T61292."X_CLS_DT" is null) then T12426."ORDER_WID" end  as c17,

                         T61292."STATUS_CD" as c18,

                         case  when T61292."ORDER_TYPE" = 'Service Order' and T61292."STATUS_CD" = 'Closed' then 1 else 0 end  as c19,

                         case  when T61292."ORDER_TYPE" = 'Service Order' and not T61292."STATUS_CD" in ('Cancel', 'Cancelled') then 1 else 0 end  as c20,

                         T61292."X_ATTRIB_27" as c21,

                         T61292."X_ORD_DT_CREATED" as c22,

                         case  when ( CAST((( cast(( T61292."X_ORD_DT_CREATED" + CAST( 330 AS INTERVAL MINUTE(4) ) ) as  TIMESTAMP  )  -  T61292."X_ATTRIB_27" ) DAY(4)) AS INTEGER) ) = 0 and T61292."ORDER_TYPE" = 'Service Order' and T61292."STATUS_CD" = 'Closed' then 1 else 0 end  as c23,

                         case  when T61292."ORDER_TYPE" = 'Service Order' and T61292."STATUS_CD" = 'Parts Unavailable' then 1 else 0 end  as c24,

                         ROW_NUMBER() OVER (PARTITION BY case  when T85462."X_SR_CAT_TYPE_CD" in ('Accident', 'Break-down') then 'Accidented' else 'Normal' end , case  when cast(T62659."DAY_DT" as  DATE  ) > cast(DATE '2014-06-30' as  DATE  ) then cast(T62659."DAY_DT" as  DATE  ) else cast(DATE '2014-06-30' as  DATE  ) end , case  when T61292."ORDER_TYPE" = 'Service Order' and T61292."ORDER_DT" <= DATE '2014-06-30' and (T61292."X_CLS_DT" > DATE '2014-06-30' or T61292."X_CLS_DT" is null) then T12426."ORDER_WID" end  ORDER BY case  when T85462."X_SR_CAT_TYPE_CD" in ('Accident', 'Break-down') then 'Accidented' else 'Normal' end  DESC, case  when cast(T62659."DAY_DT" as  DATE  ) > cast(DATE '2014-06-30' as  DATE  ) then cast(T62659."DAY_DT" as  DATE  ) else cast(DATE '2014-06-30' as  DATE  ) end  DESC, case  when T61292."ORDER_TYPE" = 'Service Order' and T61292."ORDER_DT" <= DATE '2014-06-30' and (T61292."X_CLS_DT" > DATE '2014-06-30' or T61292."X_CLS_DT" is null) then T12426."ORDER_WID" end  DESC) as c25

                    from 

                         "W_DAY_D" T62659 /* Generic Date (W_DAY_D) */ ,

                         "W_ORDER_F" T12426,

                         "W_ORDER_D" T61292 /* Job Card (W_ORDER_D) */ ,

                         "WC_SRVREQ_EXTERNAL_D" T85462

                    where  ( T12426."ORDER_DT_WID" = T62659."ROW_WID" and T12426."ORDER_WID" = T61292."ROW_WID" and T12426."SR_WID" = T85462."ROW_WID" ) 

               ) D1

          group by D1.c2, D1.c8, D1.c9, D1.c11

     ) D1

=================================================

Explain plan...

1) First, we lock PROD_OLAP.W_ORDER_D_PDM for access, we lock

     PROD_OLAP.WC_SRVREQ_EXTERNAL_D for access, we lock

     PROD_OLAP.W_ORDER_F_PDM for access, and we lock PROD_OLAP.W_DAY_D

     for access. 

  2) Next, we execute the following steps in parallel. 

       1) We do an all-AMPs RETRIEVE step from

          PROD_OLAP.WC_SRVREQ_EXTERNAL_D by way of an all-rows scan

          with no residual conditions into Spool 4 (all_amps)

          (compressed columns allowed) fanned out into 7 hash join

          partitions, which is built locally on the AMPs.  The input

          table will not be cached in memory, but it is eligible for

          synchronized scanning.  The size of Spool 4 is estimated with

          high confidence to be 56,229,300 rows (1,855,566,900 bytes). 

          The estimated time for this step is 1 minute and 5 seconds. 

       2) We do an all-AMPs RETRIEVE step from PROD_OLAP.W_ORDER_F_PDM

          by way of an all-rows scan with no residual conditions into

          Spool 5 (all_amps) (compressed columns allowed) fanned out

          into 7 hash join partitions, which is redistributed by the

          hash code of (PROD_OLAP.W_ORDER_F_PDM.SR_WID) to all AMPs. 

          The input table will not be cached in memory, but it is

          eligible for synchronized scanning.  The size of Spool 5 is

          estimated with high confidence to be 84,522,879 rows (

          3,127,346,523 bytes).  The estimated time for this step is 6

          minutes and 9 seconds. 

       3) We do an all-AMPs RETRIEVE step from PROD_OLAP.W_DAY_D by way

          of an all-rows scan with no residual conditions into Spool 6

          (all_amps) (compressed columns allowed), which is duplicated

          on all AMPs.  The size of Spool 6 is estimated with high

          confidence to be 2,732,112 rows (120,212,928 bytes).  The

          estimated time for this step is 0.17 seconds. 

  3) We do an all-AMPs JOIN step from Spool 4 (Last Use) by way of an

     all-rows scan, which is joined to Spool 5 (Last Use) by way of an

     all-rows scan.  Spool 4 and Spool 5 are joined using a hash join

     of 7 partitions, with a join condition of ("SR_WID = ROW_WID"). 

     The result goes into Spool 7 (all_amps) (compressed columns

     allowed), which is built locally on the AMPs.  The size of Spool 7

     is estimated with low confidence to be 84,522,879 rows (

     3,465,438,039 bytes).  The estimated time for this step is 20.17

     seconds. 

  4) We do an all-AMPs RETRIEVE step from PROD_OLAP.W_ORDER_D_PDM by

     way of an all-rows scan with no residual conditions into Spool 8

     (all_amps) (compressed columns allowed) fanned out into 24 hash

     join partitions, which is built locally on the AMPs.  The input

     table will not be cached in memory, but it is eligible for

     synchronized scanning.  The size of Spool 8 is estimated with high

     confidence to be 84,484,273 rows (7,181,163,205 bytes).  The

     estimated time for this step is 1 minute and 37 seconds. 

  5) We do an all-AMPs JOIN step from Spool 6 (Last Use) by way of an

     all-rows scan, which is joined to Spool 7 (Last Use) by way of an

     all-rows scan.  Spool 6 and Spool 7 are joined using a single

     partition hash join, with a join condition of ("ORDER_DT_WID =

     ROW_WID").  The result goes into Spool 9 (all_amps) (compressed

     columns allowed), which is redistributed by the hash code of (

     PROD_OLAP.W_ORDER_F_PDM.ORDER_WID) to all AMPs into 24 hash join

     partitions.  The size of Spool 9 is estimated with low confidence

     to be 84,522,879 rows (4,733,281,224 bytes).  The estimated time

     for this step is 12.37 seconds. 

  6) We do an all-AMPs JOIN step from Spool 8 (Last Use) by way of an

     all-rows scan, which is joined to Spool 9 (Last Use) by way of an

     all-rows scan.  Spool 8 and Spool 9 are joined using a hash join

     of 24 partitions, with a join condition of ("ORDER_WID = ROW_WID"). 

     The result goes into Spool 3 (all_amps) (compressed columns

     allowed), which is built locally on the AMPs.  The size of Spool 3

     is estimated with low confidence to be 84,522,879 rows (

     10,311,791,238 bytes).  The estimated time for this step is 14.99

     seconds. 

  7) We do an all-AMPs STAT FUNCTION step from Spool 3 (Last Use) by

     way of an all-rows scan into Spool 12 (Last Use), which is

     redistributed by hash code to all AMPs.  The result rows are put

     into Spool 10 (all_amps) (compressed columns allowed), which is

     built locally on the AMPs.  The size is estimated with low

     confidence to be 84,522,879 rows (6,508,261,683 bytes). 

  8) We do an all-AMPs STAT FUNCTION step from Spool 10 (Last Use) by

     way of an all-rows scan into Spool 15 (Last Use), which is

     redistributed by hash code to all AMPs.  The result rows are put

     into Spool 14 (all_amps) (compressed columns allowed), which is

     built locally on the AMPs.  The size is estimated with low

     confidence to be 84,522,879 rows (10,565,359,875 bytes). 

  9) We do an all-AMPs RETRIEVE step from Spool 14 (Last Use) by way of

     an all-rows scan into Spool 1 (used to materialize view, derived

     table or table function D1) (all_amps) (compressed columns

     allowed), which is built locally on the AMPs.  The size of Spool 1

     is estimated with low confidence to be 84,522,879 rows (

     4,817,804,103 bytes).  The estimated time for this step is 5.48

     seconds. 

 10) We do an all-AMPs SUM step to aggregate from Spool 1 by way of an

     all-rows scan , grouping by field1 ( D1.C2 ,D1.C8 ,D1.C9 ,D1.C11

     ,( CASE WHEN (D1.C25 = 1) THEN (( CASE WHEN ((ORDER_TYPE =

     'Service Order') AND (((CAST((ORDER_DT) AS DATE))<= DATE

     '2014-06-30') AND (((CAST((PROD_OLAP.W_ORDER_D_PDM.X_CLS_DT) AS

     DATE))> DATE '2014-06-30') OR (PROD_OLAP.W_ORDER_D_PDM.X_CLS_DT IS

     NULL )))) THEN (PROD_OLAP.W_ORDER_F_PDM.ORDER_WID) ELSE (NULL) END

     )) ELSE (NULL) END)).  Aggregate Intermediate Results are computed

     globally, then placed in Spool 24.  The size of Spool 24 is

     estimated with low confidence to be 84,484,203 rows (

     7,688,062,473 bytes).  The estimated time for this step is 54.89

     seconds. 

 11) We do an all-AMPs SUM step to aggregate from Spool 24 (Last Use)

     by way of an all-rows scan, and the grouping identifier in field 1. 

     Aggregate Intermediate Results are computed globally, then placed

     in Spool 26.  The size of Spool 26 is estimated with low

     confidence to be 37,946 rows (3,604,870 bytes).  The estimated

     time for this step is 4.70 seconds. 

 12) We execute the following steps in parallel. 

      1) We do an all-AMPs RETRIEVE step from Spool 26 (Last Use) by

         way of an all-rows scan into Spool 21 (all_amps) (compressed

         columns allowed), which is built locally on the AMPs.  The

         size of Spool 21 is estimated with low confidence to be 37,946

         rows (2,162,922 bytes).  The estimated time for this step is

         0.03 seconds. 

      2) We do an all-AMPs SUM step to aggregate from Spool 1 by way of

         an all-rows scan, and the grouping identifier in field 1. 

         Aggregate Intermediate Results are computed globally, then

         placed in Spool 30.  The size of Spool 30 is estimated with

         low confidence to be 84,484,203 rows (7,350,125,661 bytes). 

         The estimated time for this step is 46.98 seconds. 

 13) We do an all-AMPs SUM step to aggregate from Spool 30 (Last Use)

     by way of an all-rows scan, and the grouping identifier in field 1. 

     Aggregate Intermediate Results are computed globally, then placed

     in Spool 32.  The size of Spool 32 is estimated with low

     confidence to be 37,946 rows (2,997,734 bytes).  The estimated

     time for this step is 4.03 seconds. 

 14) We execute the following steps in parallel. 

      1) We do an all-AMPs RETRIEVE step from Spool 32 (Last Use) by

         way of an all-rows scan into Spool 20 (all_amps) (compressed

         columns allowed), which is built locally on the AMPs.  The

         size of Spool 20 is estimated with low confidence to be 37,946

         rows (1,555,786 bytes).  The estimated time for this step is

         0.03 seconds. 

      2) We do an all-AMPs SUM step to aggregate from Spool 1 (Last

         Use) by way of an all-rows scan, and the grouping identifier

         in field 1.  Aggregate Intermediate Results are computed

         globally, then placed in Spool 36.  The size of Spool 36 is

         estimated with low confidence to be 84,484,203 rows (

         7,350,125,661 bytes).  The estimated time for this step is

         46.98 seconds. 

 15) We do an all-AMPs SUM step to aggregate from Spool 36 (Last Use)

     by way of an all-rows scan, and the grouping identifier in field 1. 

     Aggregate Intermediate Results are computed globally, then placed

     in Spool 38.  The size of Spool 38 is estimated with low

     confidence to be 37,946 rows (2,997,734 bytes).  The estimated

     time for this step is 4.03 seconds. 

 16) We do an all-AMPs RETRIEVE step from Spool 38 (Last Use) by way of

     an all-rows scan into Spool 19 (all_amps) (compressed columns

     allowed), which is built locally on the AMPs.  The size of Spool

     19 is estimated with low confidence to be 37,946 rows (1,555,786

     bytes).  The estimated time for this step is 0.03 seconds. 

 17) We execute the following steps in parallel. 

      1) We do an all-AMPs RETRIEVE step from Spool 19 (Last Use) by

         way of an all-rows scan into Spool 40 (all_amps) (compressed

         columns allowed), which is redistributed by hash code to all

         AMPs.  Then we do a SORT to order Spool 40 by row hash.  The

         size of Spool 40 is estimated with low confidence to be 37,946

         rows (1,252,218 bytes).  The estimated time for this step is

         0.03 seconds. 

      2) We do an all-AMPs RETRIEVE step from Spool 20 (Last Use) by

         way of an all-rows scan into Spool 41 (all_amps) (compressed

         columns allowed), which is redistributed by hash code to all

         AMPs.  Then we do a SORT to order Spool 41 by row hash.  The

         size of Spool 41 is estimated with low confidence to be 37,946

         rows (1,252,218 bytes).  The estimated time for this step is

         0.03 seconds. 

 18) We execute the following steps in parallel. 

      1) We do an all-AMPs JOIN step from Spool 40 (Last Use) by way of

         a RowHash match scan, which is joined to Spool 41 (Last Use)

         by way of a RowHash match scan.  Spool 40 and Spool 41 are

         joined using a merge join, with a join condition of ("(C11 =

         C11) AND ((C9 = C9) AND ((C8 = C8) AND (C2 = C2 )))").  The

         result goes into Spool 42 (all_amps) (compressed columns

         allowed), which is built locally on the AMPs.  The size of

         Spool 42 is estimated with low confidence to be 37,255 rows (

         1,378,435 bytes).  The estimated time for this step is 0.04

         seconds. 

      2) We do an all-AMPs RETRIEVE step from Spool 21 (Last Use) by

         way of an all-rows scan into Spool 43 (all_amps) (compressed

         columns allowed), which is redistributed by hash code to all

         AMPs.  Then we do a SORT to order Spool 43 by row hash.  The

         size of Spool 43 is estimated with low confidence to be 37,946

         rows (1,859,354 bytes).  The estimated time for this step is

         0.03 seconds. 

 19) We do an all-AMPs JOIN step from Spool 42 (Last Use) by way of a

     RowHash match scan, which is joined to Spool 43 (Last Use) by way

     of a RowHash match scan.  Spool 42 and Spool 43 are joined using a

     merge join, with a join condition of ("(C11 = C11) AND ((C9 = C9)

     AND ((C8 = C8) AND (C2 = C2 )))").  The result goes into Spool 2

     (used to materialize view, derived table or table function D1)

     (all_amps) (compressed columns allowed), which is built locally on

     the AMPs.  The size of Spool 2 is estimated with low confidence to

     be 37,255 rows (2,421,575 bytes).  The estimated time for this

     step is 0.04 seconds. 

 20) We do an all-AMPs STAT FUNCTION step from Spool 2 (Last Use) by

     way of an all-rows scan into Spool 48 (Last Use), which is

     redistributed by hash code to all AMPs.  The result rows are put

     into Spool 46 (all_amps) (compressed columns allowed), which is

     built locally on the AMPs.  The size is estimated with low

     confidence to be 37,255 rows (2,421,575 bytes). 

 21) We do an all-AMPs STAT FUNCTION step from Spool 46 (Last Use) by

     way of an all-rows scan into Spool 51 (Last Use), which is

     redistributed by hash code to all AMPs.  The result rows are put

     into Spool 44 (group_amps), which is redistributed by hash code to

     all AMPs.  Then we do a SORT to order Spool 44 by the sort key in

     spool field1 eliminating duplicate rows.  The size is estimated

     with low confidence to be 37,255 rows (5,662,760 bytes). 

 22) Finally, we send out an END TRANSACTION step to all AMPs involved

     in processing the request.

  -> The contents of Spool 44 are sent back to the user as the result

     of statement 1. 

Enthusiast

Re: What is meant by STAT FUNCTION in Teradata

Has this query run ok before?

I can't duplicate this query because of its size, but one thing to try is to remove the DISTINCT on the aggregates and replace it with a GROUP BY on the non-aggregate columns. Run EXPLAIN again to see if the plan is better; the general pattern is sketched below.
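
For illustration only, the general pattern of that rewrite looks like this, using a hypothetical table t with columns grp and x (it assumes only the group column and its total are needed, which is simpler than the posted query):

-- Windowed aggregate plus DISTINCT:
SELECT DISTINCT grp,
       SUM(x) OVER (PARTITION BY grp) AS grp_total
FROM t;

-- Same result via GROUP BY, which the optimizer should plan as an aggregation (SUM) step
-- rather than a STAT FUNCTION step:
SELECT grp,
       SUM(x) AS grp_total
FROM t
GROUP BY grp;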

Rglass

Junior Contributor

Re: What is meant by STAT FUNCTION in Teradata

Step 7 is one of the ROW_NUMBERs. Both are a bit strange, as they PARTITION BY and ORDER BY the same columns. You might simply change the window ORDER BY to ORDER BY 1 instead, as sketched below.
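
In other words, where the posted query has the pattern below (with <expr_a> and <expr_b> standing in for the long CASE expressions), the window ORDER BY could be replaced by a constant:

-- Current pattern: PARTITION BY and ORDER BY use the same expressions
ROW_NUMBER() OVER (PARTITION BY <expr_a>, <expr_b> ORDER BY <expr_a> DESC, <expr_b> DESC) AS c16
-- Ordering within the partition is irrelevant here, so a constant is enough:
ROW_NUMBER() OVER (PARTITION BY <expr_a>, <expr_b> ORDER BY 1) AS c16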

And you might not need them at all, because they're only used in the outer COUNT(DISTINCT ...), where IMHO they're unnecessary. Try count(distinct D1.c17); this should return the same result.
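
Concretely, the counting pattern in the inner select list would change from the ROW_NUMBER-based version to a direct distinct count:

-- Posted pattern: count c17 only on the rows where the ROW_NUMBER (c16) equals 1
COUNT(DISTINCT CASE D1.c16 WHEN 1 THEN D1.c17 ELSE NULL END) AS c1
-- Suggested simplification: count the distinct c17 values directly
COUNT(DISTINCT D1.c17) AS c1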

Enthusiast

Re: What is meant by STAT FUNCTION in Teradata

Hello Dnoeth,

Do you think I am missing any stats here? I checked, and all the stats are up to date.

Do I need to change the query, or can it be tuned without changing it?

Previously this query was running fine. Please help me tune this query.

Regards,

Teradata_User1

Junior Contributor

Re: What is meant by STAT FUNCTION in Teradata

What do you mean by "Previously this query was running fine"?

Exactly the same query, similar data?

Can you check if DBQL step/explain information is available for the previous run? (See the sketch after these questions.)

Is the estimated number of rows close to the actual number?

What's the size of your system and what's your TD release?
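
If DBQL step logging was enabled for the earlier run, a query along these lines can compare estimated and actual row counts per step (a sketch using the standard DBC.QryLogV / DBC.QryLogStepsV views; the QueryText filter is only a placeholder and needs to be adjusted to locate the old run):

SELECT q.StartTime,
       s.StepLev1Num,
       s.StepName,
       s.EstRowCount,
       s.RowCount,
       s.CPUTime
FROM DBC.QryLogV q
JOIN DBC.QryLogStepsV s
  ON  s.ProcID  = q.ProcID
  AND s.QueryID = q.QueryID
WHERE q.QueryText LIKE '%W_ORDER_F%'   -- placeholder: filter on user, time window, or query text
ORDER BY q.StartTime, s.StepLev1Num;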

Enthusiast

Re: What is meant by STAT FUNCTION in Teradata

As per our developer team, this query is not new to the system. They have been using it for a long time.

The TD release is 13.10 and the total size of the system is 36 TB. Of that 36 TB, 11 TB is in use.

Junior Contributor

Re: What is meant by STAT FUNCTION in Teradata

Ask the developer team if they have DBQL information or an old Explain for this query; then you can spot the differences and start investigating.