Grouping identifier

Database
Enthusiast

Grouping identifier

Hi
I am running below query...
sel * from employee e1 where salary=(sel max(salary) from employee e2 where e1.deptno=e2.deptno)

EXPLAIN IS

3) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from PRAKHAR1.e1 by way of an
all-rows scan with no residual conditions into Spool 2
(all_amps), which is built locally on the AMPs. Then we do a
SORT to order Spool 2 by row hash. The size of Spool 2 is
estimated with low confidence to be 26 rows. The estimated
time for this step is 0.01 seconds.
2) We do an all-AMPs SUM step to aggregate from PRAKHAR1.e2 by
way of an all-rows scan with no residual conditions, and the
grouping identifier in field 1. Aggregate Intermediate
Results are computed globally, then placed in Spool 5. The
size of Spool 5 is estimated with no confidence to be 5 rows.
The estimated time for this step is 0.04 seconds.
4) We do an all-AMPs RETRIEVE step from Spool 5 (Last Use) by way of
an all-rows scan into Spool 4 (all_amps), which is duplicated on
all AMPs. Then we do a SORT to order Spool 4 by row hash. The
size of Spool 4 is estimated with no confidence to be 10 rows.
5) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of an
all-rows scan, which is joined to Spool 4 (Last Use) by way of an
all-rows scan. Spool 2 and Spool 4 are joined using an inclusion
merge join, with a join condition of ("(DeptNo = DeptNo) AND
(Salary = Field_3)"). The result goes into Spool 1 (group_amps),
which is built locally on the AMPs. The size of Spool 1 is
estimated with index join confidence to be 26 rows. The estimated
time for this step is 0.04 seconds.

My query is
1)What is grouping identifier 1 ??
2)in 5th step salary=field_3 ....how salary in spool 4 is named as filed_3??what are field 1 and 2.....

Please reply ASAP
2 REPLIES
Teradata Employee

Re: Grouping identifier

Hi prakhar,

1) In this explain plan grouping identifier 1 is e1.deptno
2) I am not too very much sure about naming convention of intermediate fields, but here field_3 represents salary out of spool 5.

when you run this query in TD 12 it gives you a clear idea of what is happening.

3)We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from SAMPLES.e1 by way of an
all-rows scan with no residual conditions into Spool 2
(all_amps), which is redistributed by the hash code of (
SAMPLES.e1.deptno, SAMPLES.e1.sal) to all AMPs. Then we do a
SORT to order Spool 2 by row hash. The size of Spool 2 is
estimated with low confidence to be 2 rows (52 bytes). The
estimated time for this step is 0.01 seconds.
2) We do an all-AMPs SUM step to aggregate from SAMPLES.e2 by
way of an all-rows scan with no residual conditions
, grouping by field1 ( SAMPLES.e2.deptno). Aggregate
Intermediate Results are computed globally, then placed in
Spool 5. The size of Spool 5 is estimated with no confidence
to be 2 rows (54 bytes). The estimated time for this step is
0.05 seconds.
4) We do an all-AMPs RETRIEVE step from Spool 5 (Last Use) by way of
an all-rows scan into Spool 4 (all_amps), which is redistributed
by the hash code of (SAMPLES.e2.deptno, SAMPLES.e2.Field_3) to all
AMPs. Then we do a SORT to order Spool 4 by row hash. The size
of Spool 4 is estimated with no confidence to be 2 rows (38 bytes).
5) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of an
all-rows scan, which is joined to Spool 4 (Last Use) by way of an
all-rows scan. Spool 2 and Spool 4 are joined using an inclusion
merge join, with a join condition of ("(deptno = deptno) AND (sal
= Field_3)"). The result goes into Spool 1 (group_amps), which is
built locally on the AMPs. The size of Spool 1 is estimated with
no confidence to be 2 rows (68 bytes). The estimated time for
this step is 0.06 seconds.

This also clears step 1, where it is redistributing based on deptno and sal.

Thanks
Ankit
Enthusiast

Re: Grouping identifier

Thanks Ankit...

It very much clears all doubts, i think i have to start using TD12 ASAP