Distinct V/s Group By

Database
Enthusiast

Distinct V/s Group By

I have a table with 20 million records ...
I am doing various join operation on these set of records to get some distinct values...

when i use:

sel distinct col1,col2..... col12 from tbl

i am getting the required result
however if i give

sel col1,col2..... col12
from tbl
Group by 1,2,3.....12

I am getting an error message : no more spool space.
But i have heard that Group By takes less space than Distinct.
Please let me know how actually these two are working.
3 REPLIES
Enthusiast

Re: Distinct V/s Group By

Hi Goutham,

SELECT DISTINCT x FROM TABLE1;
or
SELECT x FROM TABLE1 GROUP BY x;

Teradata uses two different methods to produce the result sets of above queries, However the performance varies dramatically depending on the data.

SELECT DISTINCT is designed for data sets that are nearly unique to begin with and works by sorting the entire intermediate spool file and discarding duplicate values.
SELECT/GROUP BY is designed for data sets that have relatively few unique values and works by performing an AMP local grouping operation and then merging the partial result sets for final processing.

Hope this clears your doubt.

Regards,
Balamurugan
Enthusiast

Re: Distinct V/s Group By

Please analyse the Explain Plan of both the queries to get more information.

I feel the Table could be heavily skewed.

Regards,
Annal T
Enthusiast

Re: Distinct V/s Group By

Got it ...
Thanks a lot .....