Union vs Group by

Database
Junior Supporter

Union vs Group by

Hi,

I have a question for Union vs group by.

I need to remove duplicates from financial.checking_tran using union.

sel tran_id,cust_id,acct_nbr from 

financial.checking_tran

union  

sel '0' as tran_id,'0' as cust_id, max(1) as acct_nbr from 

financial.checking_tran

group by 1,2

The other option is using:


sel tran_id,cust_id,acct_nbr, count(*) as mcount from 

financial.checking_tran

group by 1,2,3

where mcount<=1

From a performnce point, which is better ?

3 REPLIES
Senior Supporter

Re: Union vs Group by

The standard answer is that it can depend on your data, the PI and configuration.

So easiest is to run the different options and check the DBQL.

You also forgot the set option of a temp table. Or the row_number() over...=1 posibility.

In case you have problems to interpret the DBQL numbers share the results.

Teradata Employee

Re: Union vs Group by

If you want to find the unique set of <tran_id,cust_id,acct_nbr> then use DISTINCT.

SELECT DISTINCT tran_id,cust_id,acct_nbr FROM financial.checking_tran;

Re: Union vs Group by

GROUP BY is required if you're aggregating data, but in many cases, DISTINCT is simpler to write and read if you aren't aggregating data.