Distinct with order by taking long time to run

Database
Enthusiast

Distinct with order by taking long time to run

Hi,

We are facing an issue in reports.Its simple distinct with order by clause which is taking much more time versus simple distinct

 

select distinct "AR_DIM_ARNG"."SHARED_NATIONAL_CR_FLAG" "SNC_Flag"
from "RISK_VIEWS"."ARNG_BV" "AR_DIM_ARNG"
order by 1 asc;

 ---------------8 mins 

select distinct "AR_DIM_ARNG"."SHARED_NATIONAL_CR_FLAG" "SNC_Flag"
from "RISK_VIEWS"."ARNG_BV" "AR_DIM_ARNG"

--------4 secs.

Can anybody explain why we facing this and how to tune it

1 REPLY
Teradata Employee

Re: Distinct with order by taking long time to run

Do you have stats collected on the corresponding column in the underlying table?

 

There are broadly two ways to do a DISTINCT operation:

  • Get all the detail data, sort it, and then remove duplicates (best when there are few duplicates)
  • Do a GROUP BY aggregation step (best when there are many duplicates)

The ORDER BY forces a sort, and the optimizer mistakenly thinks it will be cheaper to use the first method since it has to sort anyway. (It isn't recognizing that there are relatively few distinct values / many duplicates.)