Percentile calculation spooling out

General
Fan

Percentile calculation spooling out

Hi,

Need to calculate percentile 90 in teradata and I am using the below query, it's working for 100 Million records, but spooling out when the volume increases. we are expecting 1Billion records and need to tune or change the query to calculate P90 at overall level. Percentile 90 should be calculated on TRAN_AM at all the records level. We are using TD14.

 

SyntaxEditor Code Snippet

 SELECT
  CAST(1 AS BYTEINT) AS DUMMY_ID,
  TRAN_AM AS PCT90_AMT
FROM DBNAME_TEST.TRANS_AMT
WHERE     TRAN_DATE BETWEEN DATE '2016-02-01' AND DATE '2016-03-30'
QUALIFY  ROW_NUMBER() OVER (PARTITION BY DUMMY_ID ORDER BY PST_AM)
  = CAST(0.90 * COUNT(*) OVER (PARTITION BY DUMMY_ID) + 0.999999999999999 AS INT)

 

2 REPLIES
Highlighted
Teradata Employee

Re: Percentile calculation spooling out

Why introduce a constant PARTITION BY value?

 

 SELECT
  TRAN_AM AS PCT90_AMT
FROM DBNAME_TEST.TRANS_AMT
WHERE     TRAN_DATE BETWEEN DATE '2016-02-01' AND DATE '2016-03-30'
QUALIFY  ROW_NUMBER() OVER (ORDER BY PST_AM)
  = CAST(0.90 * COUNT(*) OVER () + 0.999999999999999 AS INT)
Fan

Re: Percentile calculation spooling out

I think it's not needed. Is there any other way to calculate percentile without ranking all the records ?