qualify rank() over (partition) - question

Database

qualify rank() over (partition) - question

Hello, i need some help please,

I have a dataset similar to the following:

inr rd dt    c1 c2

1 1 '2000-01-01 10:10:10 1111' 1 1

1 1 '2000-01-01 10:10:10 2222' 2 2

1 1 '2000-01-01 10:10:10 3333' 3 3

1 1 '2000-01-01 10:10:10 3333' 4 4

1 2 '2000-01-01 10:10:10 1111' 1 1

1 2 '2000-01-01 10:10:10 2222' 2 2

1 2 '2000-01-01 10:10:10 3333' 3 3

1 2 '2000-01-01 10:10:10 4444' 4 4

1 1 '2000-01-01 10:10:10 1111' 1 1

1 1 '2000-01-01 10:10:10 2222' 2 2

1 1 '2000-01-01 10:10:10 3333' 3 3

1 1 '2000-01-01 10:10:10 3333' 3 3

The result set should look like the following:

inr rd dt    c1 c2

1 1 '2000-01-01 10:10:10 3333' 4 4

1 2 '2000-01-01 10:10:10 4444' 4 4

1 1 '2000-01-01 10:10:10 3333' 3 3

Any suggestions would be appreciated.

7 REPLIES
N/A

Re: qualify rank() over (partition) - question

It's hard to guess without any details about the rules you want to implement.

select * from tab
qualify rank() over (partition by inr, rd order by dt desc, ct1 desc, ct2 desc) = 1

Dieter

Re: qualify rank() over (partition) - question

Sorry for confusion..

I have data like below..

here I want to pick one row for each inr & rd for latest timestamp(dt col)..

if we have multiple rows for latest timestamp for that inr & rd then I want to get the latest row id rows for that latest(max) dt for each inr & rd...

like below highlighted rows are my output for this scenario

table contains 5+ billion rows, 100+ columns, for specific process it will fetch only 1MM rows

any suggestion in performance stand point..








inr rd dt c1 c2 c100
1 1 2000-01-01 10:10:10 1111 1 1  
1 1 2000-01-01 10:10:10 2222 2 2  
1 1 2000-01-01 10:10:10 3333 3 3  
1 1 2000-01-01 10:10:10 4444 4 4  
1 2 2000-01-01 10:10:10 1111 1 1  
1 2 2000-01-01 10:10:10 4444 4 4  
1 2 2000-01-01 10:10:10 4444 5 5  
1 2 2000-01-01 10:10:10 4444 5 5  
1 2 2000-01-01 10:10:10 2222 2 2  
1 2 2000-01-01 10:10:10 3333 3 3  
1 3 2000-01-01 10:10:10 1111 1 1  
1 3 2000-01-01 10:10:10 3333 3 3  
1 3 2000-01-01 10:10:10 3333 4 4  
1 3 2000-01-01 10:10:10 3333 5 5  
1 3 2000-01-01 10:10:10 2222 2 2  

Re: qualify rank() over (partition) - question

Sorry for confusion..

I have data like below..

here I want to pick one row for each inr & rd for latest timestamp(dt col)..

if we have multiple rows for latest timestamp for that inr & rd then I want to get the latest row id rows for that latest(max) dt for each inr & rd...

like below highlighted rows are my output for this scenario

table contains 5+ billion rows, 100+ columns, for specific process it will fetch only 1MM rows

PI for inr & rd columns, PPI for inr column

any suggestion in performance stand point..

inr rd dt (date & timestamp upto micro sec)  c1 c2 … c100

1   1  2000-01-01 10:10:10 1111              1  1 … 

1   1  2000-01-01 10:10:10 2222              2  2 … 

1   1  2000-01-01 10:10:10 3333              3  3 … 

1   1  2000-01-01 10:10:10 4444              4  4 … 

1   2  2000-01-01 10:10:10 1111              1  1 … 

1   2  2000-01-01 10:10:10 4444              4  4 … 

1   2  2000-01-01 10:10:10 4444              5  5 … 

1   2  2000-01-01 10:10:10 4444              5  5 … 

1   2  2000-01-01 10:10:10 2222              2  2 … 

1   2  2000-01-01 10:10:10 3333              3  3 … 

1   3  2000-01-01 10:10:10 1111              1  1 … 

1   3  2000-01-01 10:10:10 3333              4  4 … 

1   3  2000-01-01 10:10:10 3333              4  4 … 

1   3  2000-01-01 10:10:10 3333              5  5 … 

1   3  2000-01-01 10:10:10 2222              2  2 … 

Re: qualify rank() over (partition) - question

I am pasting the output here as above highlighted are not showing highlighted rows properly

1 1 2000-01-01 10:10:10 4444 4 4 …

1 2 2000-01-01 10:10:10 4444 5 5 …

1 3 2000-01-01 10:10:10 3333 5 5 …

N/A

Re: qualify rank() over (partition) - question

Did you try the query i posted, this should work as expected.

But performnace on a 5 billion rows table will be probably poor when you actually need all those 100 columns (all data will be put in spool twice). 

It might be better to run the QUALIFY on the neccessary columns and then join back to get the remaining columns.

Dieter

Re: qualify rank() over (partition) - question

Sorry for late response, I tried but it didn’t work as expected and expecting below

create

table ppi (a int, b date, c int, d int);

insert into ppi values (1, CURRENT_TIMESTAMP, 1,1);

insert into ppi values (1, CURRENT_TIMESTAMP, 1,2);

insert into ppi values (1, CURRENT_TIMESTAMP, 1,3);

insert into ppi values (2, CURRENT_TIMESTAMP, 1,3);

insert into ppi values (2, CURRENT_TIMESTAMP, 1,2);

insert into ppi values (2, CURRENT_TIMESTAMP, 1,1);

select * from ppi qualify rank() over (partition by a, c order by b desc) = 1;

a              b                          c              d

1              2013-07-05           1              1

1              2013-07-05           1              2

1              2013-07-05         1               3

2              2013-07-05           1              3

2              2013-07-05           1              2

2              2013-07-05          1              1

select * from ppi qualify rank() over (partition by a,c order by b desc, d desc) = 1;

a              b                          c              d

1              2013-07-05           1              3

2              2013-07-05           1              3

Below is expected output what I need if I use like qry 1 (table has c1 to c100 other than a,b,c,d, but those c1 to c100 I just want to carry forward for selected rows and i don’t want to add in the partition with desc to avoid performance issue), any suggestion

 

a              b                          c              d

1              2013-07-05           1              3

2              2013-07-05           1              1

Teradata Employee

Re: qualify rank() over (partition) - question

Hei man, let's try something similar..

sel * from ppi qualify  row_number()  over(order by a asc,  d desc)=1

union

sel * from ppi qualify  row_number()  over(order by a desc, d asc)=1

 bye ;)