Need help on using Median on more than one column

Database
Enthusiast

Need help on using Median on more than one column

Hello,

 

I have a table in the below format:

        

ID Value 1 Value 2 Value 3 Value 4

15241
2101240
36291
41522407
55942

 

 

How can I use the median function on more than 1 column?

 

For e.g., median(value1, value2, value3, value4) wont work.

 


Accepted Solutions
Highlighted
Junior Contributor

Re: Need help on using Median on more than one column

-- Cross Joined for normalization 
CREATE TABLE OneToFour(i INT);
INSERT INTO OneToFour(1);
INSERT INTO OneToFour(2);
INSERT INTO OneToFour(3);
INSERT INTO OneToFour(4);

SELECT id, Avg(val)
FROM
 (
   SELECT
      id
     ,CASE i -- normalize columns to rows 
         WHEN 1 THEN c1
         WHEN 2 THEN c2
         WHEN 3 THEN c3
         WHEN 4 THEN c4
      END AS val
   FROM vt CROSS JOIN OneToFour 
   QUALIFY Row_Number() Over (PARTITION BY id ORDER BY val) BETWEEN 2 AND 3 
 ) AS dt
GROUP BY id

But for exactly four columns you can do a simple:

Average all values minus the least and the greatest

(c1 +c2 +c3 +c4 -Greatest(c1,c2,c3,c4) -Least(c1,c2,c3,c4))/2.0

 

 

1 ACCEPTED SOLUTION
7 REPLIES
Junior Contributor

Re: Need help on using Median on more than one column

Which result do you expect for that calculation?

Enthusiast

Re: Need help on using Median on more than one column

Median function works like below,

If we consider row 1, there are four values in it 1 2 4 5 (ordered ascending).
Median would be 2 + 4= 6/2=3

The existing median function works for a single column at a time just like any other aggregate function. I want to find out a way, median can be calculated across 4 columns at a time for each ID.

Least and Greatest can be used instead of Min and Max in such cases. I want a similar result but for median.


Thanks
Junior Contributor

Re: Need help on using Median on more than one column

Well, Median is an aggregate function, if you want to calculate it on a denormalized table you should: #1 normalize it, #2 apply a ROW_NUMBER and then #3 average the values of rownumber 2&3.

 

Btw, is the calculation actually based on 4 columns?

Enthusiast

Re: Need help on using Median on more than one column

Thanks ..But can u elaborate a little more..With an example maybe .?
Yes the calculation is based only on 4 columns
Enthusiast

Re: Need help on using Median on more than one column

Row number is row wise right? I need the average of the 2 sorted column values..
Highlighted
Junior Contributor

Re: Need help on using Median on more than one column

-- Cross Joined for normalization 
CREATE TABLE OneToFour(i INT);
INSERT INTO OneToFour(1);
INSERT INTO OneToFour(2);
INSERT INTO OneToFour(3);
INSERT INTO OneToFour(4);

SELECT id, Avg(val)
FROM
 (
   SELECT
      id
     ,CASE i -- normalize columns to rows 
         WHEN 1 THEN c1
         WHEN 2 THEN c2
         WHEN 3 THEN c3
         WHEN 4 THEN c4
      END AS val
   FROM vt CROSS JOIN OneToFour 
   QUALIFY Row_Number() Over (PARTITION BY id ORDER BY val) BETWEEN 2 AND 3 
 ) AS dt
GROUP BY id

But for exactly four columns you can do a simple:

Average all values minus the least and the greatest

(c1 +c2 +c3 +c4 -Greatest(c1,c2,c3,c4) -Least(c1,c2,c3,c4))/2.0

 

 

Enthusiast

Re: Need help on using Median on more than one column

 

Awesome ! Thanks a lot Dieter :)

Such a simple yet effective way...Now I have started to scratch my head and wonder why couldnt I think of it...

I am new to the group..I guess I am gonna bug u guys a lot more ;)