Help on Below SQL

Database
Enthusiast

Help on Below SQL

drop table retail.distance;

drop table  retail.distance1; 

create table retail.distance

 (

 source_Place varchar(20),

 destination_Place varchar(20),

 distance  integer

 );

 insert into retail.distance values ('Pune', 'Mumbai',100);

 insert into retail.distance values ('Mumbai', 'Pune',100);

 insert into retail.distance values ('Pune', 'Delhi',300);

 insert into retail.distance values ('Delhi', 'Pune',300);

 insert into retail.distance values ('Delhi', 'JAmmu',500);

 insert into retail.distance values ('Delhi', 'Agra',600);

 insert into retail.distance values ('Agra', 'Delhi',600);

insert into retail.distance values ('Agra', 'Bhopal',700);

/*

   source_Place destination_Place distance

1 Mumbai Pune 100

2 Pune Mumbai 100

3 Delhi Pune 300

4 Pune Delhi 300

5 Delhi JAmmu 500

6 Delhi Agra 600

7 Agra Delhi 600

8 Agra Bhopal 700

*/

Now in above result set we can see clearly row 1 & 2 are duplicate(i.e. same information in both row).

In same way row 3 & 4, and 6 & 7 are duplicate.

I have to remove these duplicate and want result set like below.

/*

source_Place destination_Place distance

1 Mumbai Pune 100

--2 Pune Mumbai 100

3 Delhi Pune 300

--4 Pune Delhi 300

5 Delhi JAmmu 500

6 Delhi Agra 600

--7 Agra Delhi 600

8 Agra Bhopal 700

*/

Desire output should have either row 1,3,5,6,8 or 2,4,5,7,8.

Could any of expert help me on above.

Tags (1)
3 REPLIES
Senior Apprentice

Re: Help on Below SQL

Hey, this is one of the labs I do in my trainings ;-)

There are several solutions to this problem, their performance mainly depends on the number of duplicates. Usually a good one is a ROW_NUMBER:

SELECT
source_Place,
destination_Place,
distance
FROM distance
QUALIFY
ROW_NUMBER()
OVER (PARTITION BY
CASE WHEN destination_Place < source_Place THEN destination_Place ELSE source_Place END,
CASE WHEN destination_Place < source_Place THEN source_Place ELSE destination_Place END
ORDER BY source_Place) = 1

Enthusiast

Re: Help on Below SQL

Thanks Dieter for your quick response...:)

Could you please below query once and Please suggest to optimize this query.

SELECT SOURCE_PLACE,DESTINATION_PLACE,distance 
FROM RETAIL.DISTANCE a
WHERE (SOURCE_PLACE,DESTINATION_PLACE)
NOT IN (
SELECT DESTINATION_PLACE,SOURCE_PLACE FROM RETAIL.DISTANCE b )

UNION

SELECT SOURCE_PLACE,DESTINATION_PLACE,distance
FROM RETAIL.DISTANCE a
WHERE (SOURCE_PLACE,DESTINATION_PLACE)
IN (
SELECT DESTINATION_PLACE,SOURCE_PLACE FROM RETAIL.DISTANCE b
where a.SOURCE_PLACE < b.SOURCE_PLACE );

Thanks,

Abhijeet

Senior Apprentice

Re: Help on Below SQL

Hi Abhijeet,

simply compare Explain and actual cpu/io in the query log.

Better performing than NOT IN/UNION should be this:

SELECT DISTINCT source_Place, destination_Place, distance
FROM distance t1
WHERE NOT EXISTS(
SELECT * FROM distance t2
WHERE t1.source_Place = t2.destination_Place
AND t1.destination_Place = t2.source_Place
AND t1.source_Place > t2.source_Place
);