finding links between two col values..

Database
BPP
Enthusiast

finding links between two col values..

from below data set

Cola Colb

1 20

20 31

31 40

40 5

10 11

10 120

120 13

100 201

201 210

201 103

1000 465

I would like know that values 1,20,31,40 and 5 are related.  100,201,210 and 103 are related and so on.  How do I do this in sql?? thanks in advacne..

8 REPLIES
Enthusiast

Re: finding links between two col values..

Will this be ok for you?

select a.* from test_table1 a, test_table1 b where a.cola=b.colb

Enthusiast

Re: finding links between two col values..

Hi

In my understanding you trying to find the parent-child relation. Look for Recursive Query (using Level in Root).

Thanking You

Santanu

BPP
Enthusiast

Re: finding links between two col values..

Santanu84: thought about it but dont know where I am going with it..  If someone can play with this code and try it out that would be very helpful...

CREATE VOLATILE TABLE val

(col1 INTEGER,

col2 INTEGER

)

PRIMARY INDEX(col1, col2)

ON COMMIT PRESERVE ROWS

DEL FROM val;

insert into val( 1 , 20 );

insert into val( 20 , 31 );

insert into val( 31 , 40 );

insert into val( 40 , 5 );

insert into val( 10 , 11 );

insert into val( 10 , 120 );

insert into val( 120 , 13 );

insert into val( 100 , 201 );

insert into val( 201 , 210 );

insert into val( 201 , 103 );

insert into val( 1000 , 465 );

WITH RECURSIVE match(col1,col2,levl) AS

(

SELECT col1,col2,0 lvl

FROM val

UNION ALL 

SELECT match.col1, val.col2, match.levl+1

FROM val, match

WHERE match.col1=val.col2 AND match.levl<10


SELECT DISTINCT col1, col2 FROM match ORDER BY 1,2;
Junior Contributor

Re: finding links between two col values..

You just have to find the root level, then i's easy :-)

WITH RECURSIVE match(grp, col1, col2, levl) AS
(
SELECT col1, col1, col2, 0 AS levl
FROM val AS t1
WHERE NOT EXISTS
(
SELECT * FROM val AS t2
WHERE t2.col2 = t1.col1
)

UNION ALL

SELECT match.grp, val.col1, val.col2, levl + 1
FROM val JOIN match
ON match.col2 = val.col1
WHERE levl <= 10
)
SELECT *
FROM match
ORDER BY 1,2;

Hopefully there are no circular references in the data, better keep the filter on levl.

BPP
Enthusiast

Re: finding links between two col values..

thank you Dieter. grouping them is key for me and you made it look so simple. I am eliminating circular references, I do not need them.

This is one of the many knots I got to resolve.. thanks again!!  

Thanks Raja and Satanu_84.

BPP
Enthusiast

Re: finding links between two col values..

Dieter, may be I did not understand when you said circular reference and may be it found me :-((  

here is real dataset..  we got case 1) a= b and b = a and then  case 2) we got a = b, a= c , a=d , e=c, c=e.

in correlated query changed to exists then it is kind of working but grouping for each value.  Not exists is not workig as rows are filtered out.  

Any more thoughts?  thank you.. 

Insert into val( 531,691 , 1,997,292 );

 insert into val( 531691 , 1997292 ); 

 insert into val( 1515094 , 1515093 ); 

 insert into val( 1515094 , 5122156 ); 

 insert into val( 1515094 , 5122155 ); 

 insert into val( 1997292 , 5518168 ); 

 insert into val( 1997292 , 531691 ); 

 insert into val( 5518168 , 1997292 ); 

INSERT INTO val( 301 , 500 );

INSERT INTO val( 500 , 301 );

INSERT INTO val( 500 , 401 );

INSERT INTO val( 401 , 500 );

INSERT INTO val( 600 , 700 );

INSERT INTO val( 700 , 600 );

INSERT INTO val( 8000 , 700 );

INSERT INTO val( 700 , 8000 );

Junior Contributor

Re: finding links between two col values..

If there are cycles you need to detect them, the only way is to built a materialized path and check if the current value is already included. And if there's no way to find out the actual start values you must do all possibles paths and then find the minimum value:

WITH RECURSIVE match(grp, col1, col2, levl, PATH) AS
(
SELECT t1.col1, t1.col1, t1.col2, 0 AS levl,
'.' || CAST(t1.col1 AS VARCHAR(501)) || '.' AS PATH
FROM val AS t1

UNION ALL

SELECT match.grp, val.col1, val.col2, levl + 1,
PATH ||TRIM(val.col1) || '.' AS p
FROM val JOIN match
ON match.col2 = val.col1
AND levl < 50
WHERE match.PATH NOT LIKE '%.' || TRIM(val.col1) || '.%'
)
SELECT MIN(grp) AS grp, col1,col2, MAX(levl)
FROM match
GROUP BY col1,col2
ORDER BY 1,2;

Caution, CAST(t1.col1 AS VARCHAR(500)) should be large enough to hold the maximum path length, e.g. if there's a maximum of 50 values in a path and the maximum lengh of a value is 9 you need ((9+1) * 50)+1 = 501 characters and you must check for levl < 50.

Depending on the number of rows in the table and the maximum number of rows in a path this might result in a huge spool. In that case rewriting the recursion into a loop in a Stored Procedure might be helpful (you can filter after every loop) .

BPP
Enthusiast

Re: finding links between two col values..

Thanks Dieter.  Looks like it is grouping them correctly, I will have to test it out with our data.   Thanks for giving the directions If rows or its size would be a problem, I can try it out with a proc.  thanks once again.. you are simply briliant! :-)).