Search for first substring in string with condition

Database
Enthusiast

Search for first substring in string with condition

Any help on string search. No problem if recursive. I do not know how to do the following.

I want to find the first position of “b” in the string where “b” is not between 2 “a”’s. The select statement I have made returns the value 6 for row nr (Seq=1). I want it to return the value 17, the first b outside the two “a”. Is this possible?

CREATE MULTISET VOLATILE TABLE T1 (Seq INTEGER, Str VARCHAR(100)) ON COMMIT PRESERVE ROWS;

INSERT INTO T1 VALUES (1,'a----b----a-----b----b------');

INSERT INTO T1 VALUES (2,'b----b----b------');

INSERT INTO T1 VALUES (3,'---b--a--b--a--b------');

INSERT INTO T1 VALUES (4,'--b----b----b------');

SELECT Seq,

       Str,

       INDEX(Str,'b') AS idx,

       CASE

         WHEN Str LIKE '%b%' AND NOT Str LIKE '%a%b%a%' THEN INDEX(Str,'b')

         ELSE INDEX(Str,'b')

       END

FROM T1;

DROP TABLE T1;

Peter Schwennesen

6 REPLIES
Supporter

Re: Search for first substring in string with condition

which version are you on?

Supporter

Re: Search for first substring in string with condition

performance depend on your data volumes but at least this does the trick with your data ;-)

I used the volatile table to make it a bit more clearer 

First splitt your data into chars and keep only  the 'a' and 'b'

create volatile table t2 as
(
select seq,
str,
id,
substr(str,id,1) as v
from t1
cross join (select current_date - calendar_date as id from sys_calendar.calendar where id between 1 and 100) i
where
id <= characters(str)
and v in ('a','b')
) with data
primary index (seq)
on commit preserve rows
;

the check for 'aba' with olap or keep the b if it is the fist

and finally aggregate everything to keep only the first.

select seq, str, min(id)
from
(
select seq,
str,
id,
v
from t2
qualify
(min(v) over (partition by seq order by id rows between 1 preceding and 1 preceding)
|| v
|| min(v) over (partition by seq order by id rows between 1 following and 1 following) <> 'aba'

and v = 'b'
) or (
v = 'b' and id = min(id) over (partition by seq)
)
) as t
group by 1,2;

a bit of bruce force but as mentioned it does the trick

Junior Contributor

Re: Search for first substring in string with condition

Hi Peter,

based on the data you provided this should work (i just use POSITION instead of INDEX):

SELECT Seq, Str,
CASE
WHEN POSITION('a' IN Str) = 0
OR POSITION('a' IN Str) > POSITION('b' IN Str)
OR Str NOT LIKE '%a%b%a%'
THEN POSITION('b' IN Str)
ELSE POSITION('b' IN SUBSTR(Str, POSITION('b' IN Str) + 1)) + POSITION('b' IN Str)
END
FROM T1

Of course your specs are a bit vague :-)

Are following cases possible and what's the expected result?

'aba--aba---b'

'abba---b'

'a-ab-a---b'

This might require recursion or maybe a regular expression in TD14 using REGEXP_INST.

Dieter

Enthusiast

Re: Search for first substring in string with condition

Hi, thanks for the answers. I look at it.

Yes everything is possible.

Problem is: find first "b" not between two "a". There therefore (“-“ just represent random number of char)

ababbb should return me 4

a-b-a-bbb should return 7

baba should return 1

The original problem is that I am trying to clean out comments from my code. So in real life I am not looking for A’s and B’s, but “/*”, and “*/” and “—“.

But I ran into the problem that in some places the /* do not represent a comment, but is part of a text string. Therefore:

SELECT ‘SOME TEXT’, /* this is …. Here I want to find the position of “/*”.

But:

SELECT ‘SOME /* OTHER TEXT’, /*

Here I should not consider the first /* as a comment but only the second one.

Junior Contributor

Re: Search for first substring in string with condition

Hi Peter,

in cases like this it's better to post the real life question instead of an overly simplified one :-)

Seems like you try to clean DBQL data from comments, this will be a tough one...

What's your TD release?

Do you have at least oTRANSLATE or oREPLACE installed on your system? 

Best case would probably be REGEXP_REPLACE, a regular expression should be able to do this task.

DIeter

Supporter

Re: Search for first substring in string with condition

I agree with Dieter. But I also don't think this will work within the DB - at least not for the big SQL's which spann multiple lines in qrylogSQL.

I use a java programm to do this which parses the SQL from left to right and checks what it find first (e.g. ',/*,--) and then checks for the corrosponding closing  part (',*/.\n).

Easy and efficient...