REGEXP_REPLACE Expression

Database
Enthusiast

REGEXP_REPLACE Expression

Hi,


I have comments data in a column like 'ADDED AB SEGMENT FROM *ABCDEF FOR BAG'.  I am trying to use REGEXP_REPLACE to return 'ADDED AB SEGMENT FROM *ZZZZZZ FOR BAG' instead as the common/same value in all the occurances of this comment in the column. Can you please help me on this? The *ABCDEF occurs usually in the same position for all comments in the column.


Thanks!

6 REPLIES
Teradata Employee

Re: REGEXP_REPLACE Expression

regexp_replace('ADDED AB SEGMENT FROM *ABCDEF FOR BAG', 'ABCDEF', 'ZZZZZZ')
returns:
ADDED AB SEGMENT FROM *ZZZZZZ FOR BAG


but: oreplace('ADDED AB SEGMENT FROM *ABCDEF FOR BAG', 'ABCDEF', 'ZZZZZZ')

does exactly the same thing. If you don't need regular expressions, oreplace would be more efficient.

Enthusiast

Re: REGEXP_REPLACE Expression

Thanks sir! this worked!

Enthusiast

Re: REGEXP_REPLACE Expression

I have a REGEXP_REPLACE written for the other values in the same column which is 

REGEXP_REPLACE(<my column>, ' *([0-9]{1,2}(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)|[0-9]{1,4}/[0-9]{1,2}(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)|[0-9]{1,2}(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)[0-9]{1,2}|[0-9]{1,2}(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC).|[0-9]{1,2}(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)[0-9]{1,4}|[0-9]{1,2}(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC) [0-9]{1,4}|[A-Z]{1,6}|[A-Z]{1,6} [0-9]{1,2}(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC))$','',1,1,'i')

i need to combine/include the regular expression you had suggested i.e.regexp_replace('ADDED AB SEGMENT FROM *ABCDEF FOR BAG', 'ABCDEF', 'ZZZZZZ')  as well to my original regexp_replace...Can you please help me with that?

Teradata Employee

Re: REGEXP_REPLACE Expression

Among the many expressions in here is [A-Z]{1-6}, which means that it will erase the first 1 to 6 contiguous letters in the string if it doesn't find something like a date, right?  So I don't understand how you can use this on the same string.  But if this does work on the same string (perhaps I misinterpret the regex), then you might consider nesting the replaces.  For instance,

OReplace( Regexp_Replace( <my column>, ' *----long expression----',"",1,1,'i' ), 'ABCDEF', 'ZZZZZZ' )

Enthusiast

Re: REGEXP_REPLACE Expression

Thank you it worked. I have few values in my data like 'REVIEWED BY DATA ANALYST XXXXXXXXXXXX' and 'FF XX XXXXXXXXX'. I am trying to write a REGEXP_REPLACE for this to return just 'REVIEWED BY DATA ANALYST' and 'FF XX'. Just the 1st four words in thr first example and the first two words in the second example and ignore the rest. Can you please help?

Highlighted
Teradata Employee

Re: REGEXP_REPLACE Expression

To get the first two words of a string, you could start at the beginning of the string and take everything up to the second space in the string:

          substring( <column> from 1 for (instr(<column>, ' ',1,2) - 1 )

Or if you think of it as weeding out the extra X's:

          regexp_replace( <column>, 'X{3,}', '' )