Using the rexexp_substr function in TD14

Database
Teradata Employee

Using the rexexp_substr function in TD14

Hi All,

I am trying to use the regexp_substr function in TD14 (Version 14.00.00.01).

I want to be able to return a specified group result (in the simple example below, I expect 'Alex' to be returned, specified by the subexpr_arg being set to 1. (i.e. $1)

select 'Alex 1Alex, Geddy 2Lee, Neil 3Peart' as thesql

,regexp_substr( thesql -- source_string

,'(\w+)\s+([0-9]+)\1' -- regexp_string

,1 -- position_arg

,1 -- occurrence_arg

,'i' -- match_arg (Ignore case)

,1 -- subexpr_arg

) as theresult;

According to the TD14 Manual (Page 786 of the SQL Functions, Operators, Expressions, and Predicates manual) indicates to me that this should work.

Subexpr_arg is defined as "a numeric argument (a nonnegative integer from 0 through 9). subexpr_arg indicates the subexpression in the pattern that REGEXP_SUBSTR is to return."

However, upon closer examination of the TD_SYSFNLIB database contents, I can't see an overload of the function that will let this syntax be used:

I can see the following function defintions:

REPLACE FUNCTION "TD_SYSFNLIB"."REGEXP_SUBSTR" (

"source" /*Data Type Not Available*/,

"pattern" /*Data Type Not Available*/)

RETURNS /*Data Type Not Available*/

SPECIFIC "regexp_substr2"

LANGUAGE C

NO SQL

PARAMETER STYLE I

DETERMINISTIC

CALLED ON NULL INPUT

EXTERNAL

;

REPLACE FUNCTION "TD_SYSFNLIB"."REGEXP_SUBSTR" (

"source" /*Data Type Not Available*/,

"pattern" /*Data Type Not Available*/,

"startPosition" NUMBER(*),

"occurrence" NUMBER(*))

RETURNS /*Data Type Not Available*/

SPECIFIC "regexp_substr4"

LANGUAGE C

NO SQL

PARAMETER STYLE I

DETERMINISTIC

CALLED ON NULL INPUT

EXTERNAL

;

REPLACE FUNCTION "TD_SYSFNLIB"."REGEXP_SUBSTR" (

"source" /*Data Type Not Available*/,

"pattern" /*Data Type Not Available*/,

"startPosition" NUMBER(*))

RETURNS /*Data Type Not Available*/

SPECIFIC "regexp_substr3"

LANGUAGE C

NO SQL

PARAMETER STYLE I

DETERMINISTIC

CALLED ON NULL INPUT

EXTERNAL

;

And

REPLACE FUNCTION "TD_SYSFNLIB"."REGEXP_SUBSTR" (

"source" /*Data Type Not Available*/,

"pattern" /*Data Type Not Available*/,

"startPosition" NUMBER(*),

"occurrence" NUMBER(*),

"matchOption" /*Data Type Not Available*/)

RETURNS /*Data Type Not Available*/

SPECIFIC "regexp_substr"

LANGUAGE C

NO SQL

PARAMETER STYLE I

DETERMINISTIC

CALLED ON NULL INPUT

EXTERNAL

;

Also, the following query returns no rows:

select *

from dbc.columns

where databasename = 'TD_SYSFNLIB'

and tablename = 'REGEXP_SUBSTR'

and columnname = 'subexpr_arg';

So, subexpr_arg does not exist as a parameter in the function in TD_SYSFNLILB.

Has anyone successfully used the regexp_substr function in TD14 to return a group (subexpr_arg) result?

5 REPLIES
Teradata Employee

Re: Using the rexexp_substr function in TD14

Sorry for the typo in teh herading of this.. should be regexp_substr.

Anyway, I have tried this in TD14 (14.00.00.17) now as well, still no sign of the function defined in the documentation!

Re: Using the rexexp_substr function in TD14

Hi Joe,

There was a discrepancy between the documentation and the actual code.  The fix is in the work.  In the meantime, regexp_replace can be used to simulate this by specifying the starting position and ending position in the pattern with the characters '^' and '$' respectively so that the entire matching string will be replaced by the desired group/sub expression.  Note that the group/sub expression must be preceded by a backlash.

Example:

regexp_replace( thesql -- source_string

                     ,'^SELECT\s+([\w]+)\.?([\w]+)(?:[\s])(?:AS)?\s+(\w+)\s+FROM\s+(\w+)\.(\w+)\s+(?:AS)?\s+\1$' --pattern

                     ,'\1'   --the desired group to be returned

                     ,1   -- position_arg

                     ,1   -- occurrence_arg

                     ,'i' -- match_arg (Ignore case)

                     ) as Theresult;

 

Enthusiast

Re: Using the rexexp_substr function in TD14

Looks like it has been 3+ years since the fix was in the works.  Has their been any known progress on the Capturing groups??

Senior Apprentice

Re: Using the rexexp_substr function in TD14

I just checked TD15.10, there's a UDF named TD_SYSFNLIB.REGEXP_SUBSTR_GPL with 6 parameters and #6 is named subexp.

Don't know when it was implemented and why it got a new name, strange.

Of course you could add a REGEXP_SUBSTR SQL UDF to syslib which simply calls this one.

Enthusiast

Re: Using the rexexp_substr function in TD14

Thanks Deiter!  You dont even know how exiciting this is this is.

 

For anyone that wants to see a working example or two:

 

select
'///R/ABC123///R/DEF456///' as string_to_search
,REGEXP_SUBSTR_GPL(string_to_search, '/r/([^/-]+)/*/r/([^/-]+)',1,1,'i',1) as group_1_string --Yields ABC123
,REGEXP_SUBSTR_GPL(string_to_search, '/r/([^/-]+)/*/r/([^/-]+)',1,1,'i',2) as group_2_string --Yields DEF456