Teradata 15.10 REGEXP_SIMILAR Expression Not Matching As Expected

Database
Enthusiast

Teradata 15.10 REGEXP_SIMILAR Expression Not Matching As Expected

Hi Folks,

 

Long time listener, first time caller...

 

I'm having some problems with a Regular Expression that works as expected on Oracle but is not matching on Teradata 15.10. I'm trying to match invalid Italian tax IDs where the user has tried to provide a dummy value consisting of the ISO country code ('IT') and a series of the same digit 10 or more times. For example, IT99999999990 is my test value. The code that I am having problems with is as follows:

 

REGEXP_SIMILAR('IT99999999990','^IT(\d)\1{9,}','i')

 

The regular express should, starting at the beginning of the string, match the 'IT' country code, check that the next character is a digit, and then that the next 9 or more digits are the same as the first digit.

 

This should return 1 (match) but instead it's returning 0 (no match). I'm therefore wondering if the syntax I'm using is not understood by Teradata.

 

Does anyone have any ideas where I am going wrong and how I should change my regex so that it matches the sample string?

13 REPLIES
Teradata Employee

Re: Teradata 15.10 REGEXP_SIMILAR Expression Not Matching As Expected

Apparently Teradata does not recognize the "\1" back-reference to the parenthetical expression.  I can't think of another way to express the pattern you describe without a complex (ugly) CASE operation.  I've even tried using '^IT(0{9,})|(1{9,})|(2{9,})|(3{9,})|(4{9,})|(5{9,})|(6{9,})|(7{9,})|(8{9,})|(9{9,})', and it seems to me something like that should work, except that '^IT9{9,}' doesn't even match, nor does '^IT[9]{9,}'.  The only thing that does match is '^IT\d{9,}', but that is not what you need.  If no one else offers a solution here, I hope you will open an incident with the Teradata supprot center.

Enthusiast

Re: Teradata 15.10 REGEXP_SIMILAR Expression Not Matching As Expected

Thank you, GJ, for at least confirming that I'm not going crazy.

Teradata Employee

Re: Teradata 15.10 REGEXP_SIMILAR Expression Not Matching As Expected

Hi,

 

Seems it's an issue with regexp_similar, works as intended with regexp_substr :

 

REGEXP_SUBSTR('IT99999999990','^IT(\d)\1{9,}',1,1,'i')

 

(that you can use as workaround with the is (not) null predicate)

 

Seems there's no incident opened about this issue. If possible, please open one.

Enthusiast

Re: Teradata 15.10 REGEXP_SIMILAR Expression Not Matching As Expected

Hi Guys,

 

I'd be happy to open an Incident to report this issue but I seem to be unable to do so. While I am signed in on the Teradata Access Portal so that I have the option to "Create an Incident", when I select it I get popped back to the same homepage again and the URL includes "userNotFoundException%3Dtrue". Selecting the drop-down menu for my user in the top-right of the screen just gives me the option to "Sign out", so this is making me think there is an issue with my user account that I created yesterday and which seems to work fine here.

Teradata Employee

Re: Teradata 15.10 REGEXP_SIMILAR Expression Not Matching As Expected

You need a Tays account to open an incident, it's a special access within access.teradata.com

 

Either your customer has a tays access with its account and he can create the incident

or From https://access.teradata.com ,use the “create account” option at top-right. Select the “need TaYS” option, you'll need to enter your customer number.(ask your customer to get it)

 

The label "create account" is a bit misleading, in fact you can reenter your existing account and ask special Tays access.

Teradata Employee

Re: Teradata 15.10 REGEXP_SIMILAR Expression Not Matching As Expected

Not a bug.

REGEXP_SIMILAR tests if the entire string is "similar", i.e. the pattern is implicitly treated as if it starts with ^ and ends with $.

Your test data has a trailing 0 so does not match. Using something like 'IT(\d)\1{9,}.*' as the pattern would work.

Enthusiast

Re: Teradata 15.10 REGEXP_SIMILAR Expression Not Matching As Expected

Hi Fred,

 

Thank you for the update and clarification. I have tried updating the statement as recommended but it seems that Teradata is still not matching the pattern:

 

SELECT REGEXP_SIMILAR('IT99999999990','^IT(\d)\1{9,}.*','i');

 

Do you get the same result, or is this matching for you? If I remove the trailing zero from the string to check and use the original statement, which should match exactly, the result is still 0 (No Match).

Teradata Employee

Re: Teradata 15.10 REGEXP_SIMILAR Expression Not Matching As Expected

Well, I concur with mpeaston, when I first replied, I did some testing with way simpler patterns (fully matching the source string) and I still confirm backreferences don't work with regexp_similar...

Teradata Employee

Re: Teradata 15.10 REGEXP_SIMILAR Expression Not Matching As Expected

Yes, the corrected pattern works (returns 1) for me on TD15.10.0.7 (with or without the redundant ^ at the start).

 

Is it possible you have a SYSLIB.REGEXP_SIMILAR user-defined function which behaves differently from the Teradata-supplied (and supported) TD_SYSFNLIB.REGEXP_SIMILAR? The session default database and SYSLIB are searched before TD_SYSFNLIB.

 

You can also use REGEXP_INSTR to test if a string "contains" a pattern versus "matches" the pattern.