How to calculate a character occurance in a string in teradata

Database
Enthusiast

How to calculate a character occurance in a string in teradata

Hi ,

I have a column having values

1       1/2/3/4/5/6/7/8/9/10

2       1/2/3/4/5

3       1/2/3/4/5/6

4       1/2/3/4/5/6/7/8

I have to search for the character '/' and if count of the character is equal to 6 or more than 6,

then delete from 6th position of the record..

Result :

1       1/2/3/4/5/6

2       1/2/3/4/5

3       1/2/3/4/5/6

4       1/2/3/4/5/6

How can I achieve this ?

Thanks,

Jitu




7 REPLIES
Senior Supporter

Re: How to calculate a character occurance in a string in teradata

which version are you on?

Enthusiast

Re: How to calculate a character occurance in a string in teradata

Hi 

If you are using TD 14 or higher then you can use below code snippet as reference, where column C1 will hold the intended value.

WITH RECURSIVE SPLIT_DATA(MAIN , SUB , COUNTS )

AS

(

SEL C1 AS MAIN, C1 AS SUB, 0 AS COUNTS

FROM TABLES

UNION ALL

SEL DIRECT.MAIN AS MAIN, SUBSTR(DIRECT.SUB,INDEX(DIRECT.SUB, '/')+1, LENGTH(DIRECT.SUB) - INDEX(DIRECT.SUB, '/')  ) AS SUB, DIRECT.COUNTS+1 AS COUNTS

FROM SPLIT_DATA DIRECT

WHERE INDEX(DIRECT.SUB, '/') <> 0

)

SELECT CASE WHEN COUNTS < 6 THEN MAIN ELSE SUBSTR(MAIN,1,11) END AS MAIN FROM SPLIT_DATA

QUALIFY ROW_NUMBER() OVER(PARTITION BY MAIN ORDER BY COUNTS DESC) = 1

;

Thanks

Santanu

Enthusiast

Re: How to calculate a character occurance in a string in teradata

If you have strtok function then you can think of  doing something like this:

select case when strtok('12/45/23/67/90/32/41/56/435/10','/',6) is not null then

strtok('12/45/23/67/90/32/41/56/435/10','/',1)||'/'|| strtok('12/45/23/67/90/32/41/56/435/10','/',2)||'/'||

strtok('12/45/23/67/90/32/41/56/435/10','/',3)||'/'||strtok('12/45/23/67/90/32/41/56/435/10','/',4).................................................... end

........

You can think of using char_length, substr function too depending on your data

Enthusiast

Re: How to calculate a character occurance in a string in teradata

Hi,

I have used this query as below.

sel col1,col2,Case when cast(CHAR_LENGTH(col2) - CHAR_LENGTH(OTRANSLATE(col2, '/',''))AS INTEGER) >= 6 

then SUBSTR(col2, 0, CAST(INSTR( col2,'/',1,6) AS INTEGER)) 

   Else col2end  from table

It was working fine.. But now my delimiter changed from '/' to '~!'.

Can you help me on this ?

Thanks,

Jitu

Re: How to calculate a character occurance in a string in teradata

It's interesting to use a Regular Expression to solve the problem.  My query does more than the original question, and of course it could be eaisly simplified.

The query below replaces a set of potential delimeters with a pipe.  It also removes consecutive delimeters with nothing between.  If an ending delimeter exists, it is removed.  Next it removes all characters that are not a delimeter.  We can then then count the length of the resulting string.

I wonder if the compiler/optimizer compiles all three into one expression, before executing?  And, I wonder if all three regex could be written into just one statement?  Also, I haven't found the correct syntax for escaping a single quote, so the single quote could be an accepted delimeter.

SELECT

    REGEXP_REPLACE( /* Removes all but pipe */

        REGEXP_REPLACE( /* Removes ending delimeters. */

                        REGEXP_REPLACE    ( /* Converts all delimeters to pipes, and removes duplicate consectuive delimeters.  

                                                Need to remove terminal delimeter. */

                                                '1/2/3/4/5/6/7/8/9/10'

                                                ,'[ /%@.,\-\/\x27#()]+'

                                                ,'|'

                                                ,1

                                                ,0

                                                ,'i'

                                            )

                        , '[|]$'

                        , ''

                        ,1

                        ,0

                        ,'i'

                    )

            , '[^|]'

            , ''

            ,1

            ,0

            ,'i'

        ) AS Pipes_Only

        , LENGTH( Pipes_Only )+1 AS Number_of_Words

    ;

Enthusiast

Re: How to calculate a character occurance in a string in teradata

sel char_length (trim(val)),
 otranslate( val,'/!@#*$%^&',''),
 case when
 char_length (trim(val))-char_length
 (trim (otranslate (val ,'/!@#*$%^&','')))>=6 then  substr( val , 0, instr (otranslate (val ,'/!@#*$%^&','/'),'/',1,6))
 else val end as res
 from  regexp

hi Depook,

we can use multiple symbol latters in Otranslate function for the search in a Column  like REGEXP_replcae function

 please fine above sql. hope it will help you.

Regards

Arparmar

Enthusiast

Re: How to calculate a character occurance in a string in teradata

Hi Deepok/Jitu,

I have replicated your scenario and tried out a solution and am able to get the desired output.

1. I have created a table (tmp_substr) with 2 columns (Str1, Str2) Str1 holds the values of your first scenario (with / as delimiter) and Str2 holds value for your second scenario (/ replaced with ~!)

Str1                                    Str2

1/2/3/4/5                          1~!2~!3~!4~!5

1/2/3/4/5/6                       1~!2~!3~!4~!5~!6

1/2/3/4/5/6/7/8                 1~!2~!3~!4~!5~!6~!7~!8

1/2/3/4/5/6/7/8/9/10         1~!2~!3~!4~!5~!6~!7~!8~!9~!10

2. First Scenario (Col Str1) - (Handling only /)

I used regular expressions to solve this, first i find the 6th occurence of / using regexp_instr , if there are less than 6 occurence the whole string is displayed, if there are more than or equal to six occurence then the position until the 6th occurence is substringed.

Query for First Scenario :  (Col Str1) - (Handling only /)

select substr(str1,0, case when regexp_instr(str1,'/',1,6,1,'c') =0 then  length(str1)+1   else regexp_instr(str1,'/',1,6,1,'c')-1 end) as Output from tmp_substr ; 

Output : 

Str1                           Output

1/2/3/4/5                     1/2/3/4/5

1/2/3/4/5/6                  1/2/3/4/5/6

1/2/3/4/5/6/7/8            1/2/3/4/5/6

1/2/3/4/5/6/7/8/9/10    1/2/3/4/5/6

3. Second Scenario :  (Col Str2) - (Replacing ~! to / then  Handling /)

Here we replace ~! to / using regexp_replace and then compute the same function as in step 2.

Query for Second Scenario :  (Col Str2) - (Replacing ~! to / then  Handling /)

select str2,substr(regexp_replace(str2,'(["~!"]{2})','/',1,0,'i'),0,case when regexp_instr(regexp_replace(str2,'(["~!"]{2})','/',1,0,'i'),'/',1,6,1,'c') =0 then  length(regexp_replace(str2,'(["~!"]{2})','/',1,0,'i'))+1 else  regexp_instr(regexp_replace(str2,'(["~!"]{2})','/',1,0,'i'),'/',1,6,1,'c')-1 end) from tmp_substr ;

Output : 

Str2                                                        Output

1~!2~!3~!4~!5                                      1/2/3/4/5

1~!2~!3~!4~!5~!6                                1/2/3/4/5/6

1~!2~!3~!4~!5~!6~!7~!8                    1/2/3/4/5/6

1~!2~!3~!4~!5~!6~!7~!8~!9~!10       1/2/3/4/5/6

You can use the query 2 for handling your scenario (replace str2 with your column name and tmp_substr with your tablename), please check and tell if this worked for you.

Thanks & Regards,

Srivignesh KN