First, I would like to say thanks for all the help I've got from this forum, even without writing and asking specific questions. :)
But now it seems I have to ask a question, because I could not cope with the problem that I have.
Here it is:
I have a unicode varchar column in a table, which has some characters in it which I don't need and I have to remove them.
What I need to extract from this column are only the numerics (1, 2, 3, 4, 5, 6, 7, 8, 9, 0).
The removal of these characters should be in the SELECT statement.
So far, I've tried with CASE statemetns, with the use of SUBSTR, POSITION, CHARACTAR_LENGHT and so on functions but to no avail. Mainly, because ot the diversity of the characters in the field which resulted in a huge CASE statement. Even that was not enough, there were more characters that I could not fetch with the CASE statement.
What I know I must do is, to write a LOOP of somekind which will cleanse the column. But unfortunately, my knowlegde is not that comprehensive yet. :(
So, I am asking for help here.
Here are exapmles of the problematic data in the column, which I know of so far:
1,30823E+12 - has to be 130823
and finally, there are some UNICODE characters ("Ю" and "Е"which have to stay unchanged:
Where is the E?
This is a UDF task - check if one of
can help - thanks to ebay again...
The "1,30823E+12", has to be "130823", because the true data in the field has been lost and what needs to be done is just to remove the "E+12" and the "," from the string. That is the goal, I don't know why is like this. :)
The unicode character "Е" is always at the end of the string - 030099458Е
And finaly, I came across these UDF's, but there is an issue of their implementation, due to company policies. There's nothing that can be done about it. :(
Thanks for the answer!
To be honest - you will not find an elegant solution for this using plain SQL.
Either doing the cleansing outside with an ETL tool or using and specific UDF would be good but it seems it is out of your range.
So the big Case route might be your way
If it would be only the numberic requirement it could be easily generated a case where you check for each position if it is numeric but the E+12 and E and and and rules will make it more difficult...
I was afraid that someone would say that. :)
Anyway, I'll continue tinkering with the issue and hope, that I'll find decent solution, if not perfect.
If so, I'll post it on the forum, just in case. Maybe someone else will stumble upon a similar problem.
Thank you again for your time and answers Ulrich.
Currently we have a function definition in DEV which we have to migrate into PROD but we don’t have BTEQ Script (UDF) to execute as this was written someone and left from organization. Would you please help me to run below mentioned CLEANSE FUNCTION on PROD. Thank you
REPLACE FUNCTION SYSLIB.CLEANSE
(P1 VARCHAR(500) CHARACTER SET LATIN,
RETURNS VARCHAR(500) CHARACTER SET LATIN
PARAMETER STYLE SQL
CALLED ON NULL INPUT
EXTERNAL NAME 'CS!cleanse2!*****************(local system path)\cleanse2.c!F!cleanse2'
You will need to distribute the executable file to all the nodes on the target system and change the EXTERNAL NAME string to indicate where the object resides.
Download the UDF packaging tool from Developer Exchange, and see the topic on "Distributing Packages" in the SQL External Routine Programming manual.
Sorry for taking so long, but I just found this thread. The CLEANSE function referred to above appears to be mine. If you still need the source code, write to me directly at grommel [at] cpan [dot] org. It works only on LATIN strings but can easily be modified to work on Unicode as well. Thanks.