Different text formating to decimal

Database
Enthusiast

Different text formating to decimal

We recently started with a teradata database, before TD we worked mainly with MS-sqlserver and Oracle databases. Now I have a small challenge, so it looked like

We receive often small or medium sized datasets to upload for cases we have to resolve one of our databases. The data format is never the same, the challenge lays in format string for a US-English and European data (number) formats.

US/English are delivering number string usually; 987,654,321.01

European clients differ in their stile; 987.654.321,01

and also the lazy ones ; 987654321,01 / 987654321.01

with or without leading zero or spaces. You name it we recieve it!

A slight difference, slowly a head deck for me. US/ENG use the comma as thousand separator and the dot as decimal separator. In Europa it usually reversed.

I’m trying to load my example data by BTEQ (I know SQL assistant proved also an import option, but some datasets are just a little to big)

I’m using BTEQ  in this case from desktop windows pc. I know larger load can be done by mload, that is my next challenge and not to speak of repeating everything on the Unix platform J

Data string_example.txt

------------------------------------------------------------------------------------------

REC_ID|OPS_US1|OPS_US2|OPS_EU1|OPS_EU2|OPS|AREACODE

1|2,339.23|2339.23|2.339,23|2339,23|two

  or

REC_ID|OPS_US1|OPS_US2|OPS_EU1|OPS_EU2|OPS|AREACODE

2|32,734.952|32734.952|32.734,952|32734,952|three

  or

REC_ID|OPS_US1|OPS_US2|OPS_EU1|OPS_EU2|OPS|AREACODE

3|1.1|1.1|1,1|1,1|one

  or

REC_ID|OPS_US1|OPS_US2|OPS_EU1|OPS_EU2|OPS|AREACODE

4|1.814|1814|1,814|1814|zero

  or

REC_ID|OPS_US1|OPS_US2|OPS_EU1|OPS_EU2|OPS|AREACODE

5|001.814|001814|001,814|001814|zero_zero

  or

REC_ID|OPS_US1|OPS_US2|OPS_EU1|OPS_EU2|OPS|AREACODE

6|  1.814|  1814|  1,814|  1814|space_zero

To demonstrate how data arrives, all kind of examples above. Usually only one of the examples applies :-)

Created a table:

------------------------------------------------------------------------------------------

CREATE SET TABLE SANDBOX.CASEDATA,

     NO FALLBACK,

     NO BEFORE JOURNAL,

     NO AFTER JOURNAL,

     CHECKSUM = DEFAULT,

     DEFAULT MERGEBLOCKRATIO

   ( REC_ID   INTEGER not null,

     OPS_US1  DECIMAL(10,3),

     OPS_US2  DECIMAL(10,3),

     OPS_EU1  DECIMAL(10,3),

     OPS_EU2  DECIMAL(10,3),

        AREACODE INTEGER VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC,

CONSTRAINT pk_CASEDATA PRIMARY KEY ( REC_ID ) );

Created a BTEQ script (that currently is not working), can’t find right formatting :-(

------------------------------------------------------------------------------------------

/* leave first line empty to prevent log-date entry*/

.LOGON xx.xx.xx.xx/user,pwd;

DATABASE SANDBOX;

.SET TITLEDASHES OFF;

.IMPORT VARTEXT '|' FILE='\serv.local\ifs\ string_example.txt' skip=1;

.QUIET ON;

.REPEAT * PACK 10

USING REC_ID          (VARCHAR(25)),

      OPS_US1         (VARCHAR(25)),

      OPS_US2         (VARCHAR(25)),

      OPS_EU1         (VARCHAR(25)),

      OPS_EU2         (VARCHAR(25)),

      AREACODE        (VARCHAR(25))

INSERT INTO SANDBOX.CASEDATA values (

        :REC_ID,

        cast(:OPS_US1 as decimal(10,3) format 'xxxxxxxxxxxxxxxxxxxxx'),

        cast(:OPS_US2 as decimal(10,3) format 'xxxxxxxxxxxxxxxxxxxxx'),

        cast(:OPS_EU1 as decimal(10,3) format 'xxxxxxxxxxxxxxxxxxxxx'),

        cast(:OPS_EU2 as decimal(10,3) format 'xxxxxxxxxxxxxxxxxxxxx'),

        :AREACODE );

.QUIET OFF;

.LOGOFF;

.EXIT;

I found out the hard way that with USING, I always use CHAR or VARCHAR. But currently I’m can’t find the right format string, search the forum and internet,  tried RTFM, but until now not successful..

Whom can help (will) me?

THX!

2 REPLIES
Junior Contributor

Re: Different text formating to decimal

Leading/trailing blanks/zeroes and thousand separators are never a problem, but the wrong (i.e. european) separators.

What's your TD release?

When you have the oTRANSLATE function installed it's easy to switch european to US before the cast:

CAST(oTRANSLATE(:OPS_EU1, ',.', '.,') AS DECIMAL(10,3)) 

Or if you're shure the fractional part has always the correct number of digits you might simply remove all blanks and separators:

CAST(oTRANSLATE (:OPS_EU1, ' ,.', '') AS DEC(13,3)) / 1000

Dieter

Enthusiast

Re: Different text formating to decimal

We are currently still using V13.10.05.03 so the oTRANSLATE function is not present yet. The fractional part * is often different number of digits.

For now I have to find a solution in string manipulation or make a UDF and there's always the manipulation in the source before uploading.

* Keep in mind that these kind of datasets are in adhoc fase of the cases, and usually provided by non IT staff (business) we get, demand, a more structured delivery when a case needs to go to a production fase.

thanks for your attention.