TPT Extract - Newline character in data causes number of records mismatch

Tools
Enthusiast

TPT Extract - Newline character in data causes number of records mismatch

I have written a script which will generate tpt export script based on the parameters passed.

Since I am generating this script it has to be generic in nature it works for all kinds of table and data type.

I am successful in doing this. In order to remove the new line character  I am using oreplace function in my select query of tpt for all the varchar columns which is making this query very costly.

Is there a way I can handle newline character without this oreplace function.

Please let me know if we have any other work for this.

USING CHARACTER SET ASCII

DEFINE JOB ec_rgtry_rlatnp_dat_extract

DESCRIPTION 'export ec_rgtry_rlatnp_dat_Job'

(

  DEFINE SCHEMA SCHEMA_ec_rgtry_rlatnp_dat_extract (

RGTRY_RLATNP_ID VARCHAR(40),

RGTRY_RLATNP_ASOCN_TYPE_CD VARCHAR(4),

SOR_ID VARCHAR(10),

ENT_CUST_ID VARCHAR(38)

);

DEFINE OPERATOR EXPORT_OPERATOR

TYPE EXPORT

SCHEMA SCHEMA_ec_rgtry_rlatnp_dat_extract

ATTRIBUTES (

UserName='XXXX',

UserPassword='YYYYY',

TdpId='DB',

MaxSessions=1,

MinSessions=1,

SpoolMode='NoSpool',

VARCHAR DateForm = 'ANSIDATE',

SelectStmt = 'SELECT

 trim((OREPLACE(RGTRY_RLATNP_ID,x''0A'','''')) (VARCHAR(40))),

trim((OREPLACE(RGTRY_RLATNP_ASOCN_TYPE_CD,x''0A'','''')) (VARCHAR(4))),

TRIM(SOR_ID),

trim(((ENT_CUST_ID) (BIGINT) (VARCHAR(20))))

FROM DB.EC_RGTRY_RLATNP_PT

;',

VARCHAR ReportModeOn

);

DEFINE OPERATOR FILE_WRITER_ec_rgtry_rlatnp_dat_extract

TYPE DATACONNECTOR CONSUMER

SCHEMA SCHEMA_ec_rgtry_rlatnp_dat_extract

ATTRIBUTES (

FileName='20140505104506_abc.dat',

Format='DELIMITED',

TextDelimiter='|',

IndicatorMode='N',

OpenMode='Write'

);

APPLY TO OPERATOR (FILE_WRITER_ec_rgtry_rlatnp_dat_extract[1])

SELECT

RGTRY_RLATNP_ID ,

RGTRY_RLATNP_ASOCN_TYPE_CD ,

SOR_ID ,

ENT_CUST_ID

FROM OPERATOR (EXPORT_OPERATOR[1]);

);

3 REPLIES
Enthusiast

Re: TPT Extract - Newline character in data causes number of records mismatch

TPT Version Used:

Teradata Parallel Transporter DataConnector Version 13.00.00.12

Teradata Parallel Transporter Export Operator Version 13.00.00.07

Enthusiast

Re: TPT Extract - Newline character in data causes number of records mismatch

Can someone please respond to this if you know a resolution for this problem?

This will very helpful for me if we can solve ouside of query.

Enthusiast

Re: TPT Extract - Newline character in data causes number of records mismatch

I don't know if this is of any help, but I have done something similar on unix using awk to format data copied from excel and pasted into a text file.  If you wrote the file to unix and used an OS command then maybe this would be more efficient.


If you have an excel file that you need in a flat tab delimited file  you can just select the data, copy and then paste it into a text file in ultraedit (or some other editor).

Sometimes though people will have put return characters in some of the cells in excel which can mess things up, as it splits rows over multiple lines.

This piece of awk will fix this by counting the number of fields in the first row, then checking subsequent rows to see if they have enough records.  If they don’t then it concatenates the row with the next one, and checks the number of fields again in case the next one needs to be concatenated as well.  The return characters end up being replaced by spaces.

To run this create a file ‘fixfile.awk’ containing the code below and then run the command:

nawk -f fixfile.awk myfile_tmp.txt > myfile.txt

Disclaimer - This code is supplied with no guarantees that it will work on your files!

BEGIN{

FS="\t";

recordcount=0;

continueline=0;

currentfields=0;

}

{

        if ( NR == 1 )

        {

                recordcount=NF;

printf("%s\n",$0);

        }

        else

        {

                if ( continueline == 0 && NF == recordcount     )

                {

                        printf("%s\n",$0);

                        currentfields=0;

                        continueline=0;

                }

                else if ( continueline == 0 && NF < recordcount )

                {

                        printf("%s ",$0);

                        currentfields+=NF;

                        continueline=1;

                }

                else if( continueline == 1 )

                {

                        printf("%s ",$0);

                        currentfields+=NF;

                        currentfields--;

                        if( currentfields < recordcount)

                        {

                                continueline=1;

                        }

                        else

                        {

                                continueline=0;

                                currentfields=0;

                                printf("\n");

                        }

                }

        }

}