how to make TPT script ignore bad record

Tools
Enthusiast

how to make TPT script ignore bad record

I have a script use tbuild to load the flat files to TD.

It works fine when all fields of record are satisfying the column length.

 

However, for certain records, i received following error and the script aborted:

 

*** < DATA ERROR **************************************
!ERROR! Fatal data error processing file 'removed_filename'. Delimited Data Parsing error: Column length overflow(s) in row 4325.
Closing file: 'Data File'
Detaching default Access Module
Files read by this instance: 0

 

I know the error is because one fileds is longer then the table column in TD.

I want to skip this row and continue the loading.

 

I tried FileReaderErrorLimit=100 and LoadErrorLimit=100 but it doesn't seem work.

So my question is:

 

Is there a parameter can let the TPT continue by ignoring the bad record?

 


Accepted Solutions
Teradata Employee

Re: how to make TPT script ignore bad record

A data parsing error is not something that we can guarantee we can recover from.

Therefore, this is a terminating error.

Otherwise, while parsing what we believe to be an invliad character, we might end up parsing across field or record boundaries, and end up loading incorrect data.

 

Data integrity is more important to us over trying to recover from a parsing error.

 

 

-- SteveF
1 ACCEPTED SOLUTION
11 REPLIES
Senior Apprentice

Re: how to make TPT script ignore bad record

Have a look at the TruncateColumnData option.
Enthusiast

Re: how to make TPT script ignore bad record

Hi dnoeth,

 

Actually I am searching on google and found same TruncateColumnData.

I have tried to set TruncateColumnData="'YesWithoutLog'"

or TruncateColumnData="'Y'"

But seems TPT is not recognizing this parameter.

The log shows 

 

32-bit DataConnector operator for SunOS release 5.10 ***
TDICU................................... 14.10.00.04
PXICU................................... 14.10.00.10
PMPROCS................................. 14.10.00.11
PMRWFMT................................. 14.10.00.03
PMTRCE.................................. 13.00.00.02
PMMM.................................... 14.10.00.02
DCUDDI.................................. 14.10.00.59
PMHEXDMP................................ 14.10.00.01
PMUNXDSK................................ 14.10.00.13

QuotedData: 'NO (defaulted)' (=0)
AcceptEOFasEOR: 'YES (defaulted)' (=1)
AppendDelimiter: 'NO (defaulted)' (=0)
TextDelimiter = '^A' [1 byte(s) x'01']
WARNING! EscapeTextDelimiter attribute not specified, there is no default
WARNING! OpenQuoteMark attribute not specified, there is no default
WARNING! CloseQuoteMark attribute not specified, there is no default
WARNING! EscapeQuoteDelimiter attribute not specified, there is no default
WARNING! TrimChar attribute not specified, there is no default
UseGeneralUDDIcase: 'NO (defaulted)' (=0)
UDDI_ParsingCase=1100000
WriteBOM: 'NO (defaulted)' (=0)
AcceptExcessColumns: 'NO (defaulted)' (=0)
AcceptMissingColumns: 'NO (defaulted)' (=0)
TruncateColumnData: 'NO (defaulted)' (=0)
TruncateColumns: 'NO (defaulted)' (=0)
TruncateLongCols: 'NO (defaulted)' (=0)
WARNING! Definition of attribute ErrorLimit ignored
WARNING! RecordErrorFilePrefix attribute not specified, there is no default
RecordErrorVerbosity: OFF (default) (=0)
FileName: '/export/home/pp_adm/inbound/stc/2016090910/filelist'
OpenMode: 'READ' (1)
Format: 'DELIMITED' (1)
IOBufferSize: 131072

 

Not only this parameter is not recognized, several others are not respected too. such as RowErrFileName, RecordErrorFileName (trying just incase version of TPT), ErrorList.

 

This is my parameter file:

 

NoOfLoadInstances=6
NoOfReadInstances=3
SourceTextDelimiter="'^A'"

SourceFileList="'Y'"
TargetWorkingDatabase="'working_table'"
TargetTable="'target_table'"
SourceFileName="'somwhere/2016090910/filelist'"
SourceFormat="'DELIMITED'"
OpenMode="'Read'"
FileReaderSkipRows=0
AcceptExcessColumns="'Y'"
AcceptMissingColumns="'Y'"
LoadErrorLimit=1000
FileReaderErrorLimit=1000
TargetErrorLimit=10
ErrorList = "'19134'"
RecordErrorFileName="'/tmp/test1'"
RowErrFileName="'/tmp/test1'"
TruncateColumnData="'YesWithoutLog'"
TruncateColumns="'Y'"
TruncateLongCols="'Y'"

 

Senior Apprentice

Re: how to make TPT script ignore bad record

Ops, don't know, you better wait for Steve Feinholz :-)
Enthusiast

Re: how to make TPT script ignore bad record

I would guess that this is because the length of the field in the file is longer than you have described in the schema definition in the TPT script. For example, you may have specified the field as VARCHAR(100) and on the file, it exceeds 100 characters. I believe that TPT will stop at that point when it encounters that type of error in the file.

 

This is different from having the size of the field be larger than the column that you're trying to insert into. In that case, TPT can write it to the error table and continue on.

Enthusiast

Re: how to make TPT script ignore bad record

You guess is true but solution is not right.

TPT provides some parameters to continue the loading without abort.

The issue here is these parameters are not working due to some reasons. Need to dig out the reasons,

Teradata Employee

Re: how to make TPT script ignore bad record

Job variable files must have the names of the job variables, not the names of the attributes.

You did not post the script; that might help in the diagnosis as well, but I think the issue is the content of the job variable file.

 

-- SteveF
Enthusiast

Re: how to make TPT script ignore bad record

 

The complete script is here:

tbuild -C -h 128M -f "$template_tpt" -v "$parameter_files"

 

 

$parameter_files have been posted in above post.

In above post, some of the attributes are actually working, such as any attributes above FileReaderSkipRows (included).

$template_tpt is only a template, delete the table and insert the data, it doesn't define the varliables.

 

 

Enthusiast

Re: how to make TPT script ignore bad record

Hi feinholz,

 

Here is the script:

 

tbuild -C -h 128M -f tpt_load_template.tpt \
-v parameter_file  -j job_name

 

 

And this is the content of tpt_load_template.tpt. 

This template didn't specify any variables.

I have to set the attributes in the parameter_file.

Some of them are work while the others are not.

 

DEFINE JOB LOAD_FILE_DATA
DESCRIPTION 'LOAD_DATA_FROM_FILE_TO_TBL'
(
STEP tgt_del
(
APPLY 'DELETE FROM '||@TargetWorkingDatabase||'.'||@TargetTable||' ALL;' TO OPERATOR ($DDL);
);
STEP load_data
(
APPLY $INSERT TO OPERATOR ( $LOAD[@NoOfLoadInstances] )
SELECT * FROM OPERATOR ( $FILE_READER[@NoOfReadInstances] );
);
);

 

 

Teradata Employee

Re: how to make TPT script ignore bad record

A couple of things.

 

First, do not use -C on the command line when loading to Teradata. That will cause buffers of rows to be distributed in a round-robin fashion to the Load operator instances and will actually hurt performance.

 

Next, as I indicated before, the job variable file contains operator attribute names for some of the variables, and not the job variable names. The job variable file must contain the names of the job variables on the left hand side of the "=" sign.

 

Those variables in the job variable file that begin with "Source" or "Target" or the name of the operator itself, like "Load" and "FileReader" will be processed correctly. The rest will be ignored.

 

For example, these will not be processed:

 

OpenMode="'Read'"
AcceptExcessColumns="'Y'"
AcceptMissingColumns="'Y'"
ErrorList = "'19134'"
RecordErrorFileName="'/tmp/test1'"
RowErrFileName="'/tmp/test1'"
TruncateColumnData="'YesWithoutLog'"
TruncateColumns="'Y'"
TruncateLongCols="'Y'"

 

If you are not sure what the proper names are, please look at the template files themselves in the "templates" directory where TPT is installed.

 

Lastly, I see that you put "19134" for a value of "ErrorList". The ErrorList feature is only for DBS errors, not TPT errors.

 

-- SteveF