Does anyone have an example of a TPT script with format = text?
I'm trying to load a file with variable length records into a table with a single varchar(1000) column. I tried basing the script on the qstart1 example and just changing 'delimited' to 'text', but I get an error:
,DDLPrivateLogName = 'ddlprivate.log'
,LoadPrivateLogName = 'loadprivate.log'
,TargetErrorList = ['3807']
,WorkingDatabase = 'dev_sandpit'
,TargetWorkingDatabase = 'dev_sandpit'
,TargetTable = 'RAW_CODA_LOAD'
,LogTable = 'RAW_CODA_LOAD_LOG'
,ErrorTable1 = 'RAW_CODA_LOAD_E1'
,ErrorTable2 = 'RAW_CODA_LOAD_E2'
,SourceFileName = 'coda_mil2.692130.txt'
,SourceFormat = 'text'
,OpenMode = 'read'
,DropLogTable = 'Yes'
The code is:
DEFINE JOB qstartcoda2
APPLY $INSERT TO OPERATOR ($LOAD)
SELECT * FROM OPERATOR($FILE_READER);
DEFINE JOB qstartcoda1
DESCRIPTION 'LOAD CODA TABLE'
DEFINE SCHEMA CODA_SCHEMA
DESCRIPTION 'CODA INFORMATION'
DEFINE OPERATOR LOAD_OPERATOR
DESCRIPTION 'TERADATA PARALLEL TRANSPORTER LOAD OPERATOR'
VARCHAR TdpId = @TargetTdpId,
VARCHAR UserName = @TargetUserName,
VARCHAR UserPassword = @TargetUserPassword,
VARCHAR TargetTable = @LoadTargetTable,
VARCHAR LogTable = @LoadLogTable,
VARCHAR ErrorTable1 = @LoadErrorTable1,
VARCHAR ErrorTable2 = @LoadErrorTable2
DEFINE OPERATOR DATACONN
DESCRIPTION 'TERADATA PARALLEL TRANSPORTER DATACONNECTOR OPERATOR'
TYPE DATACONNECTOR PRODUCER
VARCHAR FileName = @SourceFileName,
VARCHAR Format = @SourceFormat
APPLY ('INSERT INTO '|| @LoadTargetTable ||' (:CODADATA);')
TO OPERATOR (LOAD_OPERATOR )
SELECT * FROM OPERATOR (DATACONN);
'Text' = character data separated by an end-of-record (EOR) marker. The
EOR marker can be either a single-byte linefeed (X'0A') or a double-byte
carriage-return/line-feed pair (X'0D0A'), as defined by the first EOR
marker encountered for the first record.
Sorry if this is a basic query, but I'm new to TPT and can't find this information or any examples in the user guide or reference manuals.
Use "delimited" (even though there is only one column, so the delimiter does not actually appear in the input records).
The "text" format is only used when all input fields have fixed lengths.
I did end up using delimited with a delimiter string of 'jkajdnsksklfnsklslkdmjsks'. I'm hoping that never occurs in the file!
It doesn't seem like the right way to do it though!
If each record has a single column, then you do not need a delimiter.
Curious: why did you pick such a large delimiter?
I don't want to choose a delimiter! However if I choose an empty delimiter by using:
SourceTextDelimiter = ''
Then TPT uses the default delimiter of '|' which does exist in the file.
If I leave the SourceTextDelimiter parameter out then, again TPT uses the default demiliter.
The reason I chose such a long delimiter is because I need a delimiter that will never exist in the file. If you can suggect a more sensible way that I can treat the file as a single column of variable length text then that would be very useful. The way that I'm doing it doesn't seem to be the best way of achieving this.
I am not sure whether a shorter or longer delimiter will affect performance, but I thought if someone did not know what delimiter they should pick, they would choose something with punctuation marks, like "??", "???, "?!?".
Nothing wrong with the way you are doing this, I was just curious.
What version of TPT are you using?
Can you put the single column data in quotes?
If so, then the processing will ignore anything in the quotes that might be considered the delimiter.
I'm using version 15.0 (I've only recently downloaded it)
I'm not sure if I can put quotes round it. There could be quotes within the file, and they may be unmatched ones - it's a bit of a strange data format we don't really have any control over.
I also don't really want to have to pre-process the file on the operating system before I load it. We will be loading files which are used by an existing system from directory every 2 minutes.
What I really want is to always treat each line as a single column. Is it possible to set the field delimiter to be the unix end of line character, i.e. the same character as the record delimiter?
Will your data have tab characters?
You can specify the delimiter as:
VARCHAR TextDelimiter = 'TAB'
(Yes, I know; strange)
If not, we have an undocumented features (undocumented because it is going to change in the near future) where you can specify the delimtier characters in hex format. This uses a different attribute:
VARCHAR TextDelimiterHex = '0a0d'
But like I said, we will be changing the feature in the near future, so I do not want you to get tied into this. But you can try it to see if it works.
Currently we have option to specify the column delimiter in TPT script but "\n" as default row delimiter. There are few records with column data with "\n".
I want to verify whether we have option to change the row delimiter value and make it parameter. i.e. change to something else.
Current setup -
VARCHAR Format = 'DELIMITED',
VARCHAR TextDelimiter = @MyDelimiter,
Appreciate your response.