Need Help on using TPT with UNICODE characters

Tools & Utilities
Enthusiast


Hi, I want to use TPT in one of my projects and I am facing the following issue.

(I have read http://developer.teradata.com/tools/articles/teradata-parallel-transporter-unicode-usage#comment-175... and am doing the things mentioned in that article, but something is still missing. Here are the details.)

table DDL

--------

Id DECIMAL(18,0) TITLE 'Identifier' NOT NULL,

Vendor_Id VARCHAR(10) CHARACTER SET LATIN CASESPECIFIC TITLE 'Vendor Identifier' NOT NULL,

Name VARCHAR(4000) CHARACTER SET UNICODE NOT CASESPECIFIC TITLE 'Name',

Content_Name VARCHAR(600) CHARACTER SET UNICODE NOT CASESPECIFIC FORMAT 'X(300)' TITLE 'Content Name',

In the TPT DEFINE SCHEMA, all columns are declared as an appropriate VARCHAR(n).

In the EXPORT_OPERATOR SELECT statement, all four fields are cast to VARCHAR(n).

I am using the FILE_WRITER operator to write the data into a delimited file.
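For reference, here is a simplified sketch of the relevant parts of the setup (an approximation, not the exact script: the logon attributes, delimiter, file name, table name, and the cast size for Id are placeholders):

DEFINE JOB MOVE_DATA_TO_FLAT_FILE
(
  DEFINE SCHEMA Data_Schema
  (
    Id           VARCHAR(20),
    Vendor_Id    VARCHAR(10),
    Name         VARCHAR(4000),
    Content_Name VARCHAR(600)
  );

  DEFINE OPERATOR EXPORT_OPERATOR
  TYPE EXPORT
  SCHEMA Data_Schema
  ATTRIBUTES
  (
    VARCHAR TdpId        = @TdpId,
    VARCHAR UserName     = @UserName,
    VARCHAR UserPassword = @UserPassword,
    VARCHAR SelectStmt   = 'SELECT CAST(Id AS VARCHAR(20)), CAST(Vendor_Id AS VARCHAR(10)), CAST(Name AS VARCHAR(4000)), CAST(Content_Name AS VARCHAR(600)) FROM MyDb.MyTable;'
  );

  DEFINE OPERATOR FILE_WRITER
  TYPE DATACONNECTOR CONSUMER
  SCHEMA *
  ATTRIBUTES
  (
    VARCHAR FileName      = 'output.txt',
    VARCHAR Format        = 'Delimited',
    VARCHAR TextDelimiter = '|',
    VARCHAR OpenMode      = 'Write'
  );

  APPLY TO OPERATOR (FILE_WRITER)
  SELECT * FROM OPERATOR (EXPORT_OPERATOR);
);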

Scenario 1: When I run the script with the following command, the script runs fine, but the output file does not show the Chinese Unicode characters. (The script does not have USING CHAR SET UTF8 before DEFINE JOB MOVE_DATA_TO_FLAT_FILE.)

tbuild -f

Scenario 2: When I put USING CHAR SET UTF8 before DEFINE JOB MOVE_DATA_TO_FLAT_FILE and run it through tbuild -f, it gives me an error: EXPORT_OPERATOR: TPT12108: Output Schema does not match data from SELECT statement

My script is written in the ASCII character set (all English), and I am writing it on an AIX machine. I do not have the Windows script generator tool.

How do I configure my script so that it gives me UTF8 characters in the file?


Re: Need Help on using TPT with UNICODE characters

Hi,

Can anybody tell me about the TPT stage? Is it a separate stage that we can view in DataStage, or is it a UNIX script? I would like to know about the TPT stage for my project.

Teradata Employee

Re: Need Help on using TPT with UNICODE characters

There are 2 things to consider:

1. you cannot write data in delimited format by using the Export operator; you must use the Selector operator and you must set the ReportMode attribute to the correct value

2. you must make sure that the size of the VARCHAR columns is in terms of "bytes", not "characters".

For example, a CHAR(10) CHARACTER SET UNICODE will result in 30 bytes required for the column if the client session character set is set to UTF8.

Thus, in the schema object, you would use CHAR(30) for that column.

The DBS defines sizes in terms of characters.

The client products work in terms of bytes.
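Applied to the two UNICODE columns from the DDL at the top of this thread, the schema fragment would look something like this for a UTF8 client session (a sketch only; the 3x factor is the UTF8 worst case per character, and the rest of the job is unchanged):

USING CHARACTER SET UTF8
DEFINE JOB MOVE_DATA_TO_FLAT_FILE
(
  DEFINE SCHEMA Data_Schema
  (
    Name         VARCHAR(12000),  /* DBS: VARCHAR(4000) CHARACTER SET UNICODE -> 4000 x 3 bytes */
    Content_Name VARCHAR(1800)    /* DBS: VARCHAR(600)  CHARACTER SET UNICODE -> 600 x 3 bytes  */
  );
  /* operator definitions and APPLY statement unchanged */
);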

-- SteveF
Enthusiast

Re: Need Help on using TPT with UNICODE characters

Thanks, the second point you mentioned worked for me. But I still do not understand why EXPORT will not work for delimited format. I am running my script and it works fine. Is there any specific reason why EXPORT should not be used?

I have read that the SELECTOR operator is much slower than the EXPORT operator, and my data export requirement is massive.

Teradata Employee

Re: Need Help on using TPT with UNICODE characters

The Export operator executes the FastExport protocol. That protocol returns the data in binary format, not text. Therefore, we cannot use that operator to write out delimited data, which is text.

The Selector operator can be used to retrieve the data in "report" mode, which retrieves the data in text format, and so can be used to create delimited output.
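A rough sketch of a Selector operator definition set up for report mode (the logon attributes, schema name, and table name are placeholders carried over from earlier in the thread, and the exact ReportMode value should be checked against the documentation for your TTU release):

DEFINE OPERATOR SELECTOR_OPERATOR
TYPE SELECTOR
SCHEMA Data_Schema
ATTRIBUTES
(
  VARCHAR TdpId        = @TdpId,
  VARCHAR UserName     = @UserName,
  VARCHAR UserPassword = @UserPassword,
  VARCHAR ReportMode   = 'Y',   /* return rows as text ("report" mode) rather than binary */
  VARCHAR SelectStmt   = 'SELECT Id, Vendor_Id, Name, Content_Name FROM MyDb.MyTable;'
);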

The only way to use the Export operator is for you (the user) to CAST the SELECT statement so all columns are converted (by the DBS) to VARCHAR. If you want to do that, it is up to you, but the operator cannot do that for you.

-- SteveF
Enthusiast

Re: Need Help on using TPT with UNICODE characters

Thanks again, that probably explains why my script is working with the EXPORT operator (I am casting every column to VARCHAR).

Teradata Employee

Re: Need Help on using TPT with UNICODE characters

Just be careful, because the first 2 bytes might be binary: the normal exported format includes a 2-byte row length.

-- SteveF
Enthusiast

Re: Need Help on using TPT with UNICODE characters

It does not include the first 2 bytes. In fact, one of the reasons for going with TPT was to replace FastExport, which includes these additional two bytes and hence requires additional processing at our end.

Taking a step back,

2. you must make sure that the size of the VARCHAR columns is in terms of "bytes", not "characters".

This worked with the TD 13.10 TPT utility but is not working with 12.0. It again gives me the error:

EXPORT_OPERATOR: aborting due to the following error:

Output Schema does not match data from SELECT statement

Job step MAIN_STEP terminated (status 12)

One more question,

How (if at all) are FastExport sessions different from TPT sessions? I read in the documentation that the number of sessions is limited by the number of AMPs in the system. What is the correlation between sessions and AMPs? I cannot understand how AMPs influence the number of sessions.

Thanks

Teradata Employee

Re: Need Help on using TPT with UNICODE characters

Ok, I will work backwards.

The first thing to understand is that TPT and the older legacy standalone utilities do the exact same thing.

This is called following (and executing) a particular "protocol".

So, the Export operator executes the FastExport protocol. The protocol describes the set of steps and communications between the client and the database. The sessions are the same. The method of data retrieval is the same.

As to AMPs, the special protocols (FastLoad, MultiLoad, FastExport) connect special "data" sessions that run in special partitions that connect directly to the AMPs.

And our utilities have a maximum limit on how many of these special sessions can be connected to the database, and that limit is 1 per AMP.

(Of course, with FastExport, or the TPT Export operator, no one really connects one session per available AMP, but that is the rule.)
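If you want to control how many of these data sessions the Export operator connects, the MaxSessions and MinSessions attributes go in its ATTRIBUTES list; the values here are only placeholders:

  INTEGER MaxSessions = 8,   /* upper bound on data sessions; effectively capped at 1 per AMP */
  INTEGER MinSessions = 1    /* the job will not run if fewer than this can connect */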

Now, back to your schema mismatch problem. To help you further, at a minimum I will need to see your entire script, along with a brief description of what you are trying to do.

-- SteveF
Fan

Re: Need Help on using TPT with UNICODE characters

I was faced with the same situation.

If you solved this problem, could you share the script with me?