TPT UTF8 Export

Teradata Employee


Hi,

 

I recently had a requirement to export data to files in UTF8 format.

 

I've created TPT scripts with the "USING CHARACTER SET UTF8" header and defined the output schema with column lengths double or triple those of the source. After running the job on UNIX and checking the output file, it is reported as ISO-8859 and not UTF8.

 

 

Here's an example of the TPT content

 

USING CHARACTER SET UTF8
DEFINE JOB sample
...
...
DEFINE SCHEMA FILE_OUT
DESCRIPTION 'schema for output file'
(
     COL1 VARCHAR(300),
     COL2 VARCHAR(200),
     ...
)
DEFINE OPERATOR Producer_Query
TYPE EXPORT
SCHEMA FILE_OUT
ATTRIBUTES
(
     VARCHAR UserName=**bleep**,
     VARCHAR Pass=**bleep**,
     VARCHAR SelectStmt='SELECT CAST(COL1 AS VARCHAR(50)), CAST(COL2 AS VARCHAR(50)) ... ... FROM TABLE;'
)

Did I miss anything?

 

Please help. 

 

Thanks!

RA

 

 

3 REPLIES
Teradata Employee

Re: TPT UTF8 Export

How are you determining that the output is ISO-8859? Does the text contain extended characters?

If all the bytes are in the x'00'-x'7F' range, there is no difference between ISO-8859-1 and UTF-8.
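To illustrate the point, here is a minimal Python sketch (not part of TPT; the helper name same_in_both and the sample strings are made up for this example). It compares the bytes produced by the two encodings for the same text.

```python
def same_in_both(text: str) -> bool:
    """Return True if text encodes to identical bytes in UTF-8 and ISO-8859-1."""
    try:
        latin1 = text.encode("iso-8859-1")
    except UnicodeEncodeError:
        # Characters above U+00FF cannot be represented in ISO-8859-1 at all.
        return False
    return text.encode("utf-8") == latin1


# Plain ASCII text: every byte is in the x'00'-x'7F' range in both encodings.
print(same_in_both("COL1|COL2"))

# An extended character such as the section sign encodes differently:
# one byte in ISO-8859-1, two bytes in UTF-8.
print("§".encode("iso-8859-1").hex(), "§".encode("utf-8").hex())
```

So a file that contains only ASCII bytes cannot be told apart by its content; the two encodings only diverge once a byte of x'80' or above appears.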

 

Teradata Employee

Re: TPT UTF8 Export

I'm running the file command in UNIX to get the file character set.

 

The record data is all plain text without any extended characters, which I think is why the exported file is still reported as ISO-8859. The only unusual character is my delimiter "§", which, from what I found, still falls under ISO-8859.
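Since "§" is not an ASCII character, one way to see how it was actually written is to look at the raw bytes of the file. The sketch below is a rough Python check, not the algorithm the file command uses; the function name classify and the three labels are assumptions made for this example.

```python
def classify(data: bytes) -> str:
    """Roughly classify raw bytes as 'ascii', 'utf8', or 'not-utf8'."""
    if all(b < 0x80 for b in data):
        # Only 7-bit bytes: valid as ASCII, UTF-8, and ISO-8859 alike.
        return "ascii"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # High bytes that are not valid UTF-8 sequences, for example a
        # lone x'A7', which is how ISO-8859-1 stores the section sign.
        return "not-utf8"
    return "utf8"


print(classify(b"a|b"))                  # delimiter is plain ASCII
print(classify("a§b".encode("utf-8")))   # delimiter stored as x'C2A7'
print(classify(b"a\xa7b"))               # delimiter stored as single byte x'A7'
```

To try it on a real export, read the file in binary mode, for example classify(open(path, "rb").read()), where path is a placeholder for the output file name.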

 

So, just to confirm (please correct me if I'm wrong): if CHARACTER SET UTF8 is used in the TPT script but the exported data doesn't have extended characters, it will not create the file as UTF8?

 

Thanks!

 

Teradata Employee

Re: TPT UTF8 Export

ASCII is a subset of UTF8, and of ISO-8859 as well.

If there are no extended characters, the data file is all single-byte ASCII?

If so, that is still considered UTF8.

-- SteveF