BTEQ Output: Unicode Column wider as defined in DDL or FORMAT

Enthusiast

Re: BTEQ Output: Unicode Column wider as defined in DDL or FORMAT

This thread is stale, but I've found no reasonable solution to the issue where BTEQ outputs three times as many characters as necessary when the session's charset is set to 'UTF8'. As I understand it, UTF8 requires 3x the number of bytes in order to store the BOM for each character, but it seems odd that BTEQ translates 3x the bytes into 3x the characters in its output.

All the same, if you are running BTEQ from a bash script to do the export, you can use a simple sed command to clean up the file afterwards. The command below strips leading and trailing spaces from each line and collapses every run of spaces into a single space. It does this for ALL repeated spaces, whether or not they sit inside quotes. If you want something fancier, you can probably run a similar regex through awk.

sed 's/^ *//;s/ *$//;s/ \{2,\}/ /g' inputfile > outputfile
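If the padding inside quoted fields needs to survive, a quote-aware variant is one option. The following is only a rough Python sketch of that idea, not something tested against real BTEQ exports. It assumes fields are wrapped in plain double quotes, with no escaped quotes and no quoted field spanning more than one line. The name `squeeze_spaces` is made up for illustration.

```python
import re

# Matches either a whole double-quoted field or a run of two or more spaces.
# Assumption: quotes are plain double quotes with no escaping inside them.
_TOKEN = re.compile(r'"[^"]*"| {2,}')


def squeeze_spaces(line: str) -> str:
    """Trim a line and collapse runs of spaces that are outside double quotes."""

    def repl(match: re.Match) -> str:
        text = match.group(0)
        # Quoted fields are returned untouched; space runs become one space.
        return text if text.startswith('"') else " "

    # Drop the line ending, then leading/trailing spaces, then collapse.
    return _TOKEN.sub(repl, line.rstrip("\r\n").strip(" "))
```

Applied line by line to the export file, this behaves like the sed command for unquoted text but leaves anything between a pair of double quotes alone.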
Teradata Employee

Re: BTEQ Output: Unicode Column wider as defined in DDL or FORMAT

On the server you can override the default export width at the session level. For example, you could set the Unicode to UTF8 width to 2 bytes per character if the widest characters you export come from Latin-based scripts (i.e., U+07FF and below). There is information in the reference manuals on export width.
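To judge whether a 2-byte width would be enough for your data, it helps to know how many bytes UTF-8 needs per character: 1 byte up to U+007F, 2 bytes up to U+07FF, and 3 bytes for the rest of the Basic Multilingual Plane. Here is a small Python sketch for checking a sample of exported text. The function names are hypothetical and it has nothing to do with how the server itself computes export width.

```python
def utf8_width(ch: str) -> int:
    """Number of bytes UTF-8 uses to encode a single character."""
    return len(ch.encode("utf-8"))


def max_utf8_width(text: str) -> int:
    """Largest per-character UTF-8 width found in text (0 for empty text)."""
    return max((utf8_width(ch) for ch in text), default=0)
```

If `max_utf8_width` returns 2 or less for a representative sample of your column values, the 2-byte case described above would apply to that sample.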

Also, the server does not import or export the byte-order mark (BOM). For more info see: http://developer.teradata.com/tools/articles/whats-a-bom-and-why-do-i-care
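For context on the BOM itself: in UTF-8 it is a single three-byte sequence (EF BB BF) that may appear once at the very start of a file. It is not stored per character, so it does not explain the 3x width. If some other tool in your pipeline puts one at the start of a file, a minimal Python sketch along these lines could remove it. The function name is only illustrative.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Remove one leading UTF-8 BOM if present; otherwise return data unchanged."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data
```

Python's "utf-8-sig" codec writes that marker once at the start of the output when encoding, which makes it an easy way to produce test input for this function.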