Greek Characters with FastLoad

Tools & Utilities

Greek Characters with FastLoad

Hi,

 

I'm fairly new to the Teradata world, although I have more than 10 years of experience with DB Development. I have one source file that needs to be loaded to Teradata - this file contains, amongst others, German and Greek language characters.

 

I've done a bit of reading regarding character sets in Teradata, and according to what I've read, I've set the source file encoding to UTF8. When calling FastLoad, I invoke the program with the "-c UTF 8" flag in order to force the encoding to UTF8.

 

Would like to highlight that my .NET/ODBC drivers are set for UTF8 sessions as well. Now, after loading the file, the German characters (ß, ö, ü, ä) are displayed as they should in SQL assistant, but I can't say the same for the Greek characters... ALL Greek characters get displayed as �. 

 

This is the underlying DDL for the target table (altough being created through FastLoad):

 

CREATE MULTISET TABLE DwSandpitRomi.SEO_MARKETING_DAILY_TMP ,NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT,
DEFAULT MERGEBLOCKRATIO
(
REFERENCE_DATE VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC,
DOMAIN_URL VARCHAR(100) CHARACTER SET LATIN NOT CASESPECIFIC,
SEARCH_QUERY VARCHAR(500) CHARACTER SET LATIN NOT CASESPECIFIC,
AVERAGE_POSITION FLOAT,
CLICKS INTEGER,
IMPRESSIONS INTEGER)
NO PRIMARY INDEX ;

 

The FastLoad script is the following:

 

sessions 2;

LogMech LDAP;
logon *************************************;

DROP TABLE DwSandpitRomi.SEO_MARKETING_DAILY_TMP;

CREATE MULTISET TABLE DwSandpitRomi.SEO_MARKETING_DAILY_TMP
(
REFERENCE_DATE VARCHAR(10),
DOMAIN_URL VARCHAR(100),
SEARCH_QUERY VARCHAR(200),
AVERAGE_POSITION FLOAT,
CLICKS INTEGER,
IMPRESSIONS INTEGER)
NO PRIMARY INDEX ;

SET RECORD VARTEXT "<|>";

DEFINE
REFERENCE_DATE VARCHAR(10),
DOMAIN_URL VARCHAR(100),
SEARCH_QUERY VARCHAR(200),
AVERAGE_POSITION FLOAT,
CLICKS INTEGER,
IMPRESSIONS INTEGER

FILE=C:\ROMI\Andre\SEO\Full Extract\2017\Dec 2017.csv

INSERT INTO DwSandpitRomi.SEO_MARKETING_DAILY_TMP(
:REFERENCE_DATE,
:DOMAIN_URL,
:SEARCH_QUERY,
:AVERAGE_POSITION,
:CLICKS INTEGER,
:IMPRESSIONS
);

END LOADING;

.LOGOFF;

 

Can any help with this one? This is starting to make me feel crazy...

 

Many thanks,

André

3 REPLIES 3
Highlighted
Junior Contributor

Re: Greek Characters with FastLoad

Your target table is defined with LATIN columns, but you need UNICODE.

 

Simply override the default:

CREATE MULTISET TABLE DwSandpitRomi.SEO_MARKETING_DAILY_TMP
(
REFERENCE_DATE VARCHAR(10),
DOMAIN_URL VARCHAR(100) CHARACTER SET UNICODE,
SEARCH_QUERY VARCHAR(200) CHARACTER SET UNICODE,
AVERAGE_POSITION FLOAT,
CLICKS INTEGER,
IMPRESSIONS INTEGER)
NO PRIMARY INDEX ;

Re: Greek Characters with FastLoad

Hi,

 

The data does get loaded now, but I can't make sense of the characters being displayed:

 

DE - süper lig wetten

GR - ÎºÎ¿Ï÷Ï Î¿Î½Î¹ ϸÏ×οιχημαÏ×οϸ

 

And joins do work. Could this be my .NET charset definition? It's currently set for UTF8 as well.

 

Many thanks,

André

Junior Contributor

Re: Greek Characters with FastLoad

Those strange characters seem to be a display problem of correctly loaded data.

 

You need to

- specify the correct encoding of the input file (is it actually UTF-8 or maybe UTF-16)?

- set your .NET/ODBC session to Unicode (either UTF-9 or UTF-16)

- use a character set which can actually display all Unicode characters (most of the pre-Windows 10 fonts can't)

 

Seems you did all of those...

 

You still have a bad input definition, if you use VARTEXT every input field must be VARCHAR (and the column length for UTF-8 must be tripled:

DEFINE
REFERENCE_DATE VARCHAR(10),
DOMAIN_URL VARCHAR(300),
SEARCH_QUERY VARCHAR(600),
AVERAGE_POSITION VARCHAR(20),
CLICKS VARCHAR(11),
IMPRESSIONS  VARCHAR(11)

Btw, using TPT would be probably easier.

Create the table first and then use

tdload -c UTF-8 -d"<|>" -f"C:\ROMI\Andre\SEO\Full Extract\2017\Dec 2017.csv" -t SEO_MARKETING_DAILY_TMP --TargetWorkingDatabase DwSandpitRomi -S -h yoursystem -u youruser -p yourpassword

 

The -S option saves the created scipt in the logs directory of your TPT install, probably

C:\Program Files\Teradata\Client\xx.xx\Teradata Parallel Transporter\logs

 

If this works you can use a generic TPT script and simply pass somejob variables