TPT - source data enclosed with double quotes and delimited by pipe (|)

Tools
Enthusiast

TPT - source data enclosed with double quotes and delimited by pipe (|)

Hi,

I am new to Teradata. I am trying to load the data from csv file to Teradata 13.0 on VMWARE using TPT, where in columns are enclosed with double quotes (") and delimited with pipe ( | ).

e.g.

 "1001"|"Amit"|"FIN"|"10001"|"1981-10-10"

"1002"|"Manish"|"FIN"|"25000"|"1990-01-01"

"1003"|"Sid"|""|"2000"|"1986-04-22"

"1004"|"Macho"|"GM"|""|"1972-05-19"

And I am using the below script - 

DEFINE JOB LOAD_M1_TABLE_FROM_FILE

DESCRIPTION 'LOAD SAMPLE M1 TABLE FROM A FILE'

(

  DEFINE SCHEMA M1_SCHEMA

  DESCRIPTION 'SAMPLE M1 SCHEMA'

  (

   EMP_ID   varchar(12),

   EMP_NAME varchar(12),

   EMP_DEPT varchar(5),

   SALARY   varchar(12),

   DOB      varchar(12)

  );

  DEFINE OPERATOR DDL_OPERATOR()

  DESCRIPTION 'TERADATA PARALLEL TRANSPORTER DDL OPERATOR'

  TYPE DDL

  ATTRIBUTES

  (

  VARCHAR PrivateLogName = 'tpt_script_ddloper_log',

  VARCHAR TdpId          = 'dbc',

  VARCHAR UserName       = 'dbadmin',

  VARCHAR UserPassword   = 'dbadmin',

  VARCHAR AccountID,

  VARCHAR ErrorList      = '3807'

  );

  DEFINE OPERATOR LOAD_OPERATOR()

  DESCRIPTION 'TERADATA PARALLEL TRANSPORTER LOAD OPERATOR'

  TYPE LOAD

  SCHEMA M1_SCHEMA

  ATTRIBUTES

  (

   VARCHAR PrivateLogName    = 'tpt_script_loadoper_privatelog',

   INTEGER MaxSessions       =  32,

   INTEGER MinSessions       =  2,

   VARCHAR TargetTable       = 'M1',

   VARCHAR TdpId             = 'dbc',

   VARCHAR UserName          = 'dbadmin',

   VARCHAR UserPassword      = 'dbadmin',

   VARCHAR AccountId,

   VARCHAR ErrorTable1       = 'tpt_script_LOADOPER_ERRTABLE1',

   VARCHAR ErrorTable2       = 'tpt_script_LOADOPER_ERRTABLE2',

   VARCHAR LogTable          = 'tpt_script_LOADOPER_LOGTABLE'

  );

  DEFINE OPERATOR FILE_READER()

  DESCRIPTION 'TERADATA PARALLEL TRANSPORTER DATA CONNECTOR OPERATOR'

  TYPE DATACONNECTOR PRODUCER

  SCHEMA M1_SCHEMA

  ATTRIBUTES

  (

   VARCHAR PrivateLogName    = 'tpt_script_dataconnoper_reader_privatelog',

   VARCHAR FileName          = 'code*.txt',

   VARCHAR IndicatorMode     = 'N',

   VARCHAR OpenMode          = 'Read',

   VARCHAR Format            = 'delimited',

   VARCHAR NullColumns       = 'Y',

   VARCHAR OpenQuoteMark = '"',

   VARCHAR CloseQuoteMark = '"',


   VARCHAR TextDelimiter     = '|',

   VARCHAR trimchar          = '"',

   VARCHAR MultipleReaders   = 'Y',

   varchar trimcolumns       = 'both',

   VARCHAR SkipRowsEveryFile = 'Y',

   INTEGER SkipRows = 1

  );

  STEP setup_tables

  (

    APPLY

    ('DROP TABLE M1;'),

    ('DROP TABLE M1_LOADOPER_ERRTABLE1;'),

    ('DROP TABLE M1_LOADOPER_ERRTABLE2;'),

    ('DROP TABLE M1_LOADOPER_LOGTABLE;'),

    ('CREATE TABLE M1(EMP_ID   varchar(10), EMP_NAME varchar(10), EMP_DEPT varchar(3), SALARY   varchar(10), DOB varchar(10));')

    TO OPERATOR (DDL_OPERATOR() );

  );

  STEP load_data_from_file

  (

    APPLY

     ( 'INSERT INTO M1(:EMP_ID, :EMP_NAME, :EMP_DEPT, :SALARY, :DOB);')

     TO OPERATOR (LOAD_OPERATOR() [1] )

     SELECT * FROM OPERATOR (FILE_READER() [1] );

  );

 );

Data is getting loaded but double quotes are getting included in data.

In Teradata data is as below - 

EMP_ID      EMP_NAME    EMP_DEPT  SALARY      DOB

----------  ----------  --------  ----------  ----------

"1004"      "Macho"     "GM       ""          "1972-05-1

"1002"      "Manish"    "FI       "25000"     "1990-01-0

"1007"      "Sidhu"     ""        "2000"      "1986-04-2

"1008"      "Mach"      "GM       ""          "1972-05-1

"1003"      "Sid"       ""        "2000"      "1986-04-2

"1006"      "Mani"      "FI       "25000"     "1990-01-0

If column length is less, then the data is trimmed before getting loaded. e.g. column EMP_DEPT and DOB.

Can you please help me out urgently. I am stuck up on this for last 3 days.

Regards

Atul


16 REPLIES
Teradata Employee

Re: TPT - source data enclosed with double quotes and delimited by pipe (|)

You must enable the quoted data feature by setting the QuotedData attribute in the DataConnector attribute to 'yes' or 'optional'.

Please refer to the documentation for details.

-- SteveF
Enthusiast

Re: TPT - source data enclosed with double quotes and delimited by pipe (|)

Hi Steve,

Thanks for the reply. However, I had tried that option as well by specifying below -

VARCHAR QuotedData = 'Y',

and also I tried - 

VARCHAR QuotedData = 'Optional'

but none is working. 

Please help.

Regards

Atul

Teradata Employee

Re: TPT - source data enclosed with double quotes and delimited by pipe (|)

Sorry, I forgot to ask what version of TPT you are using.

The QuotedData feature was introduced in TPT 14.0.

If you are using a version prior to that, then the quotes will always be part of the data.

-- SteveF
Enthusiast

Re: TPT - source data enclosed with double quotes and delimited by pipe (|)

I am using Teradata 13.0 express edition on VMWARE.

Is there any other way to get the desired result?

Thanks

Atul

Teradata Employee

Re: TPT - source data enclosed with double quotes and delimited by pipe (|)

The only way is to upgrade to a newer version of TPT.

(TPT 13.0 is not a supported release anymore.)

Or pre-process the data and strip the quotes.

-- SteveF
Enthusiast

Re: TPT - source data enclosed with double quotes and delimited by pipe (|)

Also, let me know if TPT 13.0 version supports the .gz (zipped files).

Can you please provide any link to compare the features/differences between 13.0 and 14.0 versions.

Thanks

Atul

Teradata Employee

Re: TPT - source data enclosed with double quotes and delimited by pipe (|)

Compress file supprot was implemented in 14.10.

-- SteveF
Enthusiast

Re: TPT - source data enclosed with double quotes and delimited by pipe (|)

Hi Steve,

I installed 14.1 and was able to load the file with comma seperated fields, optionally enlcosed with double quotes through TPT

sample data - 

123, "Manish","FINANCE",10000,25

12, "James","HR",5000,30

Can you please help me below - 

1. Is there any other method (other than TPT) which also support - comma separated data, and optionally enclosed with double quotes, like example I specified above.

2. In TPT, I am not sure, but there is limitation that I can load the data only to varchar fields in the table, but what if I have number/date fields as well.

3. When I am using FLOAD, I can specify - SET RECORD VARTEXT DELIMITER ';' QUOTE YES '"';

   but issue is that every column data must be enclosed with "  but in my data it is optional.   Is there any way to get it done?

4. I am using 14.1 express edition on vmware. There the no. of Instance is - 1, AMP - 2, and the maximum sessions that can be called - 2.     Is there any way I can increase the number of AMP's?   And if Yes, what is the way to do the same (All steps)

Thanks in Advance

Atul

Enthusiast

Re: TPT - source data enclosed with double quotes and delimited by pipe (|)

5. Also, Can you please let me know whether we can skip record (HEADER) with FASTLOAD. RECORD 2 is not working.

6. Can we use TDLOAD to load the above type of data - comma separated with optionally enclosed with " and also need to skip the first row (header).