FastLoad fails loading TSV file "Column length overflow(s)"

Tools & Utilities
Visitor

FastLoad fails loading TSV file "Column length overflow(s)"

Greetings, I have a Tab seperated file and the following is a snippet from the file:

 

 

111893744    2013/01/17    2    4    1973-09-XX    43    1    1    -1    -1    -1    -1    -1    -1    -1    -1    -1    -1
111893755    2013/01/17    1    -1    1901-10-XX    0    2    3    -1    -1    -1    -1    -1    -1    -1    -1    -1    -1
222893866    2013/01/17    2    3    1987-03-XX    30    9    44    -1    -1    -1    -1    -1    -1    -1    -1    -1    -1

I have following FastLoad script to load this data :

 

 

 

.LOGON xxxxx/SCR_dev_cdna,yyyy;

DATABASE DEV_CDNA;

DROP TABLE general_demography;
DROP TABLE general_demography_E1;
DROP TABLE general_demography_E2;
CREATE MULTISET TABLE general_demography ,NO FALLBACK ,
        NO BEFORE JOURNAL,
        NO AFTER JOURNAL,
        CHECKSUM = DEFAULT,
        DEFAULT MERGEBLOCKRATIO
        (
         easy_id INT
        ,registration_dt VARCHAR(15)
        ,reg_gender_cd INT
        ,reg_generation_cd INT
        ,reg_birthdate VARCHAR(15)
        ,reg_age INT
        ,reg_region_cd INT
        ,reg_prefecture_cd INT
        ,reg_education_level_cd INT
        ,reg_employer_industry_cd INT
        ,reg_occupation_cd INT
        ,reg_income_level_cd INT
        ,reg_driver_license_status_cd INT
        ,reg_marital_status_cd INT
        ,reg_residents_in_household_cd INT
        ,reg_children_in_household_cd INT
        ,reg_residence_type_cd INT
        ,reg_car_ownership_cd INT
        )
PRIMARY INDEX ( easy_id );

SET RECORD VARTEXT "    ";

BEGIN LOADING general_demography ERRORFILES general_demography_E1,general_demography_E2;

DEFINE easy_id (VARCHAR(15))
        ,registration_dt (VARCHAR(15))
        ,reg_gender_cd (VARCHAR(15))
        ,reg_generation_cd (VARCHAR(15))
        ,reg_birthdate (VARCHAR(15))
        ,reg_age (VARCHAR(15))
        ,reg_region_cd (VARCHAR(15))
        ,reg_prefecture_cd (VARCHAR(15))
        ,reg_education_level_cd (VARCHAR(15))
        ,reg_employer_industry_cd (VARCHAR(15))
        ,reg_occupation_cd (VARCHAR(15))
        ,reg_income_level_cd (VARCHAR(15))
        ,reg_driver_license_status_cd (VARCHAR(15))
        ,reg_marital_status_cd (VARCHAR(15))
        ,reg_residents_in_household_cd (VARCHAR(15))
        ,reg_children_in_household_cd (VARCHAR(15))
        ,reg_residence_type_cd (VARCHAR(15))
        ,reg_car_ownership_cd (VARCHAR(15))
FILE=/home/bisuser/data/.tmp//PDCIbi__cdna_hive_to_spdb_general_demography.tsv;
RECORD 2;

INSERT INTO general_demography (
easy_id
,registration_dt
,reg_gender_cd
,reg_generation_cd
,reg_birthdate
,reg_age
,reg_region_cd
,reg_prefecture_cd
,reg_education_level_cd
,reg_employer_industry_cd
,reg_occupation_cd
,reg_income_level_cd
,reg_driver_license_status_cd
,reg_marital_status_cd
,reg_residents_in_household_cd
,reg_children_in_household_cd
,reg_residence_type_cd
,reg_car_ownership_cd
)
VALUES
(
:easy_id
,:registration_dt
,:reg_gender_cd
,:reg_generation_cd
,:reg_birthdate
,:reg_age
,:reg_region_cd
,:reg_prefecture_cd
,:reg_education_level_cd
,:reg_employer_industry_cd
,:reg_occupation_cd
,:reg_income_level_cd
,:reg_driver_license_status_cd
,:reg_marital_status_cd
,:reg_residents_in_household_cd
,:reg_children_in_household_cd
,:reg_residence_type_cd
,:reg_car_ownership_cd
);
END LOADING;
LOGOFF;
.QUIT;

 

I use the delimiter character  "tab" for "SET RECORD VARTEXT" accordance with the TSV file. (not four spaces)

 

In all my tables, all the "string" columns only store date values with the format "yyyy-mm-dd" reaching the max chars of 10. 

 

When I try to load data with above script , it gives following error :

**** 19:12:58 Error on piom GET ROW: 60, Text: Column length error, row
              not returned !ERROR! Delimited Data Parsing error: Column
              length overflow(s) in row 1 for column 4
**** 19:12:58 Error at record number 1

If I change all VARCHAR(15) to VARCHAR(20) in the DDL section and also in the DEFINE section, this works fine.But again, similar scripts

But again, similar scripts fail for other tables. (all tables have similar data values, only the column names, and number of columns changes) Means for another table I have to change the value to some higher value, such as  VARCHAR(30) ..etc

 

But this error doesn't have a logical explanation, since none of the columns overflow values coming from the TSV.

 

Any tips to fix this issue ?

 

 

 

 

 

Tags (1)
1 REPLY
Teradata Employee

Re: FastLoad fails loading TSV file "Column length overflow(s)"

With FastLoad VARTEXT, parsing the data into the DEFINE layout always happens before the RECORD statement is processed. In other words, you can skip loading a header row, but you can't skip processing it.

 

Options are to make the DEFINE lengths large enough to hold the header text, or to strip the header from the file before passing it to FastLoad (using some external utility or an INMOD/AXSMOD routine). Or switch to using Teradata Parallel Transporter - the TPT Data Connector operator doesn't parse skipped records.