unloaded Date value issue by tpt script

Tools & Utilities
Teradata Employee

Re: unloaded Date value issue by tpt script

Ok, I was finally able to reproduce the problem (#1) on my PC!

I will have to talk to the DBS folks to figure out what is going on, but here is what I know.

When you use the DEFINE SCHEMA <name> FROM TABLE <tname>;

we look at the table and see a DATE column and, by default, create a schema with the column defined as INTDATE.

(By default, Teradata treats dates as integers and even stores the values as integer values.)

When we export the data in that manner, for years that are less than 1000, for some reason Teradata gives us incorrect information.

I do not know why and I will have to investigate this.

However, if you actually explicitly define the schema in the script, and you define the DATE field as "ANSIDATE", and then you set the Export operator attribute DateForm to "ansidate", then the data will be exported correctly.

-- SteveF
Teradata Employee

Re: unloaded Date value issue by tpt script

Ok, a little more information.

When the default method of exporting DATE data is used, Teradata exports the data as integer data.

This is what is happening in this case.

And any year prior to 1900 is technically an invalid year, but looks like Teradata accepts it anyway.

However, a DATE that is externally representated (in character format) as "0001/01/01" is really stored internally as an integer value and the leading 0's are lost.

When the DC operator tries to convert that integer value to a string, there is not enough information to create a string in the form of "YYYY/MM/DD" because of all of the missing leading zeros.

The only way to correctly export the information is to export the data in ANSIDATE format.

However, our schema auto-generation code does not currently take DateForm into consideration when generating the schema and so the columns are always defined as INTDATE.

We will work on a fix for that.

In the meantime, the only workaround is to define the schema explicitly in the script, and provide the DateForm attribute (or ExportDateForm job variable) to the job.

-- SteveF
Teradata Employee

Re: unloaded Date value issue by tpt script

Even a little more information.

This is the formula Teradata uses:

(year-1900) * 10000 + (month*100) + day

Thus, "2015/01/01" would yield:

(2015-1900) * 10000 + (1 * 100) + 1 = 1150101

and "0001/01/01" would yield:

(1-1900) * 10000 + (1 * 100) + 1 = -18990000 + 100 + 1 = -18989899

And we see that the DC operator is really not converting from the decimal value correctly.

It should have taken the result (-18989899) and added 19000000, which would result in 10101, and then convert that to 1/01/01 (and possibly go one step further to add the leading 0's, but the DC operator would not know the exact format you are looking for).

We will fix the DC for this.

-- SteveF
Enthusiast

Re: unloaded Date value issue by tpt script

Thanks Steve, 

1. I tried with the option dateform=ansiDate in job variable file but still the same wrong result.

2. I have another null vs blank issue-

Table fields

  MIS_DT DATE FORMAT 'YYYY-MM-DD',

  NM_TXT VARCHAR(100) CHARACTER SET UNICODE NOT CASESPECIFIC not null,

  nm_surname VARCHAR(100) CHARACTER SET UNICODE NOT CASESPECIFIC) 

table values

2015-01-21|abc|def

2015-01-31|BLANK|lmn

2015-01-01|ghi|ijk

2015-02-01|BLANK|NULL

I want to unload and then load the file to another same structure table.

while using below parameter-

,DCCCloseQuoteMark='"'

,DCCOpenQuoteMark = '"'

,DCCQuotedData='Yes'

Data is unloaded correctly like below

"2015/01/21"|"abc"|"def"

"2015/01/31"|""|"lmn"

"2015/01/01"|"ghi"|"ijk"

"2015/02/01"|""|

But while loading this file all the records are getting rejected due to date format issue.

Next I tried with below option-

,DCCCloseQuoteMark='"'

,DCCOpenQuoteMark = '"'

,DCCQuotedData='Optional'

It is unloading data as -

2015/01/21|abc|def

2015/01/31|""|lmn

2015/01/01|ghi|ijk

2015/02/01|""|

But during loading this file, it's loading all data but blank values are loaded as "". So in table instead  of loading blank values, it's loading double quote as value.


Can you please advise with which setting I can unload and load data from that table.






Teradata Employee

Re: unloaded Date value issue by tpt script

#1. If you read all of my posts, you will see that the DateForm solution originally proposed will not work. You will have to also explicitly provide the schema and define the DATE fields as ANSIDATE.

#2. I did you set DCPQuotedData to 'yes' so that the file reader will process the data correctly and strip the quotes?

(do not confuse "DCC" and "DCP")

-- SteveF
Enthusiast

Re: unloaded Date value issue by tpt script

Hi Steve,

I used the below setings in unload file-

,DCCCloseQuoteMark='"'

,DCCOpenQuoteMark = '"'

,DCCQuotedData='Yes'

and load job variable is as-

,DCPCloseQuoteMark='"'

,DCPOpenQuoteMark = '"'

,DCPQuotedData='Yes'

unload file content is -

"2015/01/21"|"palash"|"dhara"

"2015/02/01"|""|

"2015/01/01"|"abhishek"|"dhara"

"2015/01/31"|""|"arora"

But while loading, below 2 records got rejected due to field NM_TXT (field name found in error table and reason looks like middle field treated as null but table ddl defined as not null.)

"2015/02/01"|""|

"2015/01/31"|""|"arora"

After that I tried with option ,DCPNullColumns='No'

But here all records are loaded. but the last field value(nm_surname ) for the below record loaded as blank instead of null. In my source it is null not blank.

"2015/02/01"|""|


Teradata Employee

Re: unloaded Date value issue by tpt script

When you say this:

unload file content is -

"2015/01/21"|"palash"|"dhara"

"2015/02/01"|""|

"2015/01/01"|"abhishek"|"dhara"

"2015/01/31"|""|"arora"

I am assuming for right now that these are the only 4 records that we are talking about.

Also, I just want to confirm with you that none of those fields are NULL.

When QuotedData is 'Yes', a field that is NULL is still supposed to be represented by 2 consecutive delimiters and the existence of "" means the field was a o-length string (a field being NULL, and a field having a 0-length string are 2 different things).

Once you verify that, I will check with engineering.

Please provide me with the exact TPT version and on which platform this job is running.

-- SteveF
Enthusiast

Re: unloaded Date value issue by tpt script

Table has four records and one field in a row is null, below are the values in table-

2015-01-21|Palash|Dhara

2015-01-31|BLANK|Arora

2015-01-01|Abhishek|Dhara

2015-02-01|BLANK|NULL

 

while I unload data came as below in file. the NULL value is not coming in "" in second line last field.

"2015/01/21"|"palash"|"dhara"

"2015/02/01"|""|

"2015/01/01"|"abhishek"|"dhara"

"2015/01/31"|""|"arora"

Teradata Employee

Re: unloaded Date value issue by tpt script

The NULL is not supposed to be represented by "".

That is an empty string.

It is valid to not write anything in the 3rd field. The existence of a delimiter and then an end-of-line marker will indicate the column at that position is supposed to be NULL.

However, when the file is then read in the file reader must note that there is an absence of a value for that column position and denote that column as NULL.

We are looking into our code to see what is going on.

However, I still have not heard back on the version of TPT and platform.

-- SteveF
Enthusiast

Re: unloaded Date value issue by tpt script

platform is AIX 6.1

Teradata Parallel Transporter Version 14.10.00.05