Ok, I was finally able to reproduce the problem (#1) on my PC!
I will have to talk to the DBS folks to figure out what is going on, but here is what I know.
When you use DEFINE SCHEMA <name> FROM TABLE <tname>;,
we look at the table, see a DATE column, and by default create a schema with that column defined as INTDATE.
(By default, Teradata treats dates as integers and even stores the values as integer values.)
When we export the data in that manner, for years that are less than 1000, for some reason Teradata gives us incorrect information.
I do not know why and I will have to investigate this.
However, if you explicitly define the schema in the script, define the DATE field as "ANSIDATE", and set the Export operator attribute DateForm to "ansidate", then the data will be exported correctly.
Ok, a little more information.
When the default method of exporting DATE data is used, Teradata exports the data as integer data.
This is what is happening in this case.
And any year prior to 1900 is technically an invalid year, but it looks like Teradata accepts it anyway.
However, a DATE that is externally represented (in character format) as "0001/01/01" is really stored internally as an integer value, and the leading 0's are lost.
When the DC operator tries to convert that integer value to a string, there is not enough information to create a string in the form of "YYYY/MM/DD" because of all of the missing leading zeros.
The only way to correctly export the information is to export the data in ANSIDATE format.
However, our schema auto-generation code does not currently take DateForm into consideration when generating the schema and so the columns are always defined as INTDATE.
We will work on a fix for that.
In the meantime, the only workaround is to define the schema explicitly in the script, and provide the DateForm attribute (or ExportDateForm job variable) to the job.
Even a little more information.
This is the formula Teradata uses:
(year-1900) * 10000 + (month*100) + day
Thus, "2015/01/01" would yield:
(2015-1900) * 10000 + (1 * 100) + 1 = 1150101
and "0001/01/01" would yield:
(1-1900) * 10000 + (1 * 100) + 1 = -18990000 + 100 + 1 = -18989899
And we see that the DC operator is really not converting from the decimal value correctly.
It should have taken the result (-18989899), added 19000000 to get 10101, and then converted that to 1/01/01 (and possibly gone one step further to add the leading 0's, though the DC operator would not know the exact format you are looking for).
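The encoding formula and the correction described above can be sketched in Python. (The function names here are mine for illustration; they are not part of TPT or the DC operator.)

```python
def encode_teradata_date(year, month, day):
    """Teradata's internal integer encoding of a DATE value."""
    return (year - 1900) * 10000 + month * 100 + day

def decode_teradata_date(int_date):
    """Decode the integer form back to (year, month, day).

    Adding 19000000 first normalizes pre-1900 dates, whose encoded
    value is negative -- the correction the DC operator should apply.
    """
    normalized = int_date + 19000000
    year, rest = divmod(normalized, 10000)
    month, day = divmod(rest, 100)
    return year, month, day

print(encode_teradata_date(2015, 1, 1))   # 1150101
print(encode_teradata_date(1, 1, 1))      # -18989899
print(decode_teradata_date(-18989899))    # (1, 1, 1)
```

Note that a naive decode without the +19000000 step would mangle any negative (pre-1900) value, which matches the behavior seen here.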
We will fix the DC for this.
1. I tried with the option DateForm=ansiDate in the job variable file, but I still get the same wrong result.
2. I have another NULL vs. blank issue -
#1. If you read all of my posts, you will see that the DateForm solution originally proposed will not work. You will have to also explicitly provide the schema and define the DATE fields as ANSIDATE.
#2. Did you set DCPQuotedData to 'yes' so that the file reader will process the data correctly and strip the quotes?
(do not confuse "DCC" and "DCP")
I used the below settings in the unload file -
When you say this:
unload file content is -
I am assuming for right now that these are the only 4 records that we are talking about.
Also, I just want to confirm with you that none of those fields are NULL.
When QuotedData is 'Yes', a field that is NULL is still supposed to be represented by 2 consecutive delimiters; the existence of "" means the field was a 0-length string (a field being NULL and a field having a 0-length string are 2 different things).
Once you verify that, I will check with engineering.
Please provide me with the exact TPT version and on which platform this job is running.
The table has four records, and one field in one row is NULL; below are the values in the table -
When I unload, the data comes out as below in the file. The NULL value is not coming as "" in the last field of the second line.
The NULL is not supposed to be represented by "".
That is an empty string.
It is valid to not write anything in the 3rd field. The existence of a delimiter and then an end-of-line marker will indicate the column at that position is supposed to be NULL.
However, when the file is then read back in, the file reader must note that there is an absence of a value for that column position and denote that column as NULL.
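The convention described above (nothing at all between delimiters means NULL; "" means a 0-length string) can be illustrated with a small Python sketch. This is a simplified reader of my own, not the TPT file reader: it assumes the delimiter never appears inside a quoted field.

```python
def parse_quoted_record(line, delimiter=","):
    """Split one delimited record into field values.

    Maps each raw field to either None (NULL: absolutely nothing
    between two delimiters) or a string ('""' becomes the 0-length
    string). Assumes no embedded delimiters inside quoted fields.
    """
    fields = []
    for raw in line.rstrip("\n").split(delimiter):
        if raw == "":
            fields.append(None)       # absent value -> NULL
        elif len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
            fields.append(raw[1:-1])  # strip the surrounding quotes
        else:
            fields.append(raw)
    return fields

print(parse_quoted_record('"a",,""'))   # ['a', None, '']
```

In the middle field, the two consecutive delimiters carry no value at all, so the reader marks it NULL; the third field's "" is kept as an empty string, preserving the distinction the post describes.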
We are looking into our code to see what is going on.
However, I still have not heard back on the version of TPT and platform.