If the TPT attributes are
INTEGER SkipRows = 1 and
VARCHAR SkipRowsEveryFile = 'N' (and there is more than one file), how will the loader know which file the row should be skipped in? How will the loader behave in this scenario?
Or is it compulsory to set VARCHAR SkipRowsEveryFile = 'Y' (when there is more than one file) if SkipRows has a value >= 1?
Thanks & Regards,
The DataConnector operator is able to read from multiple files at one time.
The nature of parallelism implies that order is not important.
If order is important, you cannot read multiple files in parallel.
If order is important, then you should use only 1 instance of the operator and provide a mechanism by which the DC operator can process the files in a specific order (for example, we support reading files in order of the timestamp at which they were placed into the directory/folder).
Or you can use the FileList feature: you create a file containing the list of files to process and, with a single instance of the DC operator, we will process those files in order.
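As a rough illustration of the FileList approach, the list file could be generated by a small script before the job starts. This is only a sketch: the helper name `write_file_list`, the `*.dat` pattern, and the one-path-per-line layout are assumptions for illustration, so check the expected list format against the TPT documentation for your version.

```python
from pathlib import Path


def write_file_list(data_dir, list_path, pattern="*.dat"):
    """Write matching files to list_path, one path per line, oldest first.

    Hypothetical helper: files are ordered by modification time, with the
    file name as a tie-breaker so the order is deterministic.
    """
    files = sorted(
        Path(data_dir).glob(pattern),
        key=lambda p: (p.stat().st_mtime, p.name),
    )
    Path(list_path).write_text("".join(f"{p}\n" for p in files))
    return [str(p) for p in files]
```

The resulting list file would then be the one handed to a single instance of the DC operator, so that the files are read in the listed order.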
Having said that, the SkipRows feature allows for files with the same structure ('x' number of header records followed by the data records) to be processed where the header records can be skipped. This is where SkipRowsEveryFile='Y' (and parallelism) comes into play.
Some users have a situation where only one of the files has header records, and the rest of the files only have data records. For this scenario, SkipRowsEveryFile must be 'N', and the DC operator then cannot process the files in parallel.
According to the documentation, the default for SkipRowsEveryFile is 'No'.
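The difference between the two settings can be illustrated with a small model. This is not TPT code, only a simplified sketch of the behavior described above; it assumes that with 'N' the skip count is applied once, starting at the first file processed, and that the files are read serially in the given order. The function name `rows_read` is made up for the example.

```python
def rows_read(files, skip_rows, skip_rows_every_file):
    """Simplified model: return the rows that remain after skipping.

    files: list of files in processing order, each a list of row strings.
    """
    kept = []
    remaining = skip_rows  # rows still to skip when the count applies once
    for rows in files:
        if skip_rows_every_file:
            # 'Y': skip the same number of rows at the top of every file.
            kept.extend(rows[skip_rows:])
        else:
            # 'N' (assumed): skip only until the count is used up.
            skipped = min(remaining, len(rows))
            remaining -= skipped
            kept.extend(rows[skipped:])
    return kept


same_layout = [["HDR", "a1", "a2"], ["HDR", "b1"]]
header_once = [["HDR", "a1", "a2"], ["b1", "b2"]]

print(rows_read(same_layout, 1, True))   # ['a1', 'a2', 'b1']
print(rows_read(header_once, 1, False))  # ['a1', 'a2', 'b1', 'b2']
print(rows_read(same_layout, 1, False))  # ['a1', 'a2', 'HDR', 'b1']
```

In this model, a header in any file after the first is passed through as a data row when the setting is 'N', which is why the file layout and the setting have to match.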
Thanks for clarifying, Steve.
I have another question.
Let's say there are 5 files and they are loaded serially (i.e. not in parallel). Out of the 5 files, 2 have a header and the rest do not. My attributes are
SkipRows = 1 and
SkipRowsEveryFile = 'N'. Will the loader be able to identify which files the 1st row should be skipped in?
Teradata Parallel Transporter SQL DDL Operator Version 15.00.00.05
DDL_OPR_table: private log specified: table_log
DDL_OPR_table: connecting sessions
DDL_OPR_table: sending SQL requests
DDL_OPR_table: Rows Deleted: 0
DDL_OPR_table: disconnecting sessions
DDL_OPR_table: Total processor time used = '0.2 Second(s)'
DDL_OPR_table: Start : Mon Aug 29 17:13:29 2016
DDL_OPR_table: End : Mon Aug 29 17:13:31 2016
Job step DROP_ERR_TABLES completed successfully
Teradata Parallel Transporter DataConnector Operator Version 15.00.00.05
DC_P_table: Instance 1 directing private log report to 'Read-1'.
Teradata Parallel Transporter Load Operator Version 15.00.00.05
LOAD_OPR_table: private log specified: Read
DC_P_table: DataConnector Producer operator Instances: 1
DC_P_table: ECI operator ID: 'DC_P_table-19675'
DC_P_table: Operator instance 1 processing file '/prod_wk5/proc/S_PROD.F.20160824.1021.dat'.
LOAD_OPR_table: connecting sessions
LOAD_OPR_table: preparing target table
LOAD_OPR_table: entering Acquisition Phase
DC_P_table: TPT19435 pmRead failed. EOF encountered before end of record (35)
DC_P_table: TPT19305 Fatal error reading data.
DC_P_table: TPT19015 TPT Exit code set to 12.
LOAD_OPR_table: disconnecting sessions
DC_P_table: Total files processed: 0.
DC_P_table: TPT19229 0 error rows sent to error file table.err
LOAD_OPR_table: Total processor time used = '9.76 Second(s)'
LOAD_OPR_table: Start : Mon Aug 29 17:13:35 2016
LOAD_OPR_table: End : Mon Aug 29 17:13:50 2016
Job step LOAD_TABLE terminated (status 12)
Job table.log.20160829.171005.tpt terminated (status 12)
Job start: Mon Aug 29 17:13:27 2016
Job end: Mon Aug 29 17:13:50 2016
Error encountered loading file exiting...
The TPT loader throws this error when trying to load a file that has a header but no data records.
I have to load a file which may or may not have records, but it is sent to us with a header even when there is no data in it.
Please let me know how to solve this.
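One possible workaround, sketched outside of TPT itself, is to check before the load whether the file contains anything beyond its header and to skip the load step when it does not. This is only an illustration: the function name `has_data_rows` and the single-line header are assumptions, and the sketch does not establish that a header-only file is what triggers TPT19435.

```python
def has_data_rows(path, header_rows=1):
    """Return True if the file has at least one non-blank line after the header.

    header_rows is the number of leading lines treated as header (assumed 1).
    """
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line_number, line in enumerate(handle, start=1):
            if line_number > header_rows and line.strip():
                return True
    return False
```

A wrapper script could call this for each incoming file and start the TPT job only when it returns True.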