I am trying to split a large table (55 million rows) into smaller CSV files of 1 million rows each, but TPT fails when I try to do it in one run, with the error shown below. When I instead use ROW_NUMBER to limit the result to 10 million rows at a time and export 10 files per run, it works, but that means executing the TPT script 5-6 times.
Sometimes when I changed a few variables I also got errors along the lines of "could not establish special session".
I thought someone out there would know how this could be done in one shot, so I am sharing the details below.
C:\tpt\exports>tbuild -f test.tpt -C
Teradata Parallel Transporter Version 15.00.00.01
Job log: C:\Program Files\Teradata\client\15.00\Teradata Parallel Transporter/logs/user-112.out
Job id is user-112, running on MUS263646
OS_Spawnvp(): Cannot create new process. Maximum Waitable Object limit, 64, exceeded
TPT_INFRA: TPT01024: Error: Failed to create process executor.exe, System errno: 0
TPT_INFRA: TPT01149: Error: Spawning a new process is failed. Return code: 8
USING CHARACTER SET ASCII
DEFINE JOB eml_job
DESCRIPTION 'export ABC'
DEFINE SCHEMA SCHEMA_extract (
DEFINE OPERATOR EXPORT_OPERATOR
VARCHAR DateForm = 'ANSIDATE',
VARCHAR SelectStmt = 'SELECT email_adr_txt FROM DB.table_with_54_million_records',
DEFINE OPERATOR FILE_WRITER
TYPE DATACONNECTOR CONSUMER
APPLY TO OPERATOR (FILE_WRITER)
FROM OPERATOR (EXPORT_OPERATOR);
Why are you specifying 55 instances for each operator?
For the Export operator, you need 1 or 2 at the most.
For the file writer, writing to 55 files will definitely not provide any performance benefit, and you might also be hitting some type of resource limit.
You are getting the error because you are specifying 55 instances for each operator, resulting in 55x55=3025 data streams.
Thanks for the reply Steve.
I did mention 55 instances of file writer since my requirement is to export 55 split files for particular table.
Please correct me if there is a way to do that.
55 does seem like a lot, but I understand those are requirements.
Still, you cannot use 55 instances of the Export operator, as 55x55 data streams consumes too many system resources.
Try 1 instance of the Export operator and see if the job runs.
The file writing will most likely always be slower than pulling data out from the DBS.
And to speed up performance, run the Export operator in NoSpool mode.
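NoSpool is requested through the Export operator's SpoolMode attribute. A minimal sketch of where that attribute would go, alongside the other VARCHAR attributes already shown in the fragment above (the surrounding operator definition is assumed, not the poster's exact script):

```
DEFINE OPERATOR EXPORT_OPERATOR
...
  VARCHAR SpoolMode = 'NoSpool',  /* export directly instead of spooling the SELECT first */
  VARCHAR SelectStmt = 'SELECT email_adr_txt FROM DB.table_with_54_million_records',
...
```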
I tried 10 instances of the file writer and split the data with row numbers (1-x0, x1-y0, and so on), and was able to split and export 10 files per run, running this 6 times. But I wanted to know whether there is a way to export all 55 files in one TPT execution. That's okay, I was just curious if there was another way.
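The row-number splitting described above can be sketched as follows. The table and column names are taken from the script fragment earlier in the thread; ordering by the exported column itself is an assumption (any stable ordering key would do):

```python
# Sketch: generate one ranged SELECT per output file, using ROW_NUMBER()
# to carve a 55-million-row table into 1-million-row slices.
# Table/column names follow the script above; the ORDER BY key is assumed.

def ranged_selects(total_rows, rows_per_file, table, column):
    """Return one Teradata SELECT per slice of `rows_per_file` rows."""
    stmts = []
    start = 1
    while start <= total_rows:
        end = min(start + rows_per_file - 1, total_rows)
        stmts.append(
            f"SELECT {column} FROM ("
            f"SELECT {column}, ROW_NUMBER() OVER (ORDER BY {column}) AS rn "
            f"FROM {table}) t WHERE rn BETWEEN {start} AND {end};"
        )
        start = end + 1
    return stmts

stmts = ranged_selects(55_000_000, 1_000_000,
                       "DB.table_with_54_million_records", "email_adr_txt")
print(len(stmts))  # 55 slices, one per output file
```

Each statement can then be dropped into a job's SelectStmt attribute, or batched a few at a time as the poster did.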
Really appreciate the clarification, Steve!
You can try 55 instances of the file writer (and only 1 instance of the Export) and see what happens.
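In TPT script syntax, instance counts are given in square brackets after the operator name in the APPLY statement. A sketch of the configuration Steve is recommending, reusing the operator names from the fragment above (illustrative, not the poster's exact script):

```
APPLY TO OPERATOR (FILE_WRITER[55])
SELECT * FROM OPERATOR (EXPORT_OPERATOR[1]);
```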
TPT does not have a hard-coded limit; it all depends on your system resources.
In other words, if 1 x 55 still fails, you could talk to your server admin about raising the OS limits.
Thanks Steve and Fred.
But keeping 55 file-writer instances with 1, 5, or 10 Export instances results in the same failure.
I think I will stick to splitting the SELECT statement with row numbers and exporting 10 files at a time; that seems to work.
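A different workaround (not suggested in the thread, just a sketch): export the whole table to a single file with one file-writer instance, then split that file locally into 1,000,000-line pieces, which sidesteps the instance limit entirely. File names and chunk size here are illustrative assumptions:

```python
# Sketch: cut one large exported CSV into fixed-size pieces on the client,
# so the TPT job itself only needs a single file-writer instance.
import itertools

def split_file(src_path, rows_per_file=1_000_000, prefix="part"):
    """Write src_path out as prefix_000.csv, prefix_001.csv, ... and
    return the list of files written."""
    written = []
    with open(src_path, "r", encoding="ascii") as src:
        for i in itertools.count():
            chunk = list(itertools.islice(src, rows_per_file))
            if not chunk:
                break
            out = f"{prefix}_{i:03d}.csv"
            with open(out, "w", encoding="ascii") as dst:
                dst.writelines(chunk)
            written.append(out)
    return written
```

For a 55-million-row export this produces 55 files without ever re-running the TPT job, at the cost of one extra pass over the data on disk.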
Again (and I cannot stress this enough), if you are going to use a high number of instances of the file writer, then only use 1 instance of the Export operator. The number of shared memory data streams being used for the job to transfer data from the producer (Export) operator to the consumer operator (file writer) is a multiplication of the number of producer and consumer instances (not addition).
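A trivial check of the multiplication Steve describes, just to make the arithmetic concrete (plain Python, nothing TPT-specific):

```python
# Each producer instance opens a data stream to each consumer instance,
# so streams multiply rather than add.
def data_streams(producers, consumers):
    return producers * consumers

print(data_streams(55, 55))  # 3025 -- the failing configuration
print(data_streams(1, 55))   # 55 -- one Export feeding 55 file writers
```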
If 55 instances of the file writer leads to the same error, then lower it gradually until you find the number that your operating system will allow. You might be able to use 30 or 40. This will then lead to fewer TPT jobs to get the job done.