Hi
Can i execute multiple scripts with different producers(source tables) and different consumers(target files) simultaneously to get a better perfomance.
Thanks in advance
Better performance than what?
Hi Steve
I have the follwoing TPT script written to get table data exported to one file. I would like to know if it is possible to export into multiple files to get a better throughput, as of now it takes 1 minute to export .
USING CHARACTER SET UTF8
DEFINE JOB export_emea_analytics_eu_deal_flat
(
DEFINE SCHEMA tpt_tab_export_schema FROM SELECT DELIMITED OF OPERATOR tpt_reader;
DEFINE OPERATOR tpt_reader
DESCRIPTION 'TERADATA PARALLEL TRANSPORTER EXPORT OPERATOR'
TYPE EXPORT
SCHEMA tpt_tab_export_schema
ATTRIBUTES
(
VARCHAR USERNAME = 'user'
,VARCHAR SELECTSTMT = 'SELECT * FROM tpt_table;'
,VARCHAR USERPASSWORD = '********'
,VARCHAR TDPID = 'ip-address'
);
DEFINE OPERATOR tpt_writer
DESCRIPTION 'TERADATA PARALLEL TRANSPORTER FLAT FILE WRITER'
TYPE DATACONNECTOR CONSUMER
SCHEMA *
ATTRIBUTES
(
VARCHAR OPENMODE = 'Write'
,VARCHAR DIRECTORYPATH = '/home/dev/tpt/test/'
,VARCHAR FILENAME = 'tpt_test.txt'
,VARCHAR INDICATORMODE = 'N'
,VARCHAR QUOTEDDATA = 'N'
,VARCHAR FORMAT = 'DELIMITED'
,VARCHAR TEXTDELIMITER = '|'
,INTEGER BUFFERSIZE = 524288
);
APPLY TO OPERATOR (tpt_writer)
SELECT * FROM OPERATOR (tpt_reader);
);
Thanks in advance
Yes, it is possible.
Specify multiple instances of the DataConnector operator (the file writer).
Each instance will write to a separate file.
And make sure to specify the -C (uppercase) command line option so that the data is sent to the DC operator instances in a round-robin fashion. You will then get a roughly even (not exact) distribution of rows per file.
Thanks Steve for the information,
Currently we are executing as follows: tbuild -f <filename>
Should the command line be changed as follows for the round-robin fashion ?
tbuild -C <filename>
It needs to be
tbuild -f <script-name> -C
Thanks Steve for the response..
Need clarity on the multithread concept in TPT..
I am aware that we can perform exporting of multiple tables by creating more instance of the producers and the consumers in the same script. But is it done serially or parallely? And may i know if it can be done in the following way:
USING CHARACTER SET UTF8
DEFINE JOB export_emea_analytics_eu_deal_flat
(
DEFINE SCHEMA tpt_tab_export_schema FROM SELECT DELIMITED OF OPERATOR tpt_reader1;
DEFINE OPERATOR tpt_reader1
DESCRIPTION 'TERADATA PARALLEL TRANSPORTER EXPORT OPERATOR1'
TYPE EXPORT
SCHEMA tpt_tab_export_schema
ATTRIBUTES
(
VARCHAR USERNAME = 'user'
,VARCHAR SELECTSTMT = 'SELECT * FROM tpt_table1;'
,VARCHAR USERPASSWORD = '********'
,VARCHAR TDPID = 'ip-address'
);
DEFINE SCHEMA tpt_tab_export_schema FROM SELECT DELIMITED OF OPERATOR tpt_reader2;
DEFINE OPERATOR tpt_reader2
DESCRIPTION 'TERADATA PARALLEL TRANSPORTER EXPORT OPERATOR2'
TYPE EXPORT
SCHEMA tpt_tab_export_schema
ATTRIBUTES
(
VARCHAR USERNAME = 'user'
,VARCHAR SELECTSTMT = 'SELECT * FROM tpt_table2;'
,VARCHAR USERPASSWORD = '********'
,VARCHAR TDPID = 'ip-address'
);
DEFINE SCHEMA tpt_tab_export_schema FROM SELECT DELIMITED OF OPERATOR tpt_reader3;
DEFINE OPERATOR tpt_reader3
DESCRIPTION 'TERADATA PARALLEL TRANSPORTER EXPORT OPERATOR3'
TYPE EXPORT
SCHEMA tpt_tab_export_schema
ATTRIBUTES
(
VARCHAR USERNAME = 'user'
,VARCHAR SELECTSTMT = 'SELECT * FROM tpt_table3;'
,VARCHAR USERPASSWORD = '********'
,VARCHAR TDPID = 'ip-address'
);
DEFINE OPERATOR tpt_writer1
DESCRIPTION 'TERADATA PARALLEL TRANSPORTER FLAT FILE WRITER1'
TYPE DATACONNECTOR CONSUMER
SCHEMA *
ATTRIBUTES
(
VARCHAR OPENMODE = 'Write'
,VARCHAR DIRECTORYPATH = '/home/dev/tpt/test/'
,VARCHAR FILENAME = 'tpt_test1.txt'
,VARCHAR INDICATORMODE = 'N'
,VARCHAR QUOTEDDATA = 'N'
,VARCHAR FORMAT = 'DELIMITED'
,VARCHAR TEXTDELIMITER = '|'
,INTEGER BUFFERSIZE = 524288
);
DEFINE OPERATOR tpt_writer2
DESCRIPTION 'TERADATA PARALLEL TRANSPORTER FLAT FILE WRITER2'
TYPE DATACONNECTOR CONSUMER
SCHEMA *
ATTRIBUTES
(
VARCHAR OPENMODE = 'Write'
,VARCHAR DIRECTORYPATH = '/home/dev/tpt/test/'
,VARCHAR FILENAME = 'tpt_test2.txt'
,VARCHAR INDICATORMODE = 'N'
,VARCHAR QUOTEDDATA = 'N'
,VARCHAR FORMAT = 'DELIMITED'
,VARCHAR TEXTDELIMITER = '|'
,INTEGER BUFFERSIZE = 524288
);
DEFINE OPERATOR tpt_writer3
DESCRIPTION 'TERADATA PARALLEL TRANSPORTER FLAT FILE WRITER3'
TYPE DATACONNECTOR CONSUMER
SCHEMA *
ATTRIBUTES
(
VARCHAR OPENMODE = 'Write'
,VARCHAR DIRECTORYPATH = '/home/dev/tpt/test/'
,VARCHAR FILENAME = 'tpt_test3.txt'
,VARCHAR INDICATORMODE = 'N'
,VARCHAR QUOTEDDATA = 'N'
,VARCHAR FORMAT = 'DELIMITED'
,VARCHAR TEXTDELIMITER = '|'
,INTEGER BUFFERSIZE = 524288
);
APPLY TO OPERATOR (tpt_writer1)
SELECT * FROM OPERATOR (tpt_reader1);
APPLY TO OPERATOR (tpt_writer2)
SELECT * FROM OPERATOR (tpt_reader2);
APPLY TO OPERATOR (tpt_writer3)
SELECT * FROM OPERATOR (tpt_reader3);
);
The use of:
APPLY TO OPERATOR (tpt_writer1)
SELECT * FROM OPERATOR (tpt_reader1);
APPLY TO OPERATOR (tpt_writer2)
SELECT * FROM OPERATOR (tpt_reader2);
APPLY TO OPERATOR (tpt_writer3)
SELECT * FROM OPERATOR (tpt_reader3);
is not possible in TPT.
Each use of the APPLY-SELECT must appear in its own job step and each job step would be executed serially.
**bleep**
I have issue when running 2 ore more TPT jobs simultaneously. There is no issue when job is running one at a time.
ex. I have job 1 to extract sales data from Teradata and generate file. 2nd TPT script extracts customer master data from teradata and generate file. If I am running both jobs parallely then 2nd TPT script writes the data into 1st file and 2nd file is not generating.
Note:- I am using export operator
It is very unique issue when 2 TPT scripts (export operator) runs in paralle but there is no issue when it runs at different time.
Request you to share some pointer.
Thanks,
I am also facing the same issue. I am tring to execute two TPT scripts(TPT1,TPT2) simultaneously.
TPT1 ran successfully. But TPT2 wriites the output to TPT1OUT file.
TPT2OUT file not generatred.
TPT1OUT file having both the outputs of TPT1 and TPT2.
Please some one help us how to get rid of this issue.