TPT Load Operator - MultipleInstances for single file

Enthusiast

Hi,

 

I have one large file and am trying to speed up the load by using the MultipleReaders option.

 

It always picks 1 instance, irrespective of the number of instances specified.

 

Below is the script:

 

DEFINE OPERATOR TGT_DEVICE_SOURCE
TYPE DATACONNECTOR PRODUCER
SCHEMA TGT_DEVICE_SCHEMA
ATTRIBUTES
(
VARCHAR FileName = '/home/sb185065/l102.dat',
VARCHAR MultipleReaders = 'Yes',
VARCHAR OpenMode = 'Read',
VARCHAR Format = 'DELIMITED',
VARCHAR NullColumns = 'N',
INTEGER ErrorLimit = 1,
VARCHAR TextDelimiter = '|'
,INTEGER SkipRows = 0
);

APPLY
('
INSERT INTO INDIA1.DEVICE1 ( WRLS_DEV_SKU_NBR,EFF_DT ) VALUES ( :WRLS_DEV_SKU_NBR,:EFF_DT (DATE,FORMAT ''YYYY-MM-DD'') )
')
TO OPERATOR (TGT_DEVICE_LOAD )
SELECT
  CASE WHEN WRLS_DEV_SKU_NBR = 'NULL' THEN NULL ELSE WRLS_DEV_SKU_NBR END AS WRLS_DEV_SKU_NBR,
  CASE WHEN EFF_DT = 'NULL' THEN NULL ELSE EFF_DT END AS EFF_DT
FROM OPERATOR ( TGT_DEVICE_SOURCE[2] );  /* 2 instances requested */

 

 

================

 

Below is the tlogview output:

 

TGT_DEVICE_SOURCE[1]: DataConnector Producer operator Instances: 2
TGT_DEVICE_SOURCE[1]: ECI operator ID: 'TGT_DEVICE_SOURCE-20897'
TGT_DEVICE_SOURCE[2]: TPT19012 No files assigned to instance 2. This instance will be inactive.  ---> Ignored
TGT_DEVICE_SOURCE[1]: Operator instance 1 processing file '/home/sb185065/l102.dat'.

 

 

What am I doing wrong here?

 

Thanks,

SB


Accepted Solutions
Teradata Employee

Re: TPT Load Operator - MultipleInstances for single file

If you are using MultipleReaders, then the number of instances for the DataConnector operator must be more than 2.

 

The way MultipleReaders works is:

  • the master instance (instance #1) reads a block of data (usually around 2MB) into a shared memory buffer
  • when the read is complete, the slaves will read the data out of the shared memory
  • when the slaves are processing the data from the shared memory buffer, the master reads another block of data into another shared memory buffer (buffer #2)
  • when the slaves have completed processing the data, the master and slaves swap buffers
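
The hand-off described in the list above can be sketched roughly in Python. This is only an illustration of the double-buffering idea, not TPT's actual implementation: the names `read_block`, `process_record` and `load` are hypothetical, a block here is a handful of lines rather than about 2MB, and Python threads stand in for the operator instances.

```python
import io
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 4  # records per block (hypothetical; the post mentions blocks of about 2MB)


def read_block(source, n):
    """Master step: read up to n records (lines) into a fresh buffer."""
    block = []
    for _ in range(n):
        line = source.readline()
        if not line:
            break
        block.append(line.rstrip("\n"))
    return block


def process_record(record):
    """Worker step: split one '|'-delimited record into fields."""
    return record.split("|")


def load(source, instances):
    """Instance 1 acts as the reader; the remaining instances process records."""
    workers = instances - 1
    if workers < 1:
        raise ValueError("need at least 2 instances: 1 reader plus 1 or more workers")
    rows = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        current = read_block(source, BLOCK_SIZE)  # fill the first buffer
        while current:
            # Workers start draining the filled buffer (map submits all tasks at once) ...
            pending = pool.map(process_record, current)
            # ... while the master reads the next block into the other buffer.
            upcoming = read_block(source, BLOCK_SIZE)
            rows.extend(pending)  # wait until the workers have finished this buffer
            current = upcoming    # swap buffers
    return rows


if __name__ == "__main__":
    sample = io.StringIO("sku1|2020-01-01\nsku2|2020-01-02\nsku3|2020-01-03\n")
    print(load(sample, instances=5))
```

With `instances=2` the sketch still runs, but only one worker drains each buffer, so no records are processed in parallel.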

If you only specify 2 instances, then you will have 1 instance reading data and 1 instance processing the data, but you will not have any parallelism going on.

Try specifying a value of 5 or 6 for the instance count for the DataConnector operator.

This will give you 1 master and 4-5 slaves processing the data in parallel.

 

-- SteveF
Enthusiast

Re: TPT Load Operator - MultipleInstances for single file

Did anyone get a chance to look into this?

Enthusiast

Re: TPT Load Operator - MultipleInstances for single file

Admin,

 

Can you help address this issue?

 

Thanks,

Sateesh

Teradata Employee

Re: TPT Load Operator - MultipleInstances for single file

What version of TPT, on which client platform?

Enthusiast

Re: TPT Load Operator - MultipleInstances for single file

Teradata Parallel Transporter Load Operator Version 15.10.00.00

Client Platform - Linux

Enthusiast

Re: TPT Load Operator - MultipleInstances for single file

Fred,

 

 

How do I use MultipleReaders and multiple writers with a single file using the TPT Load operator?

 

In my case, I am getting an error.

 

Could you look into this? It is urgent.

 

 

Thanks,

 

Enthusiast

Re: TPT Load Operator - MultipleInstances for single file

Is this a known issue? Can someone please respond?

 

Admin,

 

Need your help again!

 

Thanks,

SB

Enthusiast

Re: TPT Load Operator - MultipleInstances for single file

Thanks, Fred. It worked, but we are saving only 25% in time.

 

At the same time, we see the messages below in the log.

 

Teradata Parallel Transporter DataConnector Operator Version 15.10.00.02D2D.22411.3
TGT_DEVICE_SOURCE[5]: TPT19206 Attribute 'TraceLevel' value reset to 'Statistics Only'.
TGT_DEVICE_SOURCE[2]: TPT19206 Attribute 'TraceLevel' value reset to 'Statistics Only'.
TGT_DEVICE_SOURCE[1]: TPT19206 Attribute 'TraceLevel' value reset to 'Statistics Only'.
TGT_DEVICE_SOURCE[3]: TPT19206 Attribute 'TraceLevel' value reset to 'Statistics Only'.
TGT_DEVICE_SOURCE[4]: TPT19206 Attribute 'TraceLevel' value reset to 'Statistics Only'.
TGT_DEVICE_SOURCE[6]: TPT19206 Attribute 'TraceLevel' value reset to 'Statistics Only'.
TGT_DEVICE_SOURCE[5]: Instance 5 directing private log report to 'dtacop-sb185065-59042-5'.
TGT_DEVICE_SOURCE[2]: Instance 2 directing private log report to 'dtacop-sb185065-59032-2'.
TGT_DEVICE_SOURCE[1]: Instance 1 directing private log report to 'dtacop-sb185065-59031-1'.
TGT_DEVICE_SOURCE[3]: Instance 3 directing private log report to 'dtacop-sb185065-59040-3'.
TGT_DEVICE_SOURCE[4]: Instance 4 directing private log report to 'dtacop-sb185065-59041-4'.
TGT_DEVICE_SOURCE[6]: Instance 6 directing private log report to 'dtacop-sb185065-59044-6'.
TGT_DEVICE_SOURCE[1]: DataConnector Producer operator Instances: 6
TGT_DEVICE_SOURCE[5]: TPT19012 No files assigned to instance 5. This instance will be inactive.
TGT_DEVICE_SOURCE[2]: TPT19012 No files assigned to instance 2. This instance will be inactive.
TGT_DEVICE_SOURCE[3]: TPT19012 No files assigned to instance 3. This instance will be inactive.
TGT_DEVICE_SOURCE[1]: ECI operator ID: 'TGT_DEVICE_SOURCE-59031'
TGT_DEVICE_SOURCE[4]: TPT19012 No files assigned to instance 4. This instance will be inactive.
TGT_DEVICE_SOURCE[6]: TPT19012 No files assigned to instance 6. This instance will be inactive.
TGT_DEVICE_SOURCE[1]: Operator instance 1 processing file '/home/sb185065/l102.dat'.

 

Is this really working?

 

How can we maximize the time savings?

 

We can't change the TASM setting for the number of sessions for the Load operator.

 

What options are available to minimize the processing time? Which is preferred: a single file or multiple files?

 

 

This is something that we need on an urgent basis.

 

Thanks,

SB

 

Teradata Employee

Re: TPT Load Operator - MultipleInstances for single file

Please provide your script (and job variable file, if you are using one).

 

-- SteveF