I'm looking to improve the performance of a TPT Stream job that loads about 50 million rows (10 GB) into a table and currently takes 6+ hours.
Until last week, this same job completed within 3 hours with 1 instance and 16 sessions (Maxsessions=16).
However, to resolve a recurring issue (integration listener failure) on the Informatica side, we implemented Informatica's suggestion to match the number of TD instances to the number of target Informatica partitions.
So now, the number of TD instances = 6, and Maxsessions = 42.
This change has severely impacted the performance of the TPT load, which I understand is due to each instance getting fewer sessions (7) and not all instances being used. I know it is default TD behaviour not to use more instances than required.
Any suggestions on how to force a minimum number of sessions for this TPT job?
I tried the Minsessions attribute (setting it to 16), but this does not work. The average number of sessions this job uses is between 5 and 7.
We cannot reduce the instances, as the original issue is resolved with 6 instances.
This large volume of data is received in small chunks continuously and not all of it is available instantly for consumption.
Hence, this was set up as a trickle feed and is loaded using the STREAM operator.
Also, the use of the STREAM operator is enforced by Informatica.
We receive data from the SAP source system into Teradata using the Informatica BCI Listener, which acts as an active service between SAP and Teradata.
As per your explanation, there are many factors that can impact the performance. It is difficult to say without more information. What kind of reader are you using to pass the data to STREAM? Is it running with multiple instances too?
MinSessions/MaxSessions apply to the job (across all instances of the operator), not per instance. MinSessions just fails the job if that number of sessions is not available - so that's not what you want. If you have 6 instances and want 16 sessions each, you'd have to set MaxSessions to 96, not 42.
But why are you convinced that the number of sessions per instance is the issue? If you are running TPT stand-alone then the default behavior of TPT infrastructure is to send data to as few consumer instances as needed to keep up with the producer instances. But if you are using TPTAPI with Informatica, then it depends on Informatica's partitioning. As Carlos says, you may need to provide more info for someone to help you.
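To make the MaxSessions arithmetic above concrete, here is a small sketch in Python. It assumes the job-level session limit is split evenly across operator instances, which matches the figures quoted in this thread (42 sessions over 6 instances gives 7 each). The function names are made up for illustration and are not part of TPT or Informatica.

```python
def sessions_per_instance(max_sessions: int, instances: int) -> int:
    """Approximate sessions each instance gets, ASSUMING an even split
    of the job-level MaxSessions across all operator instances."""
    if instances <= 0:
        raise ValueError("instances must be positive")
    return max_sessions // instances


def max_sessions_needed(per_instance: int, instances: int) -> int:
    """Job-level MaxSessions needed to give every instance `per_instance` sessions."""
    if instances <= 0 or per_instance <= 0:
        raise ValueError("both arguments must be positive")
    return per_instance * instances


# Old setup: 1 instance, MaxSessions=16
print(sessions_per_instance(16, 1))   # 16
# Current setup: 6 instances, MaxSessions=42
print(sessions_per_instance(42, 6))   # 7
# To get 16 sessions on each of 6 instances
print(max_sessions_needed(16, 6))     # 96
```

Whether 96 sessions are actually available depends on the target system and on any session limits configured there, so treat the result as an upper bound to request rather than a guaranteed allocation.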