As was previously answered, knowing only that you have 100 million rows does not tell us the entire story.
How large are the rows?
What is your network bandwidth?
What is your CPU availability?
Since you are pulling data from Oracle, FastLoad will outperform the extract from Oracle; the extract side is likely to be your bottleneck.
In most cases (no matter how many nodes you have) I have not seen much need for more than 32-40 sessions.
However, it depends on your row size.
As for checkpointing, the rule of thumb is this: "how much time are you willing to lose?"
For example, if you do not want to lose more than 15 minutes worth of data loading, then set checkpoint to an appropriate value for the utility that would correspond to 15 minutes. (MultiLoad can accept a time value for checkpoint; FastLoad is row-based, so you will have to determine how many rows you can load in 15 minutes.)
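As a sketch of how that looks in each utility (table names, error-file names, and the row count below are placeholders; the FastLoad row count assumes you have measured roughly how many rows you load in 15 minutes):

```
/* MultiLoad: a CHECKPOINT value of 60 or less is read as minutes */
.BEGIN IMPORT MLOAD TABLES mydb.target
    CHECKPOINT 15;        /* checkpoint every 15 minutes */

/* FastLoad: CHECKPOINT is row-based, so convert your 15-minute
   throughput into a row count */
BEGIN LOADING mydb.target
    ERRORFILES mydb.target_et, mydb.target_uv
    CHECKPOINT 5000000;   /* e.g. if you load ~5M rows per 15 minutes */
```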
Tenacity/sleep is irrelevant unless you expect to run your job at a time when many other FastLoad/MultiLoad jobs are running and you may get locked out due to the load limit. The amount of time to wait between logon attempts is up to you.
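If you do expect contention for load slots, the settings are a few lines at the top of a FastLoad script (the values here are illustrative, not recommendations):

```
SESSIONS 32;     /* max sessions; 32-40 is usually plenty */
TENACITY 4;      /* keep retrying the logon for up to 4 hours */
SLEEP 6;         /* wait 6 minutes between logon attempts */
LOGON tdpid/username,password;
```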
You can also consult with Informatica.
They have used our tools for a long time and may be able to help here.
We implemented this new architecture in our system:
The PowerCenter real-time module receives messages from MQ, and each message is loaded into a Teradata table using the TPT STREAM operator.
My question concerns the right number of sessions to set. We have two parameters: Min sessions and Max sessions.
Min sessions =1
Max sessions = ?
We think that Max sessions=1 is enough because we load (TPT STREAM) only one short message at a time.
If you are only loading 1 message at a time (and I am assuming that 1 message means 1 row), then 1 session is fine, and you should set the pack factor to 1 so that the row is sent immediately (and not waiting for a buffer of rows to fill up).
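In a standalone TPT script, those settings would appear as Stream operator attributes roughly like this (the operator name, schema, and credentials are placeholders; Informatica exposes the same attributes through its TPT connection):

```
DEFINE OPERATOR STREAM_LOAD
TYPE STREAM
SCHEMA *
ATTRIBUTES
(
    VARCHAR TdpId        = 'mytdpid',
    VARCHAR UserName     = 'loaduser',
    VARCHAR UserPassword = 'secret',
    INTEGER MinSessions  = 1,
    INTEGER MaxSessions  = 1,   /* one short message at a time */
    INTEGER Pack         = 1    /* send each row immediately */
);
```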