We use an ETL tool from a vendor I wish to remain anonymous (starts with an S and ends with a P) :) This vendor has chosen to execute TPTLOAD via tbuild instead of the TPTAPI, and there is no option in the GUI to execute via the TPTAPI. We are in the process of upgrading this tool, and we are starting to experience random named pipe failures that may be related to checkpointing.
When a TPTLOAD job fails, it creates a checkpoint file in the checkpoint directory on the server so that the user can restart the job from the point of failure. This is all very logical and straightforward in my mind. Where things start to fall apart is when a "named pipes" TPTLOAD is executed via tbuild. By definition, a named pipe job cannot be restarted via a checkpoint file; that feature is reserved for physical files being read from disk.
If a TPTLOAD "named pipes" job fails, why would the Data Connector look for a checkpoint file in the checkpoint directory on the server and then fail if it is not found? Even if the checkpoint file were found, it could not be used as part of a "named pipes" restart process anyway... correct?
If a TPTLOAD "named pipes" job fails, we normally just clean up the error tables and restart the job from the beginning. We can do this because we do not have very tight SLAs or very large amounts of data. Sometimes (not sure what triggers this), when we try to restart, the job fails again because it found a checkpoint file associated with a prior run failure. A couple of specific questions that I have are:
1. Are checkpoint files always created for TPTLOAD "named pipe" jobs even though they will never be used? If so, why?
2. When a TPTLOAD "named pipe" job fails, are the checkpoint files always created and left in the checkpoint directory on the server? If so, do these files always need to be deleted before the "named pipe" job can be re-submitted?
3. Any idea why a GUI-generated tbuild "named pipes" job would fail, be re-submitted, and then fail again because a checkpoint file exists?
I am really trying to understand the relationship between "named pipe" jobs and checkpoint files since they seem to be mutually exclusive.
I am not quite sure what you mean by TPTLOAD "named pipe" jobs.
I am assuming you mean that the job is sending the data through named pipes to the Load operator.
Since you also mentioned the Data Connector operator, the job is doing one of two things: it is either trying to open/manage named pipes natively, or it is using the Named Pipe Access Module (NPAM).
If you are not using the Named Pipe Access Module, then you should. We do not recommend using native named pipes with TPT. They are very hard to manage and they do not support restarts.
The NPAM supports restarts because it does create its own checkpoint (also called fallback) file to store data that it sends to the DC operator.
In any event, the DC operator will always create a checkpoint file because it has information of its own to keep track of while a job is running. The user should not need to be concerned with any temp files created under the covers.
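For the case described in the question, where a job is always rerun from the beginning and a leftover checkpoint file from a failed run gets in the way, a pre-submit cleanup step might look like the minimal Python sketch below. It rests on assumptions not confirmed in this thread: that the checkpoint files for a job sit directly in one checkpoint directory and that their file names begin with the tbuild job name. The function names and the example paths are hypothetical. If the TPT installation provides the twbrmcp command for removing a job's checkpoint files, that is probably the better tool; this sketch only illustrates the idea, and it defaults to a dry run so nothing is deleted until the matches have been reviewed.

```python
import glob
import os
import tempfile


def find_stale_checkpoints(checkpoint_dir, job_name):
    """List files in checkpoint_dir whose names start with job_name.

    ASSUMPTION: checkpoint file names begin with the job name. A plain
    prefix match also catches other jobs whose names share that prefix
    (e.g. "load_a" matches "load_a2..."), so review the result first.
    """
    pattern = os.path.join(glob.escape(checkpoint_dir),
                           glob.escape(job_name) + "*")
    return sorted(p for p in glob.glob(pattern) if os.path.isfile(p))


def clear_stale_checkpoints(checkpoint_dir, job_name, dry_run=True):
    """Return the matching files; delete them only when dry_run is False."""
    stale = find_stale_checkpoints(checkpoint_dir, job_name)
    if not dry_run:
        for path in stale:
            os.remove(path)
    return stale


if __name__ == "__main__":
    # Demonstration against a throwaway directory with made-up file names.
    with tempfile.TemporaryDirectory() as demo_dir:
        for name in ("nightly_loadLVCP", "nightly_loadCPD1", "other_jobLVCP"):
            open(os.path.join(demo_dir, name), "w").close()
        for path in clear_stale_checkpoints(demo_dir, "nightly_load"):
            print("would remove:", os.path.basename(path))
```

This only makes sense for jobs that are never meant to resume from a checkpoint. For a job that should be restartable, for example one reading through the NPAM, the checkpoint files have to be left in place.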
As always, thanks Steve! We had a contracting firm set up the Data Services configuration, and they chose generic named pipes over NPAM for some reason. I will try to get our ETL team to switch it over.