Error when using DEFINE SCHEMA FROM SELECT

Tools
Tools covers the tools and utilities you use to work with Teradata and its supporting ecosystem. You'll find information on everything from the Teradata Eclipse plug-in to load/extract tools.
Highlighted
Enthusiast

Error when using DEFINE SCHEMA FROM SELECT

Hello:

 

 

I'm not sure if I should have bumped the old thread or started a new one. Here is the old thread... https://community.teradata.com/t5/Tools/Duplicate-Select-statements-in-history-using-ODBC-Operator/t...

 

What I believe is happening (and I could be wrong so I'd like to try to test it) is that DEFINE SCHEMA with columns is causing a full SELECT on the Netezza source table. We're doing something like this currently...

 

	DEFINE SCHEMA SourceTable
	(
	SUBSCRIBER_NO INTEGER,
	STREET VARCHAR(30),
	CITY VARCHAR(20),
	"STATE" CHAR(2),
	ZIP INTEGER,
	AREACODE SMALLINT,
	PHONE INTEGER
	);

 

I'd like to use the DEFINE SCHEMA FROM SELECT OF OPERATOR or... DEFINE SCHEMA FROM SELECT 'sqlSelectStatement' to test this, but I'm running into errors.

 

If I use DEFINE SCHEMA FROM SELECT OF OPERATOR I receive the following (including the operator I'm referencing):

(I'm also assuming I'd have to define two ODBC operators?....One for the schema and one for the SELECT.)

DEFINE JOB "Load To Teradata"
(
	DEFINE SCHEMA SourceTable FROM SELECT OF OPERATOR ODBCOperator;

	DEFINE OPERATOR ODBCOperator
	TYPE ODBC
	SCHEMA SourceTable
	ATTRIBUTES (
			VARCHAR SelectStmt = 'SELECT SUBSCRIBER_NO ,STREET ,CITY ,STATE ,ZIP ,AREACODE ,PHONE FROM SQL_CLASS."ADMIN".ADDRESSES WHERE 1=0;'
			,VARCHAR DSNName = 'Netezza'
			,VARCHAR UserName = 'user'
			,VARCHAR UserPassword = 'password'
	);

............

Teradata Parallel Transporter Version 16.20.00.01 64-Bit
TPT_INFRA: TPT04146: Error: Operator 'ODBCOperator', the referenced
source of the SELECT statement schema basis for defined schema 'SourceTable',
is not defined or referenced in the job script.
Job script preprocessing failed.

Job terminated with status 8.

 

If I use DEFINE SCHEMA FROM SELECT 'sqlSelectStatement' I receive the following:

 

DEFINE SCHEMA SourceTable FROM SELECT 'SELECT SUBSCRIBER_NO ,STREET ,CITY ,STATE ,ZIP ,AREACODE ,PHONE FROM SQL_CLASS."ADMIN".ADDRESSES WHERE 1=0;';


Teradata Parallel Transporter Version 16.20.00.01 64-Bit
TPT_INFRA: TPT05014: RDBMS error 3706: Syntax error: expected something between
the word 'ADMIN' and '.'.
TPT_INFRA: TPT04130: Error: Schema generation failed for SELECT statement
'SELECT SUBSCRIBER_NO ,STREET ,CITY ,STATE ,ZIP ,AREACODE ,PHONE FROM SQL_CLASS.
"ADMIN".ADDRESSES WHERE 1=0;'
in DBS 'td1510': GetSelectStmtSchema status: 48.

Job script preprocessing failed.

Job terminated with status 12.

 

I'd prefer to use the second method. Could someone help me to understand what I am doing wrong?

 

Here is a link to the full script:  define_schema_issue.zip

 

 

 

Tags (2)
4 REPLIES
Teradata Employee

Re: Error when using DEFINE SCHEMA FROM SELECT

DEFINE SCHEMA FROM SELECT always attempts to obtain the metadata from the Teradata side of the step (Producer, if both Producer and Consumer are Teradata).

 

You might consider developing a script or application that queries NZ metadata and builds the corresponding explicit DEFINE SCHEMA statement in a text file, then using the INCLUDE directive within the TPT script to use the pre-built definition.

Enthusiast

Re: Error when using DEFINE SCHEMA FROM SELECT

Thank you for your reply Fred.

 

So we have a process in place when generating scripts to create the DEFINE SCHEMA statement. It was all looking good for Netezza except.

 

The problem that we are experiencing is that two SELECT statement are being run on the source table during data movement. This additional SELECT causes a lot of additional overhead.

 

We've seen another tool which replicates data using TPT and when we look at the Netezza logs we see two SELECTs as well, however, the first is a SELECT with a LIMIT 0 condition. It was my assumption that they were creating the schema using a select statement with a LIMIT 0 condition. Of course since this only works for Teradata I don't see how that would be the case now, but I also don't know how they would be able to generate this first SELECT via the script with a LIMIT 0 condition.

 

Here's a link to the private log if that may help determine where this may be happening. I can't tell based on the log, but something may stick out to you. 

 

Private log

 

And here are the two selects happening under the same sessionid.

nz-query-history.JPG

 

 

Teradata Employee

Re: Error when using DEFINE SCHEMA FROM SELECT

Other tools that use TPTAPI for loading to Teradata will use a different connector (not TPT ODBC Operator) to fetch the data and metadata. There might also be third-party NZ ODBC drivers that behave differently - though those would not be officially supported either.

 

Looking at an operator trace or ODBC trace might show which features it's complaining about being "not implemented". But I wouldn't expect the TPT developers to make a change unless there was some really generic API-based solution (i.e. no NZ-specific SQL workarounds); nor does it seem likely IBM would be willing to enhance the driver just for TPT.

Enthusiast

Re: Error when using DEFINE SCHEMA FROM SELECT

Thank you Fred. I'll have to double-check, but it might be the difference between the Netezza ODBC driver for Windows or Linux.

 

On thing we can confirm is even when we were able to "trick" the DEFINE SCHEMA FROM SELECT statement to use the target table with a WHERE condition of 1=0 to define the source table schema we still saw two SELECT statement generated in Netezza history after sucessful completion of the script. There is something else going on and perhaps switching to the Netezza Linux ODBC driver will allow us to run the process without two SELECT statements generated.