Teradata JDBC Driver returns the wrong schema / column nullability

Connectivity
Enthusiast

Teradata JDBC Driver returns the wrong schema / column nullability

I'm running an Apache Spark application that uses terajdbc4 to fetch data from a Teradata server.  It seems to be returning the wrong value for the nullability for many of the columns in the table.  I've found two scenarios at work:

1. If I use dbtable = "TABLE_NAME" all columns are returned with nullable = true (despite some not being nullable)

2. If I use dbtable = "(SELECT * FROM TABLE_NAME) as t" all columns are returned with nullable = false (despite some being nullable)

Is this a known issue with either the Teradata driver or the server?

Tags (1)
23 REPLIES
Teradata Employee

Re: Teradata JDBC Driver returns the wrong schema / column nullability

Which JDBC API method does your application call in order to obtain the column nullability information?

If your application is calling the ResultSetMetaData.isNullable method, for that method (and all the other ResultSetMetaData methods), the Teradata JDBC Driver returns information to the application that is provided by the Teradata Database.

You are observing a difference in metadata when referring to a table name directly versus referring to a derived table subselect expression. The Teradata Database does treat expressions differently from direct table/column references with respect to metadata such as nullability.

You can specify the LOG=DEBUG connection parameter and the Teradata JDBC Driver will print to System.out all the details of the information received from the Teradata Database.

Enthusiast

Re: Teradata JDBC Driver returns the wrong schema / column nullability

The call is made in Apache Spark's infrastructure, and it uses ResultSetMetaData.isNullable as you expected:

https://github.com/apache/spark/blob/v2.0.0/sql/core/src/main/scala/org/apache/spark/sql/execution/d...

val nullable = rsmd.isNullable(i + 1) != ResultSetMetaData.columnNoNulls

I'll try LOG=DEBUG to see if I get anything useful.

Teradata Employee

Re: Teradata JDBC Driver returns the wrong schema / column nullability

That logic in Apache Spark is incomplete. The ResultSetMetaData.isNullable method can return 3 possible values:

columnNoNulls indicates that a column does not allow NULL values.

columnNullable indicates that a column allows NULL values.

columnNullableUnknown indicates that the nullability of a column's values is unknown.

As I mentioned earlier, the Teradata Database does treat expressions differently from direct table/column references with respect to metadata such as nullability. In particular, for expressions, the Teradata Database is likely to indicate that the nullability is unknown, which in turn means that the Teradata JDBC Driver will return columnNullableUnknown from the ResultSetMetaData.isNullable method.

The logic in Apache Spark is misleading. The "val nullable" is false when the column is definitely known to disallow NULLs, but the "val nullable" is true when the column permits NULLs and also when the column's nullability is unknown (such as if it's an expression).

Enthusiast

Re: Teradata JDBC Driver returns the wrong schema / column nullability

True, it is reducing a three-value to a boolean by mapping the unknown to nullable = true, and this might explain case #1.

But I can see an actual problem in case #2 where every column was returning nullable = false, which should only happen if ResultSetMetaData.isNullable is returning ResultSetMetaData.columnNoNulls - which just isn't correct (and the column actual contains some null values).

I'll have some time in a few days to put together a simple example with the debug logging to confirm exactly what's happening.

Enthusiast

Re: Teradata JDBC Driver returns the wrong schema / column nullability

I've put together an example outside of the Spark application which has confirmed my suspicion above (using 15.10.00.22):

val driver = new com.teradata.jdbc.TeraDriver()
val url = "jdbc:teradata://..."
val properties = new java.util.Properties()
val conn = driver.connect(url, properties)
val table = "DIM_OFFER" // contains a non-nullable field called OFFER_NBR and a nullable field called OFFER_DESC

val statement = conn.prepareStatement(s"SELECT OFFER_NBR, OFFER_DESC FROM $table WHERE 1=0")
val rs = statement.executeQuery()
val rsmd = rs.getMetaData
rsmd.isNullable(1) // 2 (i.e. ResultSetMetaData.columnNullableUnknown)
rsmd.isNullable(2) // 2

val statement = conn.prepareStatement(s"SELECT OFFER_NBR, OFFER_DESC FROM (SELECT OFFER_NBR, OFFER_DESC FROM $table) as t WHERE 1=0")
val rs = statement.executeQuery()
val rsmd = rs.getMetaData
rsmd.isNullable(1) // 0 (i.e. ResultSetMetaData.columnNoNulls)
rsmd.isNullable(2) // 0

Maybe it's related to sub-queries messing up the nullability?  Is this a known limitation?

Teradata Employee

Re: Teradata JDBC Driver returns the wrong schema / column nullability

>>> Maybe it's related to sub-queries messing up the nullability?  Is this a known limitation?

In short, Yes.

Your subquery is used as a derived table. Here is an excerpt from the Teradata Database design notes for result set metadata:

For a column reference from a set operation (e.g. UNION), derived table, recursive WITH, or non-recursive WITH, the "Base Column name" field will be the Result Column Name, and Database Name and Table Name won’t be provided.  The presence of Database Name and Table Name indicate the presence of “Base Column Name”. When Database Name and Table Name aren’t present (indicating that the SQL statement caused the base table information to be either ambiguous or “lost”), then Base Column Name is simply Result Column Name. 

This design note indicates that the Teradata Database cannot provide accurate metadata for the schema name and table name (rsmd.getSchemaName and rsmd.getTableName) for a derived table's result set columns. The design note doesn't metion nullability, but there appears to be a similar limitation.

I did the following test...

create volatile table tomtest(c1 integer, c2 integer not null) on commit preserve rows

followed by

select c1, c2 from tomtest

and also

select c1, c2 from (select c1, c2 from tomtest) as dt

I enabled the Teradata JDBC Driver's debug logging with the LOG=DEBUG connection parameter. The first query, with direct table column references, produces the following debug log output. The Teradata Database indicates column c1 isNullable=Y and column c2 isNullable=N, both as expected per their column definitions in the create table command. The Teradata JDBC Driver returns rsmd.isNullable(1) = 1 (columnNullable) and rsmd.isNullable(2) = 0 (columnNoNulls), both as expected.

*** StatementInfoParcel ***

*** Begin List with 0 items ***

*** End of List ***

*** Begin List with 2 items ***

com.teradata.jdbc.jdbc_4.parcel.FullContentMetadataItem
-- Header --
itemLayoutType  : 1
itemContentType : 2
itemLength      : 91
receiveItemNumber        : 1
transmitItemNumber       : 1
-- Body --
dataType                 : 497
maxDataLengthInBytes     : 4
totalNumberOfDigits      : 10
numberOfIntervalDigits   : 0
numberOfFractionalDigits : 0
databaseName             : guest
procedureOrTableName     : tomtest
columnOrParameterName    : c1
columnPositionInTable    : 1
asClauseName             :
title                    : c1
format                   : -(10)9
defaultValue             :
isIdentityColumn         : N
isDefinitelyWritable     : Y
isNullable               : Y
mayBeNull                : Y
isSearchable             : Y
isWritable               : Y
udtIndicator             : 0
udtTypeName              :
dataTypeMiscInfo         :
charsetCode              : 0
maxNumberOfCharacters    : 0
isCaseSensitive          : U
isSigned                 : Y
isKeyColumn              : N
isUnique                 : N
isExpression             : N
isSortable               : Y
SPParameterDirection     : U
structDepth              : 0
isTemporal               : 0
attributeName            :
serverDataType           : 0
arrayNumberOfDimensions  : 0
arrayMaxCardinalities    :
arrayLowerBounds         :

com.teradata.jdbc.jdbc_4.parcel.FullContentMetadataItem
-- Header --
itemLayoutType  : 1
itemContentType : 2
itemLength      : 91
receiveItemNumber        : 2
transmitItemNumber       : 2
-- Body --
dataType                 : 496
maxDataLengthInBytes     : 4
totalNumberOfDigits      : 10
numberOfIntervalDigits   : 0
numberOfFractionalDigits : 0
databaseName             : guest
procedureOrTableName     : tomtest
columnOrParameterName    : c2
columnPositionInTable    : 2
asClauseName             :
title                    : c2
format                   : -(10)9
defaultValue             :
isIdentityColumn         : N
isDefinitelyWritable     : Y
isNullable               : N
mayBeNull                : N
isSearchable             : Y
isWritable               : Y
udtIndicator             : 0
udtTypeName              :
dataTypeMiscInfo         :
charsetCode              : 0
maxNumberOfCharacters    : 0
isCaseSensitive          : U
isSigned                 : Y
isKeyColumn              : N
isUnique                 : N
isExpression             : N
isSortable               : Y
SPParameterDirection     : U
structDepth              : 0
isTemporal               : 0
attributeName            :
serverDataType           : 0
arrayNumberOfDimensions  : 0
arrayMaxCardinalities    :
arrayLowerBounds         :

The second query, with a subquery and derived table column references, produces the following debug log output. In this situation, the Teradata Database indicates column c1 isNullable=N and column c2 isNullable=N. The nullability is not accurate for column c2. The Teradata JDBC Driver returns rsmd.isNullable(1) = 0 (columnNoNulls) and rsmd.isNullable(2) = 0 (columnNoNulls). While the nullability is not accurate for column c2, the Teradata JDBC Driver is dutifully reporting the value it obtained from the Teradata Database.

*** StatementInfoParcel ***

*** Begin List with 0 items ***

*** End of List ***

*** Begin List with 2 items ***

com.teradata.jdbc.jdbc_4.parcel.FullContentMetadataItem
-- Header --
itemLayoutType  : 1
itemContentType : 2
itemLength      : 79
receiveItemNumber        : 1
transmitItemNumber       : 1
-- Body --
dataType                 : 497
maxDataLengthInBytes     : 4
totalNumberOfDigits      : 10
numberOfIntervalDigits   : 0
numberOfFractionalDigits : 0
databaseName             :
procedureOrTableName     :
columnOrParameterName    : c1
columnPositionInTable    : 0
asClauseName             :
title                    : c1
format                   : -(10)9
defaultValue             :
isIdentityColumn         : N
isDefinitelyWritable     : N
isNullable               : N
mayBeNull                : Y
isSearchable             : Y
isWritable               : N
udtIndicator             : 0
udtTypeName              :
dataTypeMiscInfo         :
charsetCode              : 0
maxNumberOfCharacters    : 0
isCaseSensitive          : U
isSigned                 : Y
isKeyColumn              : N
isUnique                 : N
isExpression             : N
isSortable               : Y
SPParameterDirection     : U
structDepth              : 0
isTemporal               : 0
attributeName            :
serverDataType           : 0
arrayNumberOfDimensions  : 0
arrayMaxCardinalities    :
arrayLowerBounds         :

com.teradata.jdbc.jdbc_4.parcel.FullContentMetadataItem
-- Header --
itemLayoutType  : 1
itemContentType : 2
itemLength      : 79
receiveItemNumber        : 2
transmitItemNumber       : 2
-- Body --
dataType                 : 496
maxDataLengthInBytes     : 4
totalNumberOfDigits      : 10
numberOfIntervalDigits   : 0
numberOfFractionalDigits : 0
databaseName             :
procedureOrTableName     :
columnOrParameterName    : c2
columnPositionInTable    : 0
asClauseName             :
title                    : c2
format                   : -(10)9
defaultValue             :
isIdentityColumn         : N
isDefinitelyWritable     : N
isNullable               : N
mayBeNull                : N
isSearchable             : Y
isWritable               : N
udtIndicator             : 0
udtTypeName              :
dataTypeMiscInfo         :
charsetCode              : 0
maxNumberOfCharacters    : 0
isCaseSensitive          : U
isSigned                 : Y
isKeyColumn              : N
isUnique                 : N
isExpression             : N
isSortable               : Y
SPParameterDirection     : U
structDepth              : 0
isTemporal               : 0
attributeName            :
serverDataType           : 0
arrayNumberOfDimensions  : 0
arrayMaxCardinalities    :
arrayLowerBounds         :

The design note says that the database name and table name are not provided for a subquery and derived table column references. We see that behavior in the debug log output examples above. The debug log shows that the "databaseName" and "procedureOrTableName" field values are blank. This corresponds to an empty zero-length string returned from the rsmd.getSchemaName and rsmd.getTableName methods.

To summarize, this is a limitation of the Teradata Database. Your application must be aware that when the rsmd.getSchemaName and rsmd.getTableName methods return an empty zero-length string, then the other metadata values may not be completely accurate.

Teradata Employee

Re: Teradata JDBC Driver returns the wrong schema / column nullability

Correction: In my post above, where I said that the nullability for column c2 is inaccurate -- it's actually the nullability for column c1 that is inaccurate.

Column c1 doesn't have a NOT NULL clause, so column c1 is nullable, therefore, the Teradata Database should indicate that column c1 isNullable=Y, but instead the Teradata Database indicates that column c1 isNullable=N, which is not correct.

Enthusiast

Re: Teradata JDBC Driver returns the wrong schema / column nullability

I really appreciate the effort you've put in.  Looks like this is defintely an a issue in the database server itself (not the driver/connectivity).

Enthusiast

Re: Teradata JDBC Driver returns the wrong schema / column nullability

Out of interest, what is the logic behind the property "mayBeNull" on FullContentMetadataItem?  I'm noticing that it differs from the isNullable property in the second query for the first column:

isNullable               : N

mayBeNull                : Y