Teradata JDBC Driver returns the wrong schema / column nullability

Connectivity
Enthusiast

Re: Teradata JDBC Driver 16.10 returns the wrong schema / column nullability

Here is the Spark code:

val ss: SparkSession = SparkSession.builder()
      .master("local")
      .appName("test")
      .enableHiveSupport()
      .getOrCreate()

    val jdbcDriver = "com.teradata.jdbc.TeraDriver"
    Class.forName(jdbcDriver)
    val jdbcReader: DataFrameReader = ss.read.option("driver", jdbcDriver)

    val connectionUrl: String = "jdbc:teradata://tddev.company.com/DBS_PORT=1025,DATABASE=DBC,CHARSET=UTF16,MAYBENULL=ON"
    val properties: Properties = new Properties()
    properties.put("user", "ETL_USER")
    properties.put("password", "***")

    val table: DataFrame = jdbcReader.jdbc(connectionUrl, "(select * from dbc.tables) as T", properties)
    table.printSchema()

Output of "table.printSchema"

root
 |-- DatabaseName: string (nullable = false)
 |-- TableName: string (nullable = false)
 |-- Version: integer (nullable = false)
 |-- TableKind: string (nullable = false)
 |-- ProtectionType: string (nullable = false)
 |-- JournalFlag: string (nullable = false)
 |-- CreatorName: string (nullable = true)
 |-- RequestText: string (nullable = true)
 |-- CommentString: string (nullable = true)
 |-- ParentCount: integer (nullable = false)
 |-- ChildCount: integer (nullable = false)
 |-- NamedTblCheckCount: integer (nullable = false)
 |-- UnnamedTblCheckExist: string (nullable = false)
 |-- PrimaryKeyIndexId: integer (nullable = true)
 |-- RepStatus: string (nullable = true)
 |-- CreateTimeStamp: timestamp (nullable = true)
 |-- LastAlterName: string (nullable = false)
 |-- LastAlterTimeStamp: timestamp (nullable = true)
 |-- RequestTxtOverflow: string (nullable = true)
 |-- AccessCount: long (nullable = true)
 |-- LastAccessTimeStamp: timestamp (nullable = true)
 |-- UtilVersion: integer (nullable = true)
 |-- QueueFlag: string (nullable = false)
 |-- CommitOpt: string (nullable = false)
 |-- TransLog: string (nullable = false)
 |-- CheckOpt: string (nullable = false)
 |-- TemporalProperty: string (nullable = true)
 |-- ResolvedCurrent_Date: date (nullable = true)
 |-- ResolvedCurrent_Timestamp: timestamp (nullable = true)
 |-- SystemDefinedJI: string (nullable = true)
 |-- VTQualifier: string (nullable = true)
 |-- TTQualifier: string (nullable = true)

TD_DBC_Tables_Schema_in_Spark_via_JDBC1610.png

Teradata Employee

Re: Teradata JDBC Driver 16.10 returns the wrong schema / column nullability

You are executing the query select * from dbc.tables

 

The query result set's column values have no relationship whatsoever to the Teradata JDBC Driver's MAYBENULL connection parameter.

 

If you believe that the results from the query select * from dbc.tables are incorrect, this has nothing to do with the Teradata JDBC Driver. This is an issue with the Teradata Database, and you should open an incident with Teradata Customer Service.

Enthusiast

Re: Teradata JDBC Driver 16.10 returns the wrong schema / column nullability

If I specify each column names, then it will be fine?

SELECT DatabaseName, TableName, CreatorName, AlterName FROM DBC.Tables?

The query result is fine. But JDBC driver does not report the correct nullability to Spark. Particularly, for a column which can contain NULL, we observe that Teradata JDBC reports it as NOT NULLABLE. Once Spark iterates the records, it will fail. Ideally, if Teradata JDBC Driver can blindly report all columns in the result as NULLABLE, that will be at least safe.

 

This NULLABILITY check is mandatory in Spark 2.+ 

Teradata Employee

Re: Teradata JDBC Driver 16.10 returns the wrong schema / column nullability


ericsun2 wrote:

if Teradata JDBC Driver can blindly report all columns in the result as NULLABLE 


No, the Teradata JDBC Driver does not offer a feature to blindly report all columns in a result set as nullable, and it is unlikely that we would ever offer such a feature.

 

The job of the Teradata JDBC Driver is to accurately report to the application the result set metadata that it receives from the Teradata Database.

 

If you believe that the Teradata Database is providing incorrect result set metadata, then you should open an incident with Teradata Customer Service.

Enthusiast

Re: Teradata JDBC Driver 16.10 returns the wrong schema / column nullability

 Before we apply MAYBENULL=ON with the latest 16.10 JDBC driver, the described failure is 100%. So we believe the latest JDBC driver is definitely better than before, but it may still have glitch (probably for CHAR(n) columns).

 

The support ticket is RECHEEECT

 

The sample Scala code can reproduce the sympton easily. Can you please help?  

Enthusiast

Re: Teradata JDBC Driver 16.10 returns the wrong schema / column nullability

Hi @tomnolan

 

Happy New Year. Did you get a chance to review the support ticket?

 

Thank you.

Teradata Employee

Re: Teradata JDBC Driver 16.10 returns the wrong schema / column nullability

Hi @ericsun2,

 

Yes, I have read the incident. Based on the information provided, the Teradata Database and the Teradata JDBC Driver are operating as intended.

 

The incident raised a concern about a behavior difference between the HELP COLUMN command output versus the metadata parcels composed by the Teradata Database. At the present time, there is a behavior difference between those features.

 

A Teradata Database RFC was created (DR 188540) requesting that the HELP COLUMN command output be changed to match the information contained in the metadata parcels. I do not have any information regarding whether or when that Teradata Database RFC might be worked on.

 

Regards,

Tom