Load Nulls with R & JDBC Fastload Problem

Connectivity
Supporter

Load Nulls with R & JDBC Fastload Problem

Hi all,

I try to load some data from R directly to Teradata 14.10 via JDBC Fastload.

The data contains some Nulls for numeric values and I have not been able to load this data so far. 

The code below is an example.

Three columns are generated. One integer, one char and one integer again.

The last column will have about 25% nulls.

If you comment out line 48 no nulls will be generated and you can see that the load per se is working.

From what I read so far I understand that nulls need to be set differently and I tried to do that within the msinsert function but obvously without success.

Is this a bug? I saw some other posts on JDBC and Nulls but I am not sure that the issue is the same.

Any support is very apprichiated.

Ulrich

library(RJDBC)
################
#def functions
################
myinsert <- function(arg1,arg2,arg3){
.jcall(ps,"V","setInt",as.integer(1),as.integer(arg1))
.jcall(ps,"V","setString",as.integer(2),arg2)
if (is.na(arg3)==TRUE) {
.jcall(ps,"V","setNull",as.integer(3),Types.INTEGER)
} else {
.jcall(ps,"V","setInt",as.integer(3),as.integer(arg3))
}
.jcall(ps,"V","addBatch")
}

MHmakeRandomString <- function(n=1, lenght=12)
{
randomString <- c(1:n) # initialize vector
for (i in 1:n)
{
randomString[i] <- paste(sample(c(0:9, letters, LETTERS),
lenght, replace=TRUE),
collapse="")
}
return(randomString)
}

################
#DB Connect
################
.jaddClassPath("/MyPath/TeraJDBC__indep_indep.14.10.00.17/terajdbc4.jar")
.jaddClassPath("/MyPath/TeraJDBC__indep_indep.14.10.00.17/tdgssconfig.jar")
drv = JDBC("com.teradata.jdbc.TeraDriver","/MyPath/TeraJDBC__indep_indep.14.10.00.17/tdgssconfig.jar","/MyPath/TeraJDBC__indep_indep.14.10.00.17/terajdbc4.jar")
conn = dbConnect(drv,"jdbc:teradata://neo/CHARSET=UTF8,LOG=ERROR,DBS_PORT=1025,TYPE=FASTLOAD,TMODE=TERA,SESSIONS=1","uli","m00rhuhn")

################
#main
################

##gen test data
dim = 10000
i = 1:dim
s = MHmakeRandomString(dim,12)
j = sample(1:10000, dim)
i1 <- j %% 4 == 0
#assign some NA
j[i1] <- NaN

## set up table
dbSendUpdate(conn,"drop table foo;")
dbSendUpdate(conn,"create table foo (a int, b varchar(100),c int);")

#set autocommit false
.jcall(conn@jc,"V","setAutoCommit",FALSE)
##prepare
ps = .jcall(conn@jc,"Ljava/sql/PreparedStatement;","prepareStatement","insert into foo values(?,?,?)")

#start time
ptm <- proc.time()

## batch insert
for(n in 1:dim){
myinsert(i[[n]],s[[n]],j[[n]])
}
#run time
proc.time() - ptm

#apply & commit
.jcall(ps,"[I","executeBatch")
dbCommit(conn)
.jcall(ps,"V","close")
.jcall(conn@jc,"V","setAutoCommit",TRUE)

#get some sample results
dbGetQuery(conn,"select top 100 * from foo")
dbGetQuery(conn,"select count(*) from foo")

#disconnect
dbDisconnect(conn)
Tags (1)
9 REPLIES
Teradata Employee

Re: Load Nulls with R & JDBC Fastload Problem

The "myinsert" function tests for NA with this line:

if (is.na(arg3)==TRUE) {

The is.na test will only return TRUE for NA, and will return false for NaN.

On line 48 you assign NaN (which is floating-point "Not a Number") to the array item:

j[i1] <- NaN

But I suspect that you actually want to assign NA (the R equivalent to null) to the array item:

j[i1] <- NA

Supporter

Re: Load Nulls with R & JDBC Fastload Problem

Hi, 

no thats not the issue. 

is.na(x) checks for NA and NaN. 

You can change j[i1] <- NA and the error is the same.

The message is

"Fehler in .jcall(ps, "V", "setNull", as.integer(3), Types.INTEGER) : 

  Objekt 'Types.INTEGER' nicht gefunden"

which would be "Error in ... Objekt 'Types.INTEGER' no found"

So the if had been evaluated as it should...

Ulrich

Teradata Employee

Re: Load Nulls with R & JDBC Fastload Problem

Well, I'm not an R programmer, so sorry that the j[i1] <- NA didn't fix the problem for you.

Types.INTEGER is a Java constant value equal to 4. Perhaps R does not provide access to the Types.XXX constants? Try specifying 4 instead of Types.INTEGER.

Supporter

Re: Load Nulls with R & JDBC Fastload Problem

Unfortunatly it does not help.

Error message is now:

Fehler in .jcall(ps, "V", "setNull", as.integer(3), 4) :

  method setNull with signature (ID)V not found

Supporter

Re: Load Nulls with R & JDBC Fastload Problem

P.S. I am not a Java programmer.

Does null loads with JDBC Fastload work in a pure Java implementation?

So is this clearly a R problem?

Teradata Employee

Re: Load Nulls with R & JDBC Fastload Problem

Yes, JDBC FastLoad supports NULLs.

The signature (ID)V means that R is attempting to call a method that has int and double arguments, and returns void. Such a method doesn't exist; hence the error. In other words, the value 4 is treated by R as a double value. Instead, it needs to be an int value.

Please try:

.jcall(ps, "V", "setNull", as.integer(3), as.integer(4))

Supporter

Re: Load Nulls with R & JDBC Fastload Problem

Yes,yes,yes! That works!

Thanks a lot!

As I have to deal with different data types. Do you know where I can find the constanst for other data types?

e.g. Types.String etc.?

I didn't find them so far.

Supporter

Re: Load Nulls with R & JDBC Fastload Problem

Ok found it myself:

http://www.docjar.com/html/api/java/sql/Types.java.html

But thanks again to tomnolan

Teradata Employee

Re: Load Nulls with R & JDBC Fastload Problem

I'm glad you were able to find the constant values for java.sql.Types.

For reference, here is a link to the official Java documentation from Oracle:

http://docs.oracle.com/javase/7/docs/api/constant-values.html#java.sql.Types.ARRAY