A wider test case on R JDBC fastload

Blog
The best minds from Teradata, our partners, and customers blog about relevant topics and features.
Senior Supporter
Hi,

this is an extension of the 


blog. 

The initial idea and code example comes from amarek!

For what ever reason I was not able to post the code below with an comment. There seems to be an issue with the developer.teradata.com page.

Modify dim to generate more or less data. 

1 mio rows had been loaded in 5 min in my environment. Apply might be faster but didn't got this working right away.

Next challange would be to make this more generic in a way to be able to create a TD table for a given data.frame and load the data.frame afterwards ;-).

library(RJDBC)
################
#def functions
################
myinsert <- function(arg1,arg2){
.jcall(ps,"V","setInt",as.integer(1),as.integer(arg1))
.jcall(ps,"V","setString",as.integer(2),arg2)
.jcall(ps,"V","addBatch")
}

MHmakeRandomString <- function(n=1, lenght=12)
{
randomString <- c(1:n) # initialize vector
for (i in 1:n)
{
randomString[i] <- paste(sample(c(0:9, letters, LETTERS),
lenght, replace=TRUE),
collapse="")
}
return(randomString)
}

################
#DB Connect
################
.jaddClassPath("/MyPath/terajdbc4.jar")
.jaddClassPath("/MyPath/tdgssconfig.jar")
drv = JDBC("com.teradata.jdbc.TeraDriver","/MyPath/tdgssconfig.jar","/MyPath/terajdbc4.jar")
conn = dbConnect(drv,"jdbc:teradata://MyServer/CHARSET=UTF8,LOG=ERROR,DBS_PORT=1025,TYPE=FASTLOAD,TMODE=TERA,SESSIONS=1","user","password")

################
#main
################

##gen test data
dim = 1000000
i = 1:dim
s = MHmakeRandomString(dim,12)

## set up table
dbSendUpdate(conn,"drop table foo;")
dbSendUpdate(conn,"create table foo (a int, b varchar(100));")

#set autocommit false
.jcall(conn@jc,"V","setAutoCommit",FALSE)
##prepare
ps = .jcall(conn@jc,"Ljava/sql/PreparedStatement;","prepareStatement","insert into foo values(?,?)")

#start time
ptm <- proc.time()

## batch insert
for(n in 1:dim){
myinsert(i[[n]],s[[n]])
}
#run time
proc.time() - ptm

#apply & commit
.jcall(ps,"[I","executeBatch")
dbCommit(conn)
.jcall(ps,"V","close")
.jcall(conn@jc,"V","setAutoCommit",TRUE)

#get some sample results
dbGetQuery(conn,"select top 100 * from foo")
dbGetQuery(conn,"select count(*) from foo")

#disconnect
dbDisconnect(conn)

5 Comments
Not applicable

Hi, this is great!  Do you know if it's possible to pass a vector to 

.jcall(ps,"V","setInt",as.integer(1),as.integer(arg1)

that way we would be able to construct the calls programatically to export data frames

Senior Supporter

Hi, didn't tried it so far.

Have you in the mean time?

Maybe I find next week some time to check this...

Ulrich

Not applicable

Sorry, I'm not too familiar with the jcall command. In this scenario, what does 'as.integer(1)' do for

 .jcall(ps,"V","setInt",as.integer(1),as.integer(arg1))

Is it naming that column?

Enthusiast

Hi Ulrich, great post. 2 questions: 

 

  1. Were you able to do this? "Next challange would be to make this more generic in a way to be able to create a TD table for a given data.frame and load the data.frame afterwards" if yes I would be interested in receiving a copy of the module if possible.
  2. I'm on a mac with R studio trying to load large amount of records in teradata via a data frame. I'm going to try your above example and modify accordingly to load my table. do you need to install teradata fastload or tpt separately on your mac to make the above example work? or just by referring the 2 jar files: terajdbc4.jar and tdgssconfig.jar, you were able to run fastload?

-Suhail

Teradata Employee

@suhailmemon84 No, you don't need to install Teradata FastLoad or TPT separately in order to use JDBC FastLoad. Please refer to the Teradata JDBC Driver User Guide for more information on using JDBC FastLoad:

http://developer.teradata.com/doc/connectivity/jdbc/reference/current/jdbcug_chapter_2.html#BABFGFAF