Speed up your JDBC/ODBC applications

Teradata Employee

The Teradata JDBC Driver and ODBC Driver allow developers to quickly build applications that interact with the Teradata Database. However, many developers are surprised when their fully functioning application suddenly hits a performance roadblock once it is deployed to production, and in many of these cases the blame is unfairly placed on the JDBC and ODBC drivers. This article highlights the programming techniques available to maximize performance when interacting with the database, and helps developers choose the right implementation.

Quick and Easy but Slowest Performance

Many new database developers are more focused on how to create a database connection and pass a SQL statement than they are with performance. A typical first implementation looks something like:

Connection conn = DriverManager.getConnection(url, username, password);
Statement stmt = conn.createStatement();
// Note the single quotes needed around the character values; concatenating values
// like this is error-prone and also exposes the application to SQL injection.
String sql = "insert into Transactions(custID, transaction_date, amount, desc) values(" + custID + ", '" + tran_date + "', " + amount + ", '" + desc + "')";

stmt.executeUpdate(sql);

stmt.close(); // Your real code should use try-finally blocks to manage resources.
conn.close(); // Let's not even get into connection pools! That's another article.

Sure, this works for a demo, and the beginning programmer is probably pretty happy with the results. But turn on some production volume and this quickly becomes a performance bottleneck, especially when your application is processing many SQL inserts, such as when batch loading. This type of database coding is pretty much like driving your sports car while staying stuck in first gear!

Drivers Prepare Your Statements

A much better approach is to use Prepared Statements. These provide significantly better performance by first sending the database the outline of the SQL statement, with question-mark parameter markers in place of the actual data. The database prepares the execution steps of the SQL statement once, and the prepared statement can then be reused over and over again with different values. This avoids recalculating the execution steps for each individual request, which is what happens in the first example.

// These are done once …
Connection conn = DriverManager.getConnection(url, username, password);
String sql = "insert into Transactions(custID, transaction_date, amount, desc) values(?,?,?,?)";

PreparedStatement ps = conn.prepareStatement(sql);

// … and these can be repeated many times with different values.
ps.setInt(1, custID);
ps.setDate(2, tran_date);
ps.setBigDecimal(3, amount);
ps.setString(4, desc);

ps.executeUpdate();

Batch Ready

Prepared Statement batches take your performance to the next level. In addition to the benefits of reusing the Prepared Statement, batching your input values also reduces the number of round trips to the database. A batch size of roughly 5,000 to 10,000 works well for most applications. Using batches can be 10 to 40 times faster than the previous approach.

// These are done once.
Connection conn = DriverManager.getConnection(url, username, password);
String sql = "insert into Transactions(custID, transaction_date, amount, desc) values(?,?,?,?)";
PreparedStatement ps = conn.prepareStatement(sql);

for ( /* Loop through input values */ )
{
    for ( /* Loop through a subset of the input values - the desired batch size */ )
    {
        ps.setInt(1, custID);
        ps.setDate(2, tran_date);
        ps.setBigDecimal(3, amount);
        ps.setString(4, desc);
        ps.addBatch(); // adds the row of input values to the batch
    }

    // This is done once per the desired batch size.
    ps.executeBatch(); // sends all the batched rows to the database
}

Full Speed Ahead

For loading truly huge amounts of data, JDBC FastLoad can provide even better performance. There are a couple of caveats, however. JDBC FastLoad can only insert data into an empty table, and JDBC FastLoad is only recommended for loading large amounts of data -- at least 100,000 rows total.

The nice thing is that your Java code doesn't need to change in order to use JDBC FastLoad. Your application uses the exact same Prepared Statement batches as in the previous example. Just add TYPE=FASTLOAD to your connection parameters, and the Teradata JDBC Driver will use JDBC FastLoad for particular SQL requests, if it can.

Note that the recommended batch size for JDBC FastLoad is much higher than for a regular SQL Prepared Statement batch, which means you may need to increase your JVM heap size. To get top-notch performance, you need to use a batch size of roughly 50,000 to 100,000. Using JDBC FastLoad can be 3 to 10 times faster than the previous approach.
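As a small illustration, here is what a JDBC FastLoad connection URL could look like. This is a sketch only: the host name and database name are hypothetical placeholders, while TYPE=FASTLOAD is the actual connection parameter described above.

```java
public class FastLoadUrl
{
    // dbs.example.com and mydb are hypothetical placeholders; substitute your own.
    static String buildUrl()
    {
        return "jdbc:teradata://dbs.example.com/DATABASE=mydb,TYPE=FASTLOAD";
    }

    public static void main(String[] args)
    {
        // TYPE=FASTLOAD tells the Teradata JDBC Driver to use JDBC FastLoad
        // for eligible SQL requests (large PreparedStatement batches inserting
        // into an empty table); otherwise it falls back to regular SQL.
        System.out.println(buildUrl());
    }
}
```

Everything else, including the Prepared Statement batch code, stays exactly the same as in the previous example.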

Conclusion

JDBC and ODBC allow Java and C/C++ programmers to easily build database applications with Teradata. Maximizing performance throughput should always be on your mind while you're coding. It's much better to avoid these types of issues early than to deal with a fire drill after your new application has already been rolled out to production. I hope that this quick tutorial has given you a good overview of the different coding choices and their performance implications when interacting with Teradata.

Supporter

Re: Speed up your JDBC/ODBC applications

Hi, in "Drivers Prepare Your Statements" you mention that "the prepared statement can then be used over and over again". Will this use the statement caching feature of the PE, or will it always use the same plan, independent of what is going on on the system?
Teradata Employee

Re: Speed up your JDBC/ODBC applications

Yes, the Teradata Database's statement cache will typically be used.
Enthusiast

Re: Speed up your JDBC/ODBC applications

Can a statement like this be useful for Prepared Statement?
sel * ..... WHERE cast(CollectTimeStamp as timestamp) > cast('?');

Re: Speed up your JDBC/ODBC applications

Thanks for the nice article. Just like to clarify some confusion:

As per Teradata JDBC Driver User guide
"Highest JDBC FastLoad PreparedStatement batch using question-mark parameters, with the recommended batch size. A batch size of roughly 500 to 1000 works well for most applications."
Still you say:
"To get top-notch performance, you need to use a batch size of roughly 50,000 to 100,000. "

Now, what is the recommended approach, or am I just missing some point here completely?

Should the JDBC Fastload work if I insert small (10K or so rows) batches to make complete 100K lines and just wrap it inside a transaction?
Teradata Employee

Re: Speed up your JDBC/ODBC applications

>>> Can a statement like this be useful for Prepared Statement? sel * ..... WHERE cast(CollectTimeStamp as timestamp) > cast('?')

That's not the right syntax.

sel * ..... WHERE cast(CollectTimeStamp as timestamp) > cast(? as timestamp)

Please note that the cast(? as timestamp) is not needed if your Java application binds a java.sql.Timestamp object to the question-mark parameter marker using the PreparedStatement.setTimestamp method, or the PreparedStatement.setObject method.
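To illustrate that binding: with a java.sql.Timestamp bound via setTimestamp, no cast is needed in the SQL text. This is a sketch only; the connection and the table/column names (MyTable, CollectTimeStamp) are taken from the question, and the live-connection lines are shown as comments since they require a database.

```java
import java.sql.Timestamp;

public class TimestampBindDemo
{
    public static void main(String[] args)
    {
        // Build a java.sql.Timestamp from a JDBC timestamp escape string.
        Timestamp ts = Timestamp.valueOf("2013-06-01 12:00:00");

        // With a live connection, the binding would look like this
        // (no cast(? as timestamp) needed in the SQL text):
        //   PreparedStatement ps = conn.prepareStatement(
        //       "select * from MyTable where CollectTimeStamp > ?");
        //   ps.setTimestamp(1, ts);

        System.out.println(ts); // prints 2013-06-01 12:00:00.0
    }
}
```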

>>> A batch size of roughly 500 to 1000 works well for most applications.

That was a typo in that old version of the Teradata JDBC Driver User Guide. We specifically note that typo in the Teradata JDBC Driver FAQ here on Developer Exchange.

So Yes, you should use a much larger batch size, as recommended in the article above.
Enthusiast

Re: Speed up your JDBC/ODBC applications

Since Fastload can load only in empty table, it will not work in batch mode. Each batch needs to be loaded into separate table.
Teradata Employee

Re: Speed up your JDBC/ODBC applications

That's incorrect. In fact, JDBC FastLoad *must* be used with a PreparedStatement batch.
Please refer to the Teradata JDBC Driver Reference section about JDBC FastLoad.
http://developer.teradata.com/doc/connectivity/jdbc/reference/current/jdbcug_chapter_2.html#BABFGFAF
Teradata Employee

Re: Speed up your JDBC/ODBC applications

Hi, are there similar options to enable FastLoad/FastExport for the Teradata ODBC driver and the ADO.NET Data Provider? Thanks
Fan

Re: Speed up your JDBC/ODBC applications

Hello,

I'm using the FASTLOAD option, and it was working in our old version of TD, but since updating to TD 14, I now get this error unless I keep my batch sizes under 16K (which defeats the purpose):

Caused by: com.teradata.jdbc.jdbc_4.util.JDBCException: [Teradata Database] [TeraJDBC 13.10.00.31] [Error 5966] [SQLState HY000] Too many data records packed in one USING row.

 

It would seem then that it really isn't a data issue and is an issue whenever you load more than ~16k rows.

How can I get around this issue?