Fetch performance: Teradata Studio vs. Python teradata library

Hi,

We have an application that needs to offload data from Teradata on a daily basis. The application is written in Python and uses the Python Teradata library (version 15.10.0.21). Functionally it works, but downloading the data is really slow. Executing the query takes about 2 minutes (the table has >400M rows), which is acceptable; it is the actual fetching of the result rows that is slow.
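To make the split between execution time and fetch time concrete, here is a minimal sketch of how the two could be measured separately. It assumes only a DB-API 2.0 style cursor with `execute` and `fetchmany`; the helper name `time_query`, the batch size, and the demo table are made up for illustration, and `sqlite3` is used as a stand-in so the snippet runs without a Teradata connection.

```python
import sqlite3
import time


def time_query(cursor, sql, batch_size=10_000):
    """Return (execute_seconds, fetch_seconds, row_count) for one query.

    Hypothetical helper: works with any cursor that offers
    execute() and fetchmany() in the DB-API 2.0 style.
    """
    start = time.perf_counter()
    cursor.execute(sql)
    executed = time.perf_counter()

    # Pull the result set in batches and only count the rows,
    # so the measurement reflects the fetch itself.
    row_count = 0
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        row_count += len(rows)
    fetched = time.perf_counter()

    return executed - start, fetched - executed, row_count


# Stand-in database (assumption: replace with the real Teradata cursor).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO demo VALUES (?, ?)",
    ((i, "x" * 50) for i in range(25_000)),
)

execute_s, fetch_s, n = time_query(conn.cursor(), "SELECT id, payload FROM demo")
print(f"execute: {execute_s:.3f}s, fetch: {fetch_s:.3f}s, rows: {n}")
```

With the real connection, the same helper would be handed the Teradata cursor and the offload query instead of the demo objects.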

For example:

  • offloading 1.1M rows of data (data volume 660MB) takes ~1 hour using the Python application
  • whereas offloading the exact same data using Teradata Studio takes only 15 minutes

So offloading using Teradata Studio is 4 times faster (both runs were done from the same machine, using the same connection to Teradata).
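Expressed as throughput, the numbers above work out to roughly 300 rows/s for the Python application versus roughly 1,200 rows/s for Teradata Studio. The short calculation below just reproduces that arithmetic from the figures in this post; the function name `throughput` is only for illustration.

```python
ROWS = 1_100_000   # rows offloaded in both tests
VOLUME_MB = 660    # data volume of the result set


def throughput(seconds):
    """Return (rows per second, MB per second) for a given duration."""
    return ROWS / seconds, VOLUME_MB / seconds


python_rows_s, python_mb_s = throughput(60 * 60)  # ~1 hour
studio_rows_s, studio_mb_s = throughput(15 * 60)  # 15 minutes

print(f"Python library:  {python_rows_s:.0f} rows/s, {python_mb_s:.2f} MB/s")
print(f"Teradata Studio: {studio_rows_s:.0f} rows/s, {studio_mb_s:.2f} MB/s")
print(f"Speed-up: {studio_rows_s / python_rows_s:.1f}x")
```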

Can someone explain why Teradata Studio is so much faster? Is there some (undocumented) setting that we need to change in the Python library to make it faster? Are there alternatives?

I also tried SQLAlchemy in combination with the tdodbc1620__linux_indep.16.20.00.50-1 driver, and that performed 10% slower than the Python Teradata library.

Any help is appreciated.

Regards,

Gero