Hi New to the forum. We are trying out Clementine 9.0 s/w with our Teradata DW for some customer analytic applications. I have not been able to find a way to deploy the Clementine stream or just the Clementine decision tree models as SQL to be executed in Teradata. I can export the models as PMML - but what do I need on Teradata side to execute this ? Do I need to have Teradata Miner ? Does any one have experience of using Clementine with Teradata and have they managed to deploy these models for execution in Teradata ? Appreciate any help / advice Jay Gurjar
I've experienced this problem, more often with DB2 on OS/Z series mainframe version (version 7 if I recall). It only supported 32k bytes of sql query. Oracle 10g and SQL Server 2000 are at 2MB sql limit i think.
These limits on the size of the sql query you can submit can occur more frequently if your stream isn't constructed effeciently, but it might just be that the end result is you have a very complex and huge sql query. Try aggregating data to your analysis level asap.
I'm currently working on Teradata and have found it more common to encounter spool space issues (I have 50 allocated) than sql query size limits (i was searching for spool space hints when I found this query..)
In Clementine you can enable a cache to force the result set at that point to be written to disk (on your application server or local machine). In Clementine 10 (10.1 is the current version) you can create violatile/temp tables within the DB. This would help break up the sql query into smaller piices without sacrificing much in terms of performance.
At times when the stream/query was simply huge, I've overcome the problem by writting out the data to a DB table at a logical break point within the stream/query. This has a performance hit, but it runs...
Alternatively consider running the query against less data (ie. 3 months customer history instead of 6 months), then run it against the remainder.
An update on this query; - One of my colleagues got a SQL limit error returned from Teradata the other day. They had created over 1MB of SQL (it seems the SQL limit for our version of Teradata is 1MB, which is pretty good).
Lets put this into prespective, quickly scanning the SQL I reckon its about 50 pages long! It was excessive because the predictive/forecase model was overly detailed. The model (converted into SQL) described customer groups down to the 3-5 customer level. Basicaly, a series of CASE statements that split 1 million customers into 300,000 distinct groups.
I helped my collegaue change the SPSS Clementine options to create a sensible smaller model that described customers groups of about 5k in size.
If you are exceeding the Teradata SQL limit when building and scoring predictive models then one possiblity is that your model is too detailed.