hi Teradata forumers,
Can you guys help figure an SQL? here's my problem. For my query tuning work, Im joining a fairly big table and a small one.The column being joined in the BIG table is only a NUSI, the smaller table is using a NUPI. As I know, NUSI cant have distribution of data so Im proposing to change the structure of the BIG table to use the column as PI buy, i was advised not to since the table is huge and will impact a lot of users. Now, replicating the big table aint an option since spool and temp space is limited. A JI can help but the inserts can be slower and might hamper the overall performance/timing. Any ideas guys? What could i use in order to have a good join condition?
Im bound to some NDA so I cant disclose much info publicly,
The TD servers im working on are 12-13.
Define "fairly big" and "fairly small".
A small table will probably be "duplicated to all AMPs" and then joined. Did you check DBQL if this step is actually consuming too much resources?
The BIG table had 500m records and the small one had 48m. The DBQL shows high CPU Impact, skewness and spool usage.
BTW I was told that our systrm is sitting on 680 AMPs.
Just my small cents :)
Yes , I do agree a JI will hamper the insert performance. I see that a single table JI can help a little.
How about partitioning a huge table? It is just my thought.
Thanks and regards,
Thanks, I'll try that out.
For everyone's record, what i did was a query rewrite. I found some columns that can be used to link together. Hopefully I can get a smiling face from my client tomorrow. Will keep you posted
Ok, 48m is not really "fairly small" :-)
Are the tables partitioned? Do you actually join all rows or are there some common filtering conditions? How often are those tables accessed? What other tables are joined to them?
All this (and more) must be evaluated and changing the PI should always be a possible result.
Without further details it's hard to assist.
Yes, Dieter, that's what I need to test now - to see if the rewrite did change the resultsets. I do found out that there are joins to NUPI columns and count of distinct is around 30% less than the count of the records. So I tend to do a subquery to distinct this column first and use this for joining, hope this could help lower the processing. I tried to maintain the join columns as the original query except for this one, but then again, i'm crossing my fingers right now.
There's a specific case i know of when it's not changing the result:
e.g. orders and lineitem, both with PI(ordernumber) PARTITON BY orderdate, the join based on the FK will be (ordernumber), but this will result in a very inefficient join, because it's not known in which partition a row is located. Adding the orderdate switches back to a direct merge join. It's hard to enshure that endusers know and apply that, so there's usually a view pre-joining the tables :-)
ah yes, i think i saw things like that dieter on my previous project.
Anyway, an update to everyone. I was advised by my partner to just make recommendation of a structural change since modifying the query will only be more taxing. The performance gain I got from above was very negligible either I wasn't a good query writer or the design is not applicable anymore.
Thanks to everyone who helped :)