In which case we go for multi column stats .
when we have a clause where table1.id=table2.id1 and table1.flag=table2.flag1 -- here
we can seperately collect stats on flag and id. There is no need for multi column stats.
Could somebody explain for which data demographoics teradata cannot predict the column details and hence we go for multicolumn stats??
This link maybe old but it gives vivid explanation why indexes are used. Statistics are collected for the goal of performance. We need to be careful when we choose columns for collect stats.
Besides, there are other links from Carrie, you can get easily.
There are two recomnedations to collect multicolumn stats:
So whenever you know that some columns will be used in conditions together, you can collect stats.
thanks both for your replies..
@saeed --> i understood your second point.
But i differ in the 1st , i , i have seen even if the join conditions are together, collecting single column stats use to give me same estimatedcount and actual count .Its a lil tricky :) :)
Or is it something like even after collecting single column stats, in PMON actual and estimated row count differs , we can go for multi column .
Or is there any thump rule for it.?
It is just a recomendation I found in TD docs. It is good that you are getting the same estimates in both cases, But It is recomeneded that when you are using the columns together in equality conditions, then you can collect stats on both of them combined. you can collect stats on individual columns as well.
It is not that much tricky, just take it easy :)
I'm not sure if this applies to 14, but prior versions are a bit tricky when collecting multicolumn stats. The optimizer only looks at the first 16 bytes of data. For example, if I am collecting on (desc,code) and description is over 16 bytes, and the majority of the demographics are duplicates on the first 16 bytes, this will not be helpful. The order in which the optimizer looks at the demographics in multicolumn stats is NOT the order in which the collect statement is executed (i.e. collect stats on tablename column(code,desc)), but rather the order in which the columns are defined in the table. If the order in the table is code then desc, then you will be ok no matter what order they are defined in the collect stats statement. It is a good practice to define shorter length columns before the longer length columns, especially if you suspect the groupings could be used in multi columns statistics collection. I don't think this applies in 14.0, but I could be wrong.
Thank you for sharing your thoughts, But this limitation has been eliminated in TD 14. Now the multicolumn stats are not truncated :)
Thanks Saeed!... that is what I thought. For those customers on versions prior to 14.00, it is good to know that order matters when it comes to multi-column statistics collection. Nothing worse then collecting on the correct columns only to have them defined in the wrong order in the table.