Collect stats index vs multiple columns: advantages?

Database

Collect stats index vs multiple columns: advantages?

Hi to all, I would like to know what is the difference in collecting stats for an index as opposed to collecting stats for multiple columns?
_____________________________________________________________________________

COLLECT STATS ON dbname.tablename COLUMN(col1, col2, etc.);

They can also be collected on an index:

COLLECT STATS ON dbname.tablename INDEX(col1,col2,etc.);

Thanks,
6 REPLIES
Teradata Employee

Re: Collect stats index vs multiple columns: advantages?

If there is a corresponding index, the action taken by the database is identical regardless of which syntax you use.

The only time it matters is if you first collect stats on multiple columns and later define an index on that same list of columns. In that case, you would have to DROP the original multi-COLUMN stats before you could COLLECT stats on the INDEX.
N/A

Re: Collect stats index vs multiple columns: advantages?

Is there a performance difference btw a column in where clause which has collect stats on column or collect stats on index?

Re: Collect stats index vs multiple columns: advantages?

If your question is whether there's a performance difference where you have a column with an index on it and the stats are collected on either the index or column ... the answer is no, they both have the same effect on performance.

If you meant to ask whether there's a performance difference where you have a column (with no index) and you collect stats on it versus you have the column with an index on it and have collected stats for it, the answer is yes.
N/A

Re: Collect stats index vs multiple columns: advantages?

Hi Joe,

Thanks for the quick response. My question of the second case. What could be the reason for the difference in performance when using a index column with using a non index column in where clause having collected stats for either of them?

Regards,
Sakthi

Re: Collect stats index vs multiple columns: advantages?

That's because when there's an index it can be used to reach the record without doing a full table scan. so it's faster. Now if you don't have an index, you have to do a full table scan to fetch the records even if you have stats collected on it.

So the question is why collect stats on columns without indexes ?

well it kind of helps optimizer at other places say like you have a query (hypothetical)

SELECT *
FROM A INNER JOIN B
ON A.COL1 = B.COL1
WHERE A.COL2 BETWEEN 10 AND 100000
AND B.COL2 BETWEEN 10 AND 100000

now assume COL1 is not a PI for neither A or B and there are no indexes on COL2 for both A and B.
(and we don't have any stats)
So optimizer if going to do a full table scan on both A and B, with a condition of COL2 between 10 and 100000 and then redistribute it based on row hash of COL1 to spool and then do the joins. (so both table A and B gets redistributed)

Now let's say Table B had only 10 records which had COL2 values that fell between 10 and 100000. So a better optimizer plan would have been to do a full table scan of Table B with condition COL2 between 10 and 100000 then duplicate it in all AMPS. Table A could be spooled locally and needn't get redistributed.

So we have a plan with minimal record redistribution now, which is faster. now optimizer could have made the second decision if it had stats on COL2 of table B.

That's one of those uses... I am sure there are more to it... (I haven't warmed up to my morning coffee yet ... :-) )
N/A

Re: Collect stats index vs multiple columns: advantages?

Hi Joe,

That was a gr8 explanation. Thanks for your inputs:-)