I am running into a spool space problem on tables that I just created, which had no statistics. I employed the DIAGNOSTIC HELPSTATS ON FOR SESSION in order to get the recommended stats (admittedly, this was just so I didn't have to type them all). When I started to grab that stats that I thought would be helpful, I noticed the following recommendations:
-- COLLECT STATISTICS COLUMN (ColA, ColB) ON MYDB.MYTBL (High Confidence)
-- COLLECT STATISTICS COLUMN (ColB, ColA) ON MYDB.MYTBL (Medium Confidence)
I am confused on why the order of the columns matters, as they would have the same row counts/cardinalities either way? Is this supposed to happen? If so, why, i.e. what different information does Teradata glean from having both stats if the row counts are the same?
I am running on Terdata 15.00.
Many thanks in advance.
Starting with TD14.0, the order given for COLUMN stats is respected. If columns that are being used in single-table filter predicates are listed first, followed by columns that are used for join criteria or grouping, then the optimizer can sometimes get a better estimate for that join / group by (using a subset of the stats histogram).