MVC generated out of V14.00 statistics

Database
Enthusiast

MVC generated out of V14.00 statistics

The "SHOW STATISTICS VALUES COLUMN col ON db.tab; " shows the Biased values of the statistics. In my opinion this is exactly a very good list for multi value compression.

The following scripts should give a starting point for very fast multi value compression out of actual statistics. It is not tested for every feature the teradata database provides.

In worst case you loose statistics for one table, but this statistics is saved in the dump files and can submitted again.

The success of compression is connected to the amount of actual field statistics.

To limit the number of values a percentage of 1% is used to get the most used values. The number of null values for compression is also checked.

The advantages are:

  • No costs for getting the values for compression
  • Good compression results with easy algorithmus

This easy solution has some disadvantages:

  • Compress is only performed on columns with statistics
  • Statistics have to be actual
  • Procedure doesn't take care of previous values list

First generate as list of commands to get statistics for regenerating statistics:

SELECT

         'SHOW STATISTICS VALUES COLUMN '||(trim (both from a.columnname))||' on '||(trim(both from a.databasename))||'.'||(trim(both from a.tablename))||';' as stmt

FROM

        dbc.ColumnStatsV a

INNER JOIN

        dbc.columns b

ON

a.databasename=b.databasename

        AND

a.tablename=b.tablename

        AND

a.columnname=b.columnname

LEFT OUTER JOIN

        dbc.PartitioningConstraintsV c

ON

a.databasename=c.databasename

        AND

a.tablename=c.tablename

        AND

c.constrainttext LIKE '%'||a.columnname||'%'

WHERE

        c.constrainttext is null

        AND

a.indexnumber is null

        AND

a.databasename='dbtest'

        AND

b.columntype<>'TS'

        AND

(a.databasename,a.tablename,a.columnname) not in (select databasename,tablename,columnname from dbc.indices)

order by a.databasename,a.tablename,a.columnname;

Which produces:

SHOW STATISTICS VALUES COLUMN col1 on dbtest.tab1;

SHOW STATISTICS VALUES COLUMN col2 on dbtest.tab1;

SHOW STATISTICS VALUES COLUMN col3 on dbtest.tab1;

SHOW STATISTICS VALUES COLUMN col1 on dbtest.tab2;

SHOW STATISTICS VALUES COLUMN col2 on dbtest.tab2;

Putting the bteq output of these command in the following gawk script

BEGIN   { CUTPERCENTAGE=1;

          print ".errorlevel (3582) severity 0";

        }

/            COLUMN \(/ { COL=$3; }

/                ON / { DBTAB=$2; }

/^ \/\*\* / { BIASEDON=0; }

/NumOfRows/ { CUTROWS=$4*CUTPERCENTAGE/100;

              BIASED=="";

              if (0+CUTROWS<0+NULLROWS) BIASED="NULL,";

            }

/\/\* NumOfNulls/ { NULLROWS=$4; }

/^ \/\*\* Biased:/ { BIASEDON=1;}

/^ \/\* / { if (BIASEDON==1)

                {

                if (CUTROWS < 0+gensub(".*,","","",gensub(",? ?$","","g")))

                        BIASED=BIASED gensub("^ */[^/]*/","","g",gensub(",[0-9 ]*,? ?$","","g")) ",";

                }

        }

/^COLLECT STATISTICS/   { COLSTATON=1; }

        {       if (COLSTATON==1) COLSTAT=COLSTAT "\n" $0; }

/^);/   { BIASEDON=0;

        COLSTATON=0;

        if (BIASED=="")

                {

                COLSTAT="";

                next;

                }

        if (DBTAB!=DBTABOLD)

                {

                if (DBTABOLD!="")

                        {

                        print DROPSTATS;

                        print ALTERTABLE ";";

                        print COLSTATALL;

                        COLSTATALL="";

                        }

                ALTERTABLE="ALTER TABLE " DBTAB " ADD " COL " COMPRESS (" gensub(",$","","",BIASED) ")";                                                                                                                                     

                DROPSTATS="DROP STATISTICS COLUMN " COL " ON " DBTAB ";";

                DBTABOLD=DBTAB;

                }

        else

                {

                ALTERTABLE=ALTERTABLE "\n" "        ,ADD " COL " COMPRESS (" gensub(",$","","",BIASED) "                                                                                                                                     )";

                DROPSTATS=DROPSTATS "\n" "DROP STATISTICS COLUMN " COL " ON " DBTAB ";";

                }

        COLSTATALL=COLSTATALL "\n" COLSTAT;

        COLSTAT="";

        BIASED="";

        }

END     {

        print ";";

        }

Produces:

DROP STATISTICS column col1 on dbtest.tab1;

DROP STATISTICS column col2 on dbtest.tab1;

DROP STATISTICS column col3 on dbtest.tab1;

ALTER TABLE dbtest.tab1 add col1 compress ( ...)

    , add col2 compress ( ...)

    , add col3 compress ( ...)

;

COLLECT STATISTICS COLUMN ( col1 ) ON dbtest.tab1 VALUES (...);

COLLECT STATISTICS COLUMN ( col2 ) ON dbtest.tab1 VALUES (...);

COLLECT STATISTICS COLUMN ( col3 ) ON dbtest.tab1 VALUES (...);

DROP STATISTICS column col1 on dbtest.tab2;

DROP STATISTICS column col2 on dbtest.tab2;

ALTER TABLE dbtest.tab2 add col1 compress ( ...)

    , add col2 compress ( ...)

;

COLLECT STATISTICS COLUMN ( col1 ) ON dbtest.tab2 VALUES (...);

COLLECT STATISTICS COLUMN ( col2 ) ON dbtest.tab2 VALUES (...);

Executing this output compresses the fields. Done :-)

Comments, remarks and improvements are very welcomed.

Best Regards,

Roland Sagner

2 REPLIES
Enthusiast

Re: MVC generated out of V14.00 statistics

Hi Roland,

Thank you for sharing your work, It would be great if you share your thaughts in a blog in Developer Exchange.

Good Luck!

Khurram
Enthusiast

Re: MVC generated out of V14.00 statistics

Unfortunately version lacks of two facts:

* Biased values are truncated to 26 Characters in the "SHOW STATISTICS VALUES COLUMNSHOW STATISTICS VALUES COLUMN"

* Script fails when column is part of multi column statistics.

Therefore new awk script

BR

Roland

BEGIN   { CUTPERCENTAGE=1;

          print ".errorlevel (3582) severity 0";

          print ".errorlevel (6956) severity 0";

        }

/            COLUMN \(/ { COL=$3; }

/                ON / { DBTAB=$2; }

/^ \/\*\* / { BIASEDON=0; }

/NumOfRows/ { CUTROWS=$4*CUTPERCENTAGE/100;

              BIASED=="";

              if (0+CUTROWS<0+NULLROWS) BIASED="NULL,";

            }

/\/\* NumOfNulls/ { NULLROWS=$4; }

/^ \/\*\* Biased:/ { BIASEDON=1;}

/^ \/\* / { if (BIASEDON==1)

                {

                if (CUTROWS < 0+gensub(".*,","","",gensub(",? ?$","","g")))

                        if (length (gensub("^ */[^/]*/","","g",gensub(",[0-9 ]*,? ?$","","g")))<28)

                                BIASED=BIASED gensub("^ */[^/]*/","","g",gensub(",[0-9 ]*,? ?$","","g"))                                                                                                                                      ",";

                }

        }

/^COLLECT STATISTICS/   { COLSTATON=1; }

        {       if (COLSTATON==1) COLSTAT=COLSTAT "\n" $0; }

/^);/   { BIASEDON=0;

        COLSTATON=0;

        if (BIASED=="")

                {

                COLSTAT="";

                next;

                }

        if (DBTAB!=DBTABOLD)

                {

                if (DBTABOLD!="")

                        {

                        print DROPSTATS;

                        print ALTERTABLE ";";

                        print COLSTATALL;

                        COLSTATALL="";

                        }

                ALTERTABLE="ALTER TABLE " DBTAB " ADD " COL " COMPRESS (" gensub(",$","","",BIASED) ")";                                                                                                                                    

                DROPSTATS="DROP STATISTICS COLUMN " COL " ON " DBTAB ";";

                DBTABOLD=DBTAB;

                }

        else

                {

                ALTERTABLE=ALTERTABLE "\n" "        ,ADD " COL " COMPRESS (" gensub(",$","","",BIASED) "                                                                                                                                     )";

                DROPSTATS=DROPSTATS "\n" "DROP STATISTICS COLUMN " COL " ON " DBTAB ";";

                }

        COLSTATALL=COLSTATALL "\n" COLSTAT;

        COLSTAT="";

        BIASED="";

        }

END     {

        print ";";

        }