Low cardinality with [Is Null] Predicate for an Indexed Timestamp column

Database
Enthusiast

Low cardinality with [Is Null] Predicate for an Indexed Timestamp column

A Timestamp indexed column suffers from very low cardinality when using [IS NUL] as predicate which estimates 1 row while actually 57264480 returned.

The column had index and stats collected on it.

Table definition

~~~~~~~~~~~~~~~~~~~~~~~~

CREATE MULTISET TABLE TDB.TABLE1,NO FALLBACK ,

     NO BEFORE JOURNAL,

     NO AFTER JOURNAL,

     CHECKSUM = DEFAULT

     (

      Rate_Plan_Product_Id INTEGER NOT NULL,

      Subscription_Id INTEGER NOT NULL,

      Rate_Plan_Subscription_Start_D TIMESTAMP(0) NOT NULL,

      Rate_Plan_Subscription_End_Dtt TIMESTAMP(0),            <<<<< Column in question.

      DWH_Entry_Date DATE FORMAT 'yyyy/mm/dd')

PRIMARY INDEX ( Subscription_Id )

INDEX RATE_PLAN_SUB_ST_D_IDX ( Rate_Plan_Subscription_Start_D )

INDEX RATE_PLAN_SUB_END_DT_IDX ( Rate_Plan_Subscription_End_Dtt )

INDEX DWH_ENTRY_DATE_NUSI ( DWH_Entry_Date );

ACTUAL query output

~~~~~~~~~~~~

SELECT COUNT(*) FROM db1.table1

WHERE  Rate_Plan_Subscription_End_Dtt IS NULL;

>>>> 57264480

query Estimates

~~~~~~~~~~~~

Explain SELECT COUNT(*) FROM db1.table1

WHERE  Rate_Plan_Subscription_End_Dtt IS NULL;

  1) First, we lock a distinct tdb."pseudo table" for read on a RowHash

     to prevent global deadlock for db1.table1.

  2) Next, we lock db1.table1 for read.

  3) We do an all-AMPs SUM step to aggregate from

     db1.table1 by way of index # 8 without

     accessing the base table

     "db1.table1.Rate_Plan_Subscription_End_Dtt =

     NULL" with no residual conditions.  Aggregate Intermediate Results

     are computed globally, then placed in Spool 3.  The size of Spool

     3 is estimated with high confidence to be 1 row (23 bytes).  The

     estimated time for this step is 0.13 seconds.

  4) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of

     an all-rows scan into Spool 1 (group_amps), which is built locally

     on the AMPs.  The size of Spool 1 is estimated with high

     confidence to be 1 row (25 bytes).  The estimated time for this

     step is 0.01 seconds.

  5) Finally, we send out an END TRANSACTION step to all AMPs involved

     in processing the request.

  -> The contents of Spool 1 are sent back to the user as the result of

     statement 1.  The total estimated time is 0.14 seconds.

SELECT MAX(Rate_Plan_Subscription_End_Dtt), MIN(Rate_Plan_Subscription_End_Dtt)

FROM db1.table1;

>> 05-Feb-12 23:58:38    03-Apr-07 16:31:41

1 REPLY
Senior Apprentice

Re: Low cardinality with [Is Null] Predicate for an Indexed Timestamp column

Why do you think it's a wrong estimate?

COUNT(*) returns a single row, so it's 100% correct.

Explain a "SELECT * FROM table" to see the estimated number of NULLs.

Dieter