Avoid all-row scan in driving table using NUSI

Database

Avoid all-row scan in driving table using NUSI

Hi I am gettnig all-row scan even after using a secondary Index at the driving table. I am not sure if my approach is right , so please help me out here.

Sample snippets, we have classic scnario EMPOLYEE and DEPT where EMPLOYEE has PI on EID and DEPARTMENT has PI on DEPTID. On Inner JOin we are getting all-row scan since we are joining based upon DEPTID. So to avoid that I created a SI for EMPLOYEE on DEPTID and hoping that it will save the all-row scan but it did not. Where am I going wrong here?

Below is the full example

CREATE TABLE MY_DB.EMPLOYEE

(

EID NUMBER,

ENAME VARCHAR(10),

DEPTID NUMBER,

PROJID NUMBER,

PAYSCALEID NUMBER

)

PRIMARY INDEX (EID);

CREATE TABLE MY_DB.DEPARTMENT

(

DEPTID NUMBER,

DEPTNAME VARCHAR(10)

)

PRIMARY INDEX(DEPTID);

INSERT INTO MY_DB.EMPLOYEE VALUES (1, 'RAM',1,1,1);

INSERT INTO MY_DB.EMPLOYEE VALUES (1, 'RAM',1,2,1);

INSERT INTO MY_DB.EMPLOYEE VALUES (2, 'SHAM',1,1,1);

INSERT INTO MY_DB.EMPLOYEE VALUES (2, 'SHAM',1,2,1);

INSERT INTO MY_DB.EMPLOYEE VALUES (3, 'TOM',2,1,2);

INSERT INTO MY_DB.EMPLOYEE VALUES (3, 'TOM',1,2,2);

insert into MY_DB.DEPARTMENT values (1,'IT')

CREATE INDEX SI_DEPT (DEPTID) ON MY_DB.EMPLOYEE

COLLECT STATISTICS ON MY_DB.EMPLOYEE COLUMN DEPTID

EXPLAIN

SEL EMP.ENAME, DEPT.DEPTNAME  FROM MY_DB.EMPLOYEE EMP

INNER JOIN MY_DB.DEPARTMENT DEPT

ON

DEPT.DEPTID=EMP.DEPTID

================EXPLAIN PLAN ===========================

EXPLAIN

SEL EMP.ENAME, DEPT.DEPTNAME  FROM MY_DB.EMPLOYEE EMP

INNER JOIN MY_DB.DEPARTMENT DEPT

ON

DEPT.DEPTID=EMP.DEPTID

  1) First, we lock a distinct MY_DB."pseudo table" for read

     on a RowHash to prevent global deadlock for MY_DB.DEPT.

  2) Next, we lock a distinct MY_DB."pseudo table" for read

     on a RowHash to prevent global deadlock for MY_DB.EMP.

  3) We lock MY_DB.DEPT for read, and we lock

     MY_DB.EMP for read.

  4) We do an all-AMPs RETRIEVE step from MY_DB.EMP by way of

     an all-rows scan with no residual conditions into Spool 2

     (all_amps), which is redistributed by the hash code of (

     MY_DB.EMP.DEPTID) to all AMPs.  Then we do a SORT to

     order Spool 2 by row hash.  The size of Spool 2 is estimated with

     high confidence to be 6 rows (156 bytes).  The estimated time for

     this step is 0.01 seconds.

  5) We do an all-AMPs JOIN step from MY_DB.DEPT by way of a

     RowHash match scan, which is joined to Spool 2 (Last Use) by way

     of a RowHash match scan.  MY_DB.DEPT and Spool 2 are

     joined using a merge join, with a join condition of (

     "MY_DB.DEPT.DEPTID = DEPTID").  The result goes into

     Spool 1 (group_amps), which is built locally on the AMPs.  The

     size of Spool 1 is estimated with low confidence to be 8 rows (

     248 bytes).  The estimated time for this step is 0.04 seconds.

  6) Finally, we send out an END TRANSACTION step to all AMPs involved

     in processing the request.

  -> The contents of Spool 1 are sent back to the user as the result of

     statement 1.  The total estimated time is 0.06 seconds.

3 REPLIES

Re: Avoid all-row scan in driving table using NUSI

Its the Optimizer to choose the best cost effective plan. However, there are maultiple conditions underwhich SI will not be used. You may need to read more of USI/NUSI selection criteria.

Re: Avoid all-row scan in driving table using NUSI

Hi,

To use NUSI, the indexed column should be used in WHERE condistion.

Since you have placed no restrcition on any of the tables, it will obviously go for full table scan.

Khurram

Khurram
Teradata Employee

Re: Avoid all-row scan in driving table using NUSI

It is not ever possible to test NUSI usage on small data. If the data is only a block or two on each AMP, it is always cheaper to read the data blocks directly than to read the index structure, then go read the data blocks.

And as stated above, when the index is not qualified it is unlikely it will be used in a query. In general, the qualification has to be selective enough from the index so that fewer data blocks will be read than if the scan was performed. Otherwise, it is cheaper to scan than to read the index and then read all the data blocks anyway.