For Join Index - One point just for discussion

Database
The Teradata Database channel includes discussions around advanced Teradata features such as high-performance parallel database technology, the optimizer, mixed workload management solutions, and other related technologies.
Enthusiast

For Join Index - One point just for discussion

CREATE MULTISET TABLE  table_1

     (

      col1 INTEGER,

      col2 VARCHAR(20) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,

      col3 INTEGER NOT NULL,

      col4 TIMESTAMP(0),

      col5 VARCHAR(5) CHARACTER SET LATIN NOT CASESPECIFIC,

      col6 INTEGER ,

      col7 INTEGER    

      )

PRIMARY INDEX ( col1 ,col5 )

; --successful

 

CREATE JOIN INDEX JI_table_1

AS

SELECT

      col1

      ,col2

      ,col3

      ,col4

      ,col5

      ,col6

      ,col7

      ,ROWID

FROM  table_1

PRIMARY INDEX (ROWID)

;--successful

 

COLLECT STATS COLUMN(ROWID) ON JI_table_1; -- IT WILL FAIL

 

DROP JOIN INDEX JI_table_1

;--successful

 

CREATE JOIN INDEX JI_table_1

AS

SELECT

      col1

      ,col2

      ,col3

      ,col4

      ,col5

      ,col6

      ,col7

      ,ROWID as ROWID_1

FROM  table_1

PRIMARY INDEX (ROWID_1)

;--successful

 

COLLECT STATS COLUMN(ROWID_1) ON JI_table_1; -- IT WILL PASS

 

Bullet Points:

  1. ROWID should not be allowed to be used in PI clause of JI
  2. As it is allowed it will hashed for distribution of rows but as ROWID cannot be explicitly used in WHERE (or there is no meaning for using it if it is allowed)
  3. Confusion as it is allowed in PI clause of creating JI but not allowed while collecting stats
6 REPLIES
Senior Apprentice

Re: For Join Index - One point just for discussion

The oprimizer doesn't prevent you from doing stupid things :-)

 

When you specify ROWID in your Collect Stats it's the ROWID of the row in the JI, not the ROWID of the base table.

If you really want to do that you have to double quote it, thus referencing the original value:

COLLECT STATS COLUMN("ROWID") ON JI_table_1; 

Would be similar for using it in WHERE, but why would you want to do that?

 

Enthusiast

Re: For Join Index - One point just for discussion

Smiley LOL

 

That is the point, it will be really stupid. Thats why it shouldnt be allowed 

 

Senior Apprentice

Re: For Join Index - One point just for discussion

There are many stupid things, it's impossible to disallow all of them.

 

And Oracle people like to use ROWID and that's the only way to do it :)

 

In fact the possibility to access ROWID has been disabled mainly due to (Oracle) developers using it in WHERE.

Enthusiast

Re: For Join Index - One point just for discussion

One more on ORDER BY CLAUSE in JI- Just for records

CREATE MULTISET TABLE table_1
(
col11 INTEGER,
col12 VARCHAR(20) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
col13 INTEGER UNIQUE NOT NULL,
col14 TIMESTAMP(0),
col15 VARCHAR(5) CHARACTER SET LATIN NOT CASESPECIFIC,
col16 INTEGER ,
col17 DATE
)
PRIMARY INDEX ( col11 ,col15 )
; -- Successful

CREATE JOIN INDEX JI_TABLE_1
AS
SELECT
col11
,col12
,col13
,col14
,col15
,col16
,col17
FROM table_1
ORDER BY col11
PRIMARY INDEX(col13)
; -- Successful

SHOW JOIN INDEX JI_TABLE_1;

CREATE JOIN INDEX JI_TABLE_1
SELECT table_1.col11
,table_1.col12
,table_1.col13
,table_1.col14
,table_1.col15
,table_1.col16
,table_1.col17
FROM table_1
ORDER BY table_1.col11 ASC
PRIMARY INDEX ( col13 );

Explain CREATE JOIN INDEX JI_TABLE_1
AS
SELECT
col11
,col12
,col13
,col14
,col15
,col16
,col17
FROM table_1
ORDER BY col11
PRIMARY INDEX(col13)

1) First, we lock a distinct "pseudo table" for read on a
RowHash to prevent global deadlock for table_1.
2) Next, we lock JI_TABLE_1 for exclusive use, and we lock
table_1 for read.
3) We lock a distinct "pseudo table" for write on a RowHash for
deadlock prevention, we lock a distinct "pseudo table" for
read on a RowHash for deadlock prevention, and we lock a distinct
"pseudo table" for write on a RowHash for deadlock prevention.
4) We lock Indexes for write on a RowHash, we lock DBase for
read on a RowHash, we lock TVFields for write on a RowHash,
and we lock TVM for write on a RowHash.
5) We execute the following steps in parallel.
1) We do a single-AMP ABORT test from DBase by way of the
unique primary index.
2) We do a single-AMP ABORT test from TVM by way of the
unique primary index.
3) We do an INSERT into Indexes (no lock required).
4) We do an INSERT into TVFields (no lock required).
5) We do an INSERT into TVFields (no lock required).
6) We do an INSERT into TVFields (no lock required).
7) We do an INSERT into TVFields (no lock required).
8) We do an INSERT into TVFields (no lock required).
9) We do an INSERT into TVFields (no lock required).
10) We do an INSERT into TVFields (no lock required).
11) We do an INSERT into TVM (no lock required).
6) We create the table header.
7) We do an all-AMPs RETRIEVE step from table_1 by way of an
all-rows scan with no residual conditions into Spool 1 (all_amps),
which is redistributed by the hash code of (
table_1.col13) to all AMPs. Then we do a SORT to order
Spool 1 by row hash. The size of Spool 1 is estimated with low
confidence to be 144 rows (7,200 bytes). The estimated time for
this step is 0.01 seconds.
8) We do an all-AMPs MERGE into JI_TABLE_1 from Spool 1
(Last Use). The size is estimated with low confidence to be 144
rows. The estimated time for this step is 1 second.
9) We create the index subtable on table_1.
10) We lock Indexes for write on a RowHash, and we lock TVM
for write on a RowHash.
11) We execute the following steps in parallel.
1) We do an INSERT into Indexes.
2) We do an INSERT into Indexes.
3) We do an INSERT into Indexes.
4) We do an INSERT into Indexes.
5) We do an INSERT into Indexes.
6) We do an INSERT into Indexes.
7) We do an INSERT into Indexes.
8) We do a single-AMP UPDATE from TVM by way of the unique
primary index with no residual conditions.
12) We lock a distinct "pseudo table" for exclusive use on a
RowHash to prevent global deadlock for table_1.
13) We lock table_1 for exclusive use.
14) We modify the table header table_1 and update the table's
version number.
15) We lock AccessRights for write on a RowHash.
16) We INSERT default rights to AccessRights for
JI_TABLE_1.
17) We spoil the parser's dictionary cache for the table.
18) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> No rows are returned to the user as the result of statement 1.


While creating JI we have used ORDER BY, but it seems even it is there, its of no use.
1. It is no where seen in EXPLAIN
2. It will not have any role while creating JI table, as data distribution will on PI of JI
3. It will not play any role in choosing JI by PE for SQL execution

Senior Apprentice

Re: For Join Index - One point just for discussion

Distribution is based on col13, but the rows are sorted by col11 within AMP (similar to a value ordered NUSI or partitioning).

 

You're explaining the CREATE, not a SELECT.

 

Explain a SELECT WHERE col11 BETWEEN and you'll find a RETRIEVE step with a range constraint. 

Enthusiast

Re: For Join Index - One point just for discussion

Never thought it will behave like value order NUSI.

 

Thanks, its something new for me