PPI explain

Database
Enthusiast

PPI explain

Below are the two tables created vinod_5 and vinod_6, the problem is of those two queries where if all the PI are considered and only four columns specified in the (2nd ) query then the explain plan is different.

The explain plan is same if there is no data, but if there is data (at present 82K) then there is difference.

Why is it So? Any suggestions....

=========================================================================

CREATE SET TABLE MSTRM600G.vinod_5 ,NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT
(
Partition_Id INTEGER NOT NULL,
"1_Id" INTEGER NOT NULL,
"2_Id" INTEGER NOT NULL,
"3_Id" INTEGER NOT NULL,
"4_Id" TIMESTAMP(6) NOT NULL,
"5_Id" INTEGER,
"6_Txt" VARCHAR(40) CHARACTER SET UNICODE NOT CASESPECIFIC)
PRIMARY INDEX Pi_vinod_5 ( Partition_Id ,"1_Id" ,"2_Id" )
PARTITION BY Partition_Id ;

CREATE SET TABLE MSTRM600G.vinod_6 ,NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT
(
Partition_Id INTEGER NOT NULL,
"1_Id" INTEGER NOT NULL,
"2_Id" INTEGER NOT NULL,
"3_Id" INTEGER NOT NULL,
"4_Id" TIMESTAMP(6) NOT NULL,
"5_Id" INTEGER,
"6_Txt" VARCHAR(40) CHARACTER SET UNICODE NOT CASESPECIFIC)
PRIMARY INDEX Pi_vinod_5 ( Partition_Id ,"1_Id" ,"2_Id" )
PARTITION BY Partition_Id ;

1)

EXPLAIN select *
From vinod_5 a JOIN vinod_6 b
ON
a.Partition_Id = b.Partition_Id AND
a."1_Id" = b. "1_Id" AND
a."2_Id" = b. "2_Id" AND
a."3_Id" = b."3_Id" And
a."4_Id" = b."4_Id" ANd
A.Partition_Id = 13

Explanation
1) First, we lock a distinct MSTRM600G."pseudo table" for read on a
RowHash to prevent global deadlock for MSTRM600G.b.
2) Next, we lock a distinct MSTRM600G."pseudo table" for read on a
RowHash to prevent global deadlock for MSTRM600G.a.
3) We lock MSTRM600G.b for read, and we lock MSTRM600G.a for read.
4) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from a single partition of
MSTRM600G.b with a condition of ("MSTRM600G.b.Partition_Id =
13") with a residual condition of (
"(MSTRM600G.b.Partition_Id = 13) AND (NOT (MSTRM600G.b."5_Id"
IS NULL ))") into Spool 2 (all_amps), which is redistributed
by hash code to all AMPs. Then we do a SORT to order Spool 2
by row hash. The size of Spool 2 is estimated with no
confidence to be 77,979 rows. The estimated time for this
step is 0.52 seconds.
2) We do an all-AMPs RETRIEVE step from a single partition of
MSTRM600G.a with a condition of ("MSTRM600G.A.Partition_Id =
13") with a residual condition of (
"(MSTRM600G.A.Partition_Id = 13) AND (NOT (MSTRM600G.a."5_Id"
IS NULL ))") into Spool 3 (all_amps), which is redistributed
by hash code to all AMPs. Then we do a SORT to order Spool 3
by row hash. The size of Spool 3 is estimated with no
confidence to be 85,117 rows. The estimated time for this
step is 0.55 seconds.
5) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of a
RowHash match scan, which is joined to Spool 3 (Last Use) by way
of a RowHash match scan. Spool 2 and Spool 3 are joined using a
merge join, with a join condition of ("(Partition_Id =
Partition_Id) AND (("1_Id" = "1_Id") AND (("2_Id" = "2_Id") AND
(("3_Id" = "3_Id") AND (("4_Id" = "4_Id") AND ("5_Id" = "5_Id"
)))))"). The result goes into Spool 1 (group_amps), which is
built locally on the AMPs. The size of Spool 1 is estimated with
no confidence to be 224,576 rows. The estimated time for this
step is 1.00 seconds.
6) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 1.55 seconds.

2)

EXPLAIN select *
From vinod_5 a JOIN vinod_6 b
ON
a.Partition_Id = b.Partition_Id AND
a."1_Id" = b. "1_Id" AND
a."2_Id" = b. "2_Id" AND
-- a."3_Id" = b."3_Id" And COMMENTED
a."4_Id" = b."4_Id" ANd
--a."5_Id" = b."5_Id" AND COMMENTED
A.Partition_Id = 13

Explanation
1) First, we lock a distinct MSTRM600G."pseudo table" for read on a
RowHash to prevent global deadlock for MSTRM600G.b.
2) Next, we lock a distinct MSTRM600G."pseudo table" for read on a
RowHash to prevent global deadlock for MSTRM600G.a.
3) We lock MSTRM600G.b for read, and we lock MSTRM600G.a for read.
4) We do an all-AMPs RETRIEVE step from a single partition of
MSTRM600G.b with a condition of ("MSTRM600G.b.Partition_Id = 13")
with a residual condition of ("MSTRM600G.b.Partition_Id = 13")
into Spool 2 (all_amps), which is built locally on the AMPs. Then
we do a SORT to order Spool 2 by row hash. The size of Spool 2 is
estimated with no confidence to be 86,644 rows. The estimated
time for this step is 1.34 seconds.
5) We do an all-AMPs JOIN step from a single partition of MSTRM600G.a
with a condition of ("MSTRM600G.A.Partition_Id = 13") with a
residual condition of ("MSTRM600G.A.Partition_Id = 13"), which is
joined to Spool 2 (Last Use) by way of a RowHash match scan.
MSTRM600G.a and Spool 2 are joined using a merge join, with a join
condition of ("(MSTRM600G.a.Partition_Id = Partition_Id) AND
((MSTRM600G.a."1_Id" = "1_Id") AND ((MSTRM600G.a."2_Id" = "2_Id")
AND (MSTRM600G.a."4_Id" = "4_Id" )))"). The result goes into
Spool 1 (group_amps), which is built locally on the AMPs. The
size of Spool 1 is estimated with no confidence to be 277,255 rows.
The estimated time for this step is 1.09 seconds.
6) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 2.43 seconds.

=========================================================================

I was expecting both the explain plan similar.

Thanks,
1 REPLY
Enthusiast

Re: PPI explain

In the second query, we have conditions on exactly primary index and partition columns. With additional ANDed conditions in the first query it picked up a different plan.