what are my options in tuning this kind of query. In real life what worked best for you

Analytics
Enthusiast

what are my options in tuning this kind of query. In real life what worked best for you

I have a huge table with billions of rows and 3+ tb data being LOJ to another similar though lesser in size table.
The pattern is something like

Sel .....
<

from
tb1 /* this is the real gorilla table with size 2 tb+*/
LOJ C2 /* runners up to Tb1 in size */
on

Where

and
Tb1.C5 not in ( < a huge list of values > )
and
Tb4.C12 not in ( < another huge list of values > )

Unless we run this extract at off peak hrs or ask other users to abort or borrow space from spool reserve, it spools out. It doesn't look like a ' AMP-Skew-Spool out' , since, after seeing space in all VPROCS for that db there seems not much deviation. So based on this - assume its an ( all AMP ) 'general spool out' problem.
I analyzed the query and found a JI can benefit but problem is , its being loaded daily.
Collecting stats on tb1 the monster table itself takes 1 hrs but helps the query efficiency .

Given this kind of situation - what would you recommend as a tuning approach to this type of query.What are my options in optimizing this query.
Also is there any way to optimize the NOT in part . If the NOT in part was a sub query , yes, I can use NOT exists but alas , its a huge 'go though' list .

8 REPLIES
Senior Apprentice

Re: what are my options in tuning this kind of query. In real life what worked best for you

"Unless we run this extract at off peak hrs or ask other users to abort or borrow space from spool reserve, it spools out."

"2646 No more spool space" or "2507 Out of spool space on disk"?

I'm asking because you can't "borrow space from spool reserve".
Are you running parallel sessions using the same user id?

"seeing space in all VPROCS for that db"
Did you check perm space?
This will not tell you about actual spool distribution, you need to look at peakspool in dbc.diskspace or at DBQL.

You should check explain if there's a "redistributed by hash code".
Could you post the explain, too?

If there's a large number of NULLs/default values in the join column, then splitting the query in two parts plus union all might help.

Dieter
Enthusiast

Re: what are my options in tuning this kind of query. In real life what worked best for you

Hi Dieter
Thanks a bunch for your sharp answers.
Here go the clarifications

"Unless we run this extract at off peak hrs or ask other users to abort or borrow space from spool reserve, it spools out."

"2646 No more spool space" or "2507 Out of spool space on disk"?
A--> it is 2646 No more spool space

I'm asking because you can't "borrow space from spool reserve".
Are you running parallel sessions using the same user id?
A--> Nope. That'd bill the same spool quota . So have ensured he runs just that 1 Query
"seeing space in all VPROCS for that db"
Did you check perm space?
This will not tell you about actual spool distribution, you need to look at peakspool in dbc.diskspace or at DBQL.
A--> Sum from diskspace is 5.3 TB

You should check explain if there's a "redistributed by hash code".
Could you post the explain, too?

If there's a large number of NULLs/default values in the join column, then splitting the query in two parts plus union all might help.

A--> Here is the explain in detail. This is the original Q. I changed in to make the tables smaller in size and in some cases convert the LOJ to ( IJ + LOJ )

1) FIRST, we LOCK HCCLPNJ_T.ACC_LOG_DTL_NI IN VIEW
HCCLNJ.ACC_LOG_DTL_ni FOR ACCESS, we LOCK HCCLPNJ_T.ACC_LOG_DTL_IX
IN VIEW HCCLNJ.ACC_LOG_DTL_ix FOR ACCESS, we LOCK
HCCLPNJ_T.PARTNER_EMP IN VIEW HCCLNJ.PARTNER_EMP FOR ACCESS, we
LOCK HCCLPNJ_T.REPORT_DETAILS IN VIEW HCCLNJ.REPORT_DETAILS FOR
ACCESS, we LOCK HCCLPNJ_T.EMP_MAP IN VIEW HCCLNJ.EMP_MAP FOR
ACCESS, we LOCK HCCLPNJ_T.ACCESS_LOG IN VIEW HCCLNJ.ACCESS_LOG FOR
ACCESS, AND we LOCK HCCLPNJ_T.ACCESS_LOG_METRIC IN VIEW
HCCLNJ.ACCESS_LOG_METRIC FOR ACCESS.
2) NEXT, we DO an ALL-AMPs RETRIEVE step FROM
HCCLPNJ_T.ACCESS_LOG_METRIC IN VIEW HCCLNJ.ACCESS_LOG_METRIC BY
way OF an ALL-ROWS scan WITH a CONDITION OF (
"HCCLPNJ_T.ACCESS_LOG_METRIC.METRIC_NAME = 'MR_REPORTS'") LOCKING
FOR READ INTO SPOOL 26 (all_amps) (compressed COLUMNS allowed),
which IS duplicated ON ALL AMPs. The SIZE OF SPOOL 26 IS
estimated WITH HIGH confidence TO be 592 ROWS (12,432 BYTES). The
estimated TIME FOR this step IS 0.05 seconds.
3) We DO an ALL-AMPs JOIN step FROM SPOOL 26 BY way OF an ALL-ROWS
scan, which IS joined TO HCCLPNJ_T.ACCESS_LOG IN VIEW
HCCLNJ.ACCESS_LOG BY way OF an ALL-ROWS scan WITH a CONDITION OF (
"(NOT (HCCLPNJ_T.ACCESS_LOG.WORKSTATION_ID IS NULL )) AND ((NOT
(HCCLPNJ_T.ACCESS_LOG.USER_ID IS NULL )) AND
((HCCLPNJ_T.ACCESS_LOG.ACCESS_TIME >= TIMESTAMP '2010-10-04
00:00:00') AND ((HCCLPNJ_T.ACCESS_LOG.ACCESS_TIME < TIMESTAMP
'2010-10-11 00:00:00') AND (HCCLPNJ_T.ACCESS_LOG.METRIC_ID <>
90020002. ))))"). SPOOL 26 AND HCCLPNJ_T.ACCESS_LOG are joined
USING a single PARTITION hash_ JOIN, WITH a JOIN CONDITION OF (
"HCCLPNJ_T.ACCESS_LOG.METRIC_ID = METRIC_ID"). The INPUT TABLE
HCCLPNJ_T.ACCESS_LOG will NOT be cached IN memory. The RESULT
goes INTO SPOOL 27 (all_amps) (compressed COLUMNS allowed), which
IS redistributed BY the HASH code OF (
HCCLPNJ_T.ACCESS_LOG.USER_ID) TO ALL AMPs. The SIZE OF SPOOL 27
IS estimated WITH LOW confidence TO be 336,579 ROWS (79,096,065
BYTES). The estimated TIME FOR this step IS 14 minutes AND 39
seconds.
4) We DO an ALL-AMPs RETRIEVE step FROM HCCLPNJ_T.EMP_MAP IN VIEW
HCCLNJ.EMP_MAP BY way OF an ALL-ROWS scan WITH NO residual
conditions LOCKING FOR READ INTO SPOOL 28 (all_amps) (compressed
COLUMNS allowed), which IS redistributed BY the HASH code OF (
HCCLPNJ_T.EMP_MAP.CID) TO ALL AMPs. THEN we DO a SORT TO ORDER
SPOOL 28 BY ROW HASH. The SIZE OF SPOOL 28 IS estimated WITH HIGH
confidence TO be 355,792 ROWS (7,471,632 BYTES). The estimated
TIME FOR this step IS 0.12 seconds.
5) We DO an ALL-AMPs JOIN step FROM HCCLPNJ_T.PARTNER_EMP IN VIEW
HCCLNJ.PARTNER_EMP BY way OF an ALL-ROWS scan WITH a CONDITION OF
("(HCCLPNJ_T.PARTNER_EMP.USER_ID <> '16132411') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '16188673821') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '19016187') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '19099913') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '16118852') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '16132413') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1619173') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1619539') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '130HI50102') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '130HI50101') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '130HI50100') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1303627') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1303457') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1303484') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1303513') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1303625') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1303483') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1303588') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1303487') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '15034761') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '15034763') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '16113868') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '15034762') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '15017398') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1502916') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1506278') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '140BATCHRES') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '140BATCHPRE') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1213061') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '140BATCHEIP') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '140BATCHEH') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '140BATCHEC') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '140BATCHCAD') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1402040') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161999999') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1403430') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '15020125') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '88813001') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '17013001') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1803583') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '19017184') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '190109999') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '121TRNGSC3') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '121TRNGSC6') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '121TRNGSC7') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '121TRNGSC4') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '121TRNGSC5') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '121TRNGSC8') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '150REGNSCM') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '180OH1234') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '180REGNOHM') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '180REGNNCM') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1219999') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '322PRODSBM') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161PRODNCL') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '316PRODNBM') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '8883563') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '3143185') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1219998') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '2202505') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '16113431') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1209999') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '110110') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '150TESTHIM') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '314PRODGGM') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '318PRODEBM') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '320PRODCCM') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '110DMDLEM2') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161COE1002') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161COE1012') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161COE1003') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161COE1010') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161COE1006') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161COE1007') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161COE1009') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161COE1004') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161COE1005') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161COE1011') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161COE1008') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161COE1001') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '16122449') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '15026890') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '15026894') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1709781') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1304217') AND ((NOT
(HCCLPNJ_T.PARTNER_EMP.NAME LIKE '%INBOUND%')) AND ((NOT
(HCCLPNJ_T.PARTNER_EMP.NAME LIKE '%OUTBOUND%')) AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1304216') AND (NOT
(HCCLPNJ_T.PARTNER_EMP.NAME LIKE
'%INTERFACE%'))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))
which is joined to Spool 27 (Last Use) by way of an all-rows scan.
HCCLPNJ_T.PARTNER_EMP and Spool 27 are joined using a single
partition hash join, with a join condition of ("USER_ID =
HCCLPNJ_T.PARTNER_EMP.USER_ID"). The result goes into Spool 29
(all_amps) (compressed columns allowed), which is built locally on
the AMPs. Then we do a SORT to order Spool 29 by the hash code of
(HCCLPNJ_T.PARTNER_EMP.USER_ID). The size of Spool 29 is
estimated with low confidence to be 336,579 rows (81,788,697
bytes). The estimated time for this step is 0.27 seconds.
6) We do an all-AMPs JOIN step from Spool 28 by way of a RowHash
match scan, which is joined to Spool 29 (Last Use) by way of a
RowHash match scan. Spool 28 and Spool 29 are joined using a
merge join, with a join condition of ("(USER_ID = CID) AND
(USER_ID = CID)"). The result goes into Spool 30 (all_amps)
(compressed columns allowed), which is redistributed by the hash
code of (HCCLPNJ_T.ACCESS_LOG.ACCESS_INSTANT,
HCCLPNJ_T.ACCESS_LOG.PROCESS_ID) to all AMPs. Then we do a SORT
to order Spool 30 by row hash. The size of Spool 30 is estimated
with low confidence to be 336,579 rows (79,096,065 bytes). The
estimated time for this step is 0.16 seconds.
7) We do an all-AMPs JOIN step from HCCLPNJ_T.ACC_LOG_DTL_IX in view
HCCLNJ.ACC_LOG_DTL_ix by way of a RowHash match scan with a
condition of ("NOT (HCCLPNJ_T.ACC_LOG_DTL_IX.STRING_VALUE IS NULL)"),
which is joined to Spool 30 (Last Use) by way of a RowHash match
scan. HCCLPNJ_T.ACC_LOG_DTL_IX and Spool 30 are joined using a
merge join, with a join condition of (
"(HCCLPNJ_T.ACC_LOG_DTL_IX.ACCESS_INSTANT = ACCESS_INSTANT) AND
(HCCLPNJ_T.ACC_LOG_DTL_IX.PROCESS_ID = PROCESS_ID)"). The input
table HCCLPNJ_T.ACC_LOG_DTL_IX will not be cached in memory, but
it is eligible for synchronized scanning. The result goes into
Spool 31 (all_amps) (compressed columns allowed), which is
redistributed by the hash code of (
HCCLPNJ_T.ACC_LOG_DTL_IX.STRING_VALUE) to all AMPs. The size of
Spool 31 is estimated with no confidence to be 448,597 rows (
410,466,255 bytes). The estimated time for this step is 10.59
seconds.
8) We do an all-AMPs JOIN step from HCCLPNJ_T.REPORT_DETAILS in view
HCCLNJ.REPORT_DETAILS by way of an all-rows scan with a condition
of ("HCCLPNJ_T.REPORT_DETAILS.REPORT_NAME <> 'IN BASKET: STANDARD
HEADER'"), which is joined to Spool 31 (Last Use) by way of an
all-rows scan. HCCLPNJ_T.REPORT_DETAILS and Spool 31 are joined
using a single partition hash join, with a join condition of (
"STRING_VALUE = HCCLPNJ_T.REPORT_DETAILS.LRP_ID"). The result
goes into Spool 25 (group_amps), which is built locally on the
AMPs. The size of Spool 25 is estimated with no confidence to be
448,469 rows (903,665,035 bytes). The estimated time for this
step is 0.31 seconds.
9) We do an all-AMPs RETRIEVE step from HCCLPNJ_T.REPORT_DETAILS in
view HCCLNJ.REPORT_DETAILS by way of an all-rows scan with a
condition of ("HCCLPNJ_T.REPORT_DETAILS.REPORT_NAME <> 'IN BASKET:
STANDARD HEADER'") locking for access into Spool 32 (all_amps)
(compressed columns allowed), which is duplicated on all AMPs.
The size of Spool 32 is estimated with high confidence to be
2,058,976 rows (43,238,496 bytes). The estimated time for this
step is 0.30 seconds.
10) We do an all-AMPs JOIN step from Spool 32 (Last Use) by way of an
all-rows scan, which is joined to HCCLPNJ_T.ACC_LOG_DTL_NI in view
HCCLNJ.ACC_LOG_DTL_ni by way of an all-rows scan locking
HCCLPNJ_T.ACC_LOG_DTL_NI for read. Spool 32 and
HCCLPNJ_T.ACC_LOG_DTL_NI are joined using a single partition hash_
join, with a join condition of (
"HCCLPNJ_T.ACC_LOG_DTL_NI.STRING_VALUE = LRP_ID"). The input
table HCCLPNJ_T.ACC_LOG_DTL_NI will not be cached in memory. The
result goes into Spool 33 (all_amps) (compressed columns allowed),
which is built locally on the AMPs. Then we do a SORT to order
Spool 33 by the hash code of (
HCCLPNJ_T.ACC_LOG_DTL_NI.ACCESS_INSTANT,
HCCLPNJ_T.ACC_LOG_DTL_NI.PROCESS_ID). The size of Spool 33 is
estimated with low confidence to be 15,292,523 rows (
2,171,538,266 bytes). The estimated time for this step is 5
minutes and 37 seconds.
11) We do an all-AMPs JOIN step from HCCLPNJ_T.ACCESS_LOG in view
HCCLNJ.ACCESS_LOG by way of a RowHash match scan with a condition
of ("(NOT (HCCLPNJ_T.ACCESS_LOG.WORKSTATION_ID IS NULL )) AND
((NOT (HCCLPNJ_T.ACCESS_LOG.USER_ID IS NULL )) AND ((NOT
(HCCLPNJ_T.ACCESS_LOG.METRIC_ID IS NULL )) AND
((HCCLPNJ_T.ACCESS_LOG.ACCESS_TIME >= TIMESTAMP '2010-10-04
00:00:00') AND ((HCCLPNJ_T.ACCESS_LOG.ACCESS_TIME < TIMESTAMP
'2010-10-11 00:00:00') AND (HCCLPNJ_T.ACCESS_LOG.METRIC_ID <>
90020002. )))))"), which is joined to Spool 33 (Last Use) by way
of a RowHash match scan locking HCCLPNJ_T.ACCESS_LOG for access.
HCCLPNJ_T.ACCESS_LOG and Spool 33 are joined using a merge join,
with a join condition of ("(ACCESS_INSTANT =
HCCLPNJ_T.ACCESS_LOG.ACCESS_INSTANT) AND (PROCESS_ID =
HCCLPNJ_T.ACCESS_LOG.PROCESS_ID)"). The input table
HCCLPNJ_T.ACCESS_LOG will not be cached in memory, but it is
eligible for synchronized scanning. The result goes into Spool 35
(all_amps) (compressed columns allowed), which is built locally on
the AMPs. The size of Spool 35 is estimated with low confidence
to be 15,292,523 rows (4,128,981,210 bytes). The estimated time
for this step is 4 minutes and 53 seconds.
12) We do an all-AMPs JOIN step from Spool 26 (Last Use) by way of an
all-rows scan, which is joined to Spool 35 (Last Use) by way of an
all-rows scan. Spool 26 and Spool 35 are joined using a single
partition hash join, with a join condition of ("METRIC_ID =
METRIC_ID"). The result goes into Spool 37 (all_amps) (compressed
columns allowed), which is redistributed by the hash code of (
HCCLPNJ_T.ACCESS_LOG.USER_ID) to all AMPs. Then we do a SORT to
order Spool 37 by row hash. The size of Spool 37 is estimated
with low confidence to be 25,964 rows (7,010,280 bytes). The
estimated time for this step is 0.88 seconds.
13) We do an all-AMPs JOIN step from Spool 28 (Last Use) by way of a
RowHash match scan, which is joined to Spool 37 (Last Use) by way
of a RowHash match scan. Spool 28 and Spool 37 are joined using a
merge join, with a join condition of ("USER_ID = CID"). The
result goes into Spool 38 (all_amps) (compressed columns allowed),
which is built locally on the AMPs. The size of Spool 38 is
estimated with low confidence to be 25,964 rows (7,217,992 bytes).
The estimated time for this step is 0.08 seconds.
14) We do an all-AMPs JOIN step from HCCLPNJ_T.PARTNER_EMP in view
HCCLNJ.PARTNER_EMP by way of a RowHash match scan with a condition
of ("(HCCLPNJ_T.PARTNER_EMP.USER_ID <> '16132411') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '16188673821') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '19016187') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '19099913') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '16118852') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '16132413') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1619173') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1619539') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '130HI50102') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '130HI50101') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '130HI50100') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1303627') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1303457') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1303484') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1303513') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1303625') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1303483') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1303588') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1303487') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '15034761') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '15034763') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '16113868') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '15034762') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '15017398') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1502916') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1506278') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '140BATCHRES') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '140BATCHPRE') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1213061') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '140BATCHEIP') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '140BATCHEH') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '140BATCHEC') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '140BATCHCAD') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1402040') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161999999') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1403430') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '15020125') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '88813001') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '17013001') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1803583') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '19017184') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '190109999') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '121TRNGSC3') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '121TRNGSC6') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '121TRNGSC7') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '121TRNGSC4') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '121TRNGSC5') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '121TRNGSC8') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '150REGNSCM') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '180OH1234') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '180REGNOHM') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '180REGNNCM') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1219999') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '322PRODSBM') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161PRODNCL') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '316PRODNBM') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '8883563') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '3143185') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1219998') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '2202505') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '16113431') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1209999') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '110110') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '150TESTHIM') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '314PRODGGM') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '318PRODEBM') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '320PRODCCM') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '110DMDLEM2') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161COE1002') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161COE1012') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161COE1003') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161COE1010') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161COE1006') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161COE1007') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161COE1009') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161COE1004') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161COE1005') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161COE1011') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161COE1008') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '161COE1001') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '16122449') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '15026890') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '15026894') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1709781') AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1304217') AND ((NOT
(HCCLPNJ_T.PARTNER_EMP.NAME LIKE '%INBOUND%')) AND ((NOT
(HCCLPNJ_T.PARTNER_EMP.NAME LIKE '%OUTBOUND%')) AND
((HCCLPNJ_T.PARTNER_EMP.USER_ID <> '1304216') AND (NOT
(HCCLPNJ_T.PARTNER_EMP.NAME LIKE
'%INTERFACE%'))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))
which IS joined TO SPOOL 38 (LAST USE) BY way OF a RowHash match
scan LOCKING HCCLPNJ_T.PARTNER_EMP FOR ACCESS.
HCCLPNJ_T.PARTNER_EMP AND SPOOL 38 are joined USING a MERGE JOIN,
WITH a JOIN CONDITION OF ("(HCCLPNJ_T.PARTNER_EMP.USER_ID = CID)
AND (USER_ID = HCCLPNJ_T.PARTNER_EMP.USER_ID)"). The RESULT goes
INTO SPOOL 25 (group_amps), which IS built locally ON the AMPs.
The SIZE OF SPOOL 25 IS estimated WITH NO confidence TO be 474,433
ROWS (955,982,495 BYTES). The estimated TIME FOR this step IS
0.22 seconds.
15) Finally, we send OUT an END TRANSACTION step TO ALL AMPs involved
IN processing the REQUEST.
-> The contents OF SPOOL 25 are sent back TO the USER AS the RESULT
OF STATEMENT 1. The total estimated TIME IS 25 minutes AND 22
seconds.
Senior Apprentice

Re: what are my options in tuning this kind of query. In real life what worked best for you

In your first post you wrote about outer joining 2 tables, but your explains locks 8 tables and there's no outer join in it.

Could you post the *original* query and explain.

Did you check the query log which step is actually using that much spool?

Dieter
Enthusiast

Re: what are my options in tuning this kind of query. In real life what worked best for you

Hi Dieter
Thx
It does have 2 LOJ's . The above explain refers to the following orignal query as below.
EXPLAIN SELECT CAST(

(CASE WHEN AL.ACCESS_INSTANT IS NOT NULL THEN TRIM(TRAILING '.' FROM TRIM(CAST(AL.ACCESS_INSTANT AS CHAR(23)))) ELSE '' END) ||' '||
(CASE WHEN AL.PROCESS_ID IS NOT NULL THEN TRIM(CAST(AL.PROCESS_ID AS CHAR(254))) ELSE '' END) ||' '||
(CASE WHEN AL.PROCESS_ID IS NOT NULL THEN TRIM(CAST(SUBSTR(AL.PROCESS_ID,1,POSITION('-' IN AL.PROCESS_ID)-1) AS CHAR(254))) ELSE '' END) ||' '||
(CASE WHEN AL.ACCESS_TIME IS NOT NULL THEN TRIM(CAST(CAST(AL.ACCESS_TIME AS TIMESTAMP(0) FORMAT 'MM/DD/YYYYbHH:MI:SS' ) AS CHAR(19))) ELSE '' END) ||' '||
(CASE WHEN AL.METRIC_ID IS NOT NULL THEN TRIM(TRAILING '.' FROM TRIM(CAST(AL.METRIC_ID AS CHAR(20)))) ELSE '' END) ||' '||
(CASE WHEN AL.USER_ID IS NOT NULL THEN TRIM(CAST(AL.USER_ID AS CHAR(18))) ELSE '' END) ||' '||
(CASE WHEN AL.WORKSTATION_ID IS NOT NULL THEN TRIM(CAST(AL.WORKSTATION_ID AS CHAR(254))) ELSE '' END) ||' '||
(CASE WHEN AL.PAT_ID IS NOT NULL THEN TRIM(CAST(AL.PAT_ID AS CHAR(18))) ELSE '' END) ||' '||
(CASE WHEN AL.CSN IS NOT NULL THEN TRIM(TRAILING '.' FROM TRIM(CAST(AL.CSN AS CHAR(20)))) ELSE '' END) ||' '||
(CASE WHEN ALD.DATA_MNEMONIC_ID IS NOT NULL THEN TRIM(CAST( ALD.DATA_MNEMONIC_ID AS CHAR(8))) ELSE '' END) ||' '||
(CASE WHEN ALD.STRING_VALUE IS NOT NULL THEN TRIM(CAST( ALD.STRING_VALUE AS CHAR(2000))) ELSE '' END) ||' '||
(CASE WHEN ALD.INTEGER_VALUE IS NOT NULL THEN TRIM(TRAILING '.' FROM TRIM(CAST(ALD.INTEGER_VALUE AS CHAR(20)))) ELSE '' END)

AS CHAR(1000) )
FROM HCCLNJ.ACCESS_LOG AL
LEFT OUTER JOIN HCCLNJ.ACC_LOG_DTL_ix ALD
ON ( ALD.ACCESS_INSTANT = AL.ACCESS_INSTANT
AND ALD.PROCESS_ID = AL.PROCESS_ID )

INNER JOIN HCCLNJ.EMP_MAP EMAP
ON ( AL.USER_ID = EMAP.CID )
AND AL.METRIC_ID <> '90020002'
AND AL.WORKSTATION_ID IS NOT NULL
AND (AL.ACCESS_TIME >= '2010-10-04 00:00:00'
AND AL.ACCESS_TIME < '2010-10-11 00:00:00')

INNER JOIN HCCLNJ.PARTNER_EMP EMP
ON ( EMP.USER_ID = EMAP.CID
AND EMP.USER_ID NOT IN (
'1304216', '1304217', '1709781', '15026894',
'15026890', '16122449', '161COE1001', '161COE1008',
'161COE1011', '161COE1005', '161COE1004', '161COE1009',
'161COE1007', '161COE1006', '161COE1010', '161COE1003',
'161COE1012', '161COE1002',

'110DMDLEM2', '320PRODCCM', '318PRODEBM', '314PRODGGM',
'150TESTHIM', '110110', '1209999', '16113431', '2202505',
'1219998', '3143185', '8883563', '316PRODNBM', '161PRODNCL',
'322PRODSBM', '1219999', '180REGNNCM', '180REGNOHM', '180OH1234',
'150REGNSCM', '121TRNGSC8', '121TRNGSC5', '121TRNGSC4',
'121TRNGSC7', '121TRNGSC6', '121TRNGSC3', '190109999', '19017184',
'1803583', '17013001', '88813001', '15020125', '1403430',

'161999999', '1402040', '140BATCHCAD', '140BATCHEC', '140BATCHEH',
'140BATCHEIP', '1213061', '140BATCHPRE', '140BATCHRES', '1506278',
'1502916', '15017398', '15034762', '16113868', '15034763', '15034761',
'1303487', '1303588', '1303483', '1303625', '1303513', '1303484',
'1303457', '1303627', '130HI50100', '130HI50101', '130HI50102',
'1619539', '1619173', '16132411', '16132413', '16118852', '19099913',
'19016187',

'16188673821'
))

INNER JOIN HCCLNJ.ACCESS_LOG_METRIC ALM
ON ( AL.METRIC_ID = ALM.METRIC_ID )

LEFT OUTER JOIN HCCLNJ.REPORT_DETAILS RD
ON STRING_VALUE = RD.LRP_ID
AND ALM.METRIC_NAME = 'MR_REPORTS'

WHERE EMP.NAME NOT LIKE ALL ('%INTERFACE%', '%INBOUND%', '%OUTBOUND%')
AND RD.REPORT_NAME <> 'IN BASKET: STANDARD HEADER'

UNION ALL

SELECT CAST(
(CASE WHEN AL.ACCESS_INSTANT IS NOT NULL THEN TRIM(TRAILING '.' FROM TRIM(CAST(AL.ACCESS_INSTANT AS CHAR(23)))) ELSE '' END) ||' '||
(CASE WHEN AL.PROCESS_ID IS NOT NULL THEN TRIM(CAST(AL.PROCESS_ID AS CHAR(254))) ELSE '' END) ||' '||
(CASE WHEN AL.PROCESS_ID IS NOT NULL THEN TRIM(CAST(SUBSTR(AL.PROCESS_ID,1,POSITION('-' IN AL.PROCESS_ID)-1) AS CHAR(254))) ELSE '' END) ||' '||
(CASE WHEN AL.ACCESS_TIME IS NOT NULL THEN TRIM(CAST(CAST(AL.ACCESS_TIME AS TIMESTAMP(0) FORMAT 'MM/DD/YYYYbHH:MI:SS' ) AS CHAR(19))) ELSE '' END) ||' '||
(CASE WHEN AL.METRIC_ID IS NOT NULL THEN TRIM(TRAILING '.' FROM TRIM(CAST(AL.METRIC_ID AS CHAR(20)))) ELSE '' END) ||' '||
(CASE WHEN AL.USER_ID IS NOT NULL THEN TRIM(CAST(AL.USER_ID AS CHAR(18))) ELSE '' END) ||' '||
(CASE WHEN AL.WORKSTATION_ID IS NOT NULL THEN TRIM(CAST(AL.WORKSTATION_ID AS CHAR(254))) ELSE '' END) ||' '||
(CASE WHEN AL.PAT_ID IS NOT NULL THEN TRIM(CAST(AL.PAT_ID AS CHAR(18))) ELSE '' END) ||' '||
(CASE WHEN AL.CSN IS NOT NULL THEN TRIM(TRAILING '.' FROM TRIM(CAST(AL.CSN AS CHAR(20)))) ELSE '' END) ||' '||
(CASE WHEN ALD.DATA_MNEMONIC_ID IS NOT NULL THEN TRIM(CAST( ALD.DATA_MNEMONIC_ID AS CHAR(8))) ELSE '' END) ||' '||
(CASE WHEN ALD.STRING_VALUE IS NOT NULL THEN TRIM(CAST( ALD.STRING_VALUE AS CHAR(2000))) ELSE '' END) ||' '||
(CASE WHEN ALD.INTEGER_VALUE IS NOT NULL THEN TRIM(TRAILING '.' FROM TRIM(CAST(ALD.INTEGER_VALUE AS CHAR(20)))) ELSE '' END)

AS CHAR(1000))
FROM HCCLNJ.ACCESS_LOG AL
LEFT OUTER JOIN HCCLNJ.ACC_LOG_DTL_ni ALD
ON ( ALD.ACCESS_INSTANT = AL.ACCESS_INSTANT
AND ALD.PROCESS_ID = AL.PROCESS_ID )

INNER JOIN HCCLNJ.EMP_MAP EMAP
ON ( AL.USER_ID = EMAP.CID )
AND AL.METRIC_ID <> '90020002'
AND AL.WORKSTATION_ID IS NOT NULL
AND (AL.ACCESS_TIME >= '2010-10-04 00:00:00'
AND AL.ACCESS_TIME < '2010-10-11 00:00:00')

INNER JOIN HCCLNJ.PARTNER_EMP EMP
ON ( EMP.USER_ID = EMAP.CID
AND EMP.USER_ID NOT IN (
'1304216', '1304217', '1709781', '15026894',
'15026890', '16122449', '161COE1001', '161COE1008',
'161COE1011', '161COE1005', '161COE1004', '161COE1009',
'161COE1007', '161COE1006', '161COE1010', '161COE1003',
'161COE1012', '161COE1002',

'110DMDLEM2', '320PRODCCM', '318PRODEBM', '314PRODGGM',
'150TESTHIM', '110110', '1209999', '16113431', '2202505',
'1219998', '3143185', '8883563', '316PRODNBM', '161PRODNCL',
'322PRODSBM', '1219999', '180REGNNCM', '180REGNOHM', '180OH1234',
'150REGNSCM', '121TRNGSC8', '121TRNGSC5', '121TRNGSC4',
'121TRNGSC7', '121TRNGSC6', '121TRNGSC3', '190109999', '19017184',
'1803583', '17013001', '88813001', '15020125', '1403430',

'161999999', '1402040', '140BATCHCAD', '140BATCHEC', '140BATCHEH',
'140BATCHEIP', '1213061', '140BATCHPRE', '140BATCHRES', '1506278',
'1502916', '15017398', '15034762', '16113868', '15034763', '15034761',
'1303487', '1303588', '1303483', '1303625', '1303513', '1303484',
'1303457', '1303627', '130HI50100', '130HI50101', '130HI50102',
'1619539', '1619173', '16132411', '16132413', '16118852', '19099913',
'19016187',

'16188673821'
))

INNER JOIN HCCLNJ.ACCESS_LOG_METRIC ALM
ON ( AL.METRIC_ID = ALM.METRIC_ID )

LEFT OUTER JOIN HCCLNJ.REPORT_DETAILS RD
ON ALD.STRING_VALUE = RD.LRP_ID
AND ALM.METRIC_NAME = 'MR_REPORTS'

WHERE EMP.NAME NOT LIKE ALL ('%INTERFACE%', '%INBOUND%', '%OUTBOUND%')
AND RD.REPORT_NAME <> 'IN BASKET: STANDARD HEADER';
Enthusiast

Re: what are my options in tuning this kind of query. In real life what worked best for you

its a regular nested views structure so , in the explain "_t" refers to the corresponding view's base table.
Senior Apprentice

Re: what are my options in tuning this kind of query. In real life what worked best for you

You wrote Outer Join, but the result is the same as an Inner Join, due to wrong join order and wrong where-condition. The optimizer is smart enough to recognize that and replaces outer with inner joins.
You should read the "outer join case study" in the DML SQL manual.

I repeat my question:
Did you check the query log which step is actually using that much spool?

According to explain the maximum estimated spool size is 4GB+, which is far away from 5.3TB

And another question, why do you cast a value to a char and then trim it to a varchar again (instead of directly casting to a varchar)? And finally cast it to a char(1000)?

Dieter

Enthusiast

Re: what are my options in tuning this kind of query. In real life what worked best for you

Hi Dieter
This is the developers artwork. Unless I get the business requirements of the query , I can only conjecture. From what I see, he is trying to Fastexport data using '|' as a separator, because you cannot use an elective separator in Fastexport . Notice that he trims to char and not varchar . He finally concatenates everything to a char(1000) to give it a fixed length record.

<> Is there any way to optimize this" not in " part
AND EMP.USER_ID NOT IN (
'1304216', '1304217', '1709781', '15026894',
'15026890', '16122449', '161COE1001', '161COE1008',
'161COE1011', '161COE1005', '161COE1004', '161COE1009',
'161COE1007', '161COE1006', '161COE1010', '161COE1003',
'161COE1012', '161COE1002',

'110DMDLEM2', '320PRODCCM', '318PRODEBM', '314PRODGGM',
'150TESTHIM', '110110', '1209999', '16113431', '2202505',
'1219998', '3143185', '8883563', '316PRODNBM', '161PRODNCL',
'322PRODSBM', '1219999', '180REGNNCM', '180REGNOHM', '180OH1234',
'150REGNSCM', '121TRNGSC8', '121TRNGSC5', '121TRNGSC4',
'121TRNGSC7', '121TRNGSC6', '121TRNGSC3', '190109999', '19017184',
'1803583', '17013001', '88813001', '15020125', '1403430',

'161999999', '1402040', '140BATCHCAD', '140BATCHEC', '140BATCHEH',
'140BATCHEIP', '1213061', '140BATCHPRE', '140BATCHRES', '1506278',
'1502916', '15017398', '15034762', '16113868', '15034763', '15034761',
'1303487', '1303588', '1303483', '1303625', '1303513', '1303484',
'1303457', '1303627', '130HI50100', '130HI50101', '130HI50102',
'1619539', '1619173', '16132411', '16132413', '16118852', '19099913',
'19016187',

'16188673821'
))

<> Any way he can do a block level export without using the above '|' approach so he can get delimited format data . ( e.g. '|' delimited)
Thx in advance
D
Senior Apprentice

Re: what are my options in tuning this kind of query. In real life what worked best for you

Why do you think the NOT IN is causing a problem?
It's just a where-condition.

FastExport can't create VARTEXT, so the concat '|' is the right approach.
But the final cast to char adds trailing blanks, which are actually written to the export file. This is done to avoid the two byte varchar length at the beginning of the reord.
The common solution to this problem is an Outmod remving the first two bytes, this is really easy to program.

Dieter
Dieter