Avoid Product Join Step

Hi,

Could you please suggest how to avoid this product join step, and explain what it means?

How should I generally interpret steps like this? All statistics on these columns appear to be up to date.

17) We do an all-AMPs JOIN step from Spool 26 (Last Use) by way of an
    all-rows scan, which is joined to Spool 27 (Last Use) by way of an
    all-rows scan.  Spool 26 and Spool 27 are joined using a product
    join, with a join condition of ("(CLIENT_NBR = CLIENT_NBR) AND
    (POP_ID = POP_ID)").  The result goes into Spool 15 (all_amps)
    (compressed columns allowed), which is built locally on the AMPs.
    The result spool file will not be cached in memory.  The size of
    Spool 15 is estimated with no confidence to be 252,638,840,002
    rows (108,887,340,040,862 bytes).  The estimated time for this
    step is 3 hours and 36 minutes.

Thank you.





Re: Avoid Product Join Step

I will supply the lowest-hanging fruit!  :-)

Statistics
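
For example, a hedged sketch of collecting stats on the join columns (the database and table names below are placeholders, not taken from the original query):

   -- Placeholder object names; point these at the real tables behind Spool 26 and Spool 27.
   COLLECT STATISTICS COLUMN (CLIENT_NBR) ON MyDB.MyTable;
   COLLECT STATISTICS COLUMN (POP_ID) ON MyDB.MyTable;
   -- A multi-column statistic can also help the optimizer estimate the combined join condition.
   COLLECT STATISTICS COLUMN (CLIENT_NBR, POP_ID) ON MyDB.MyTable;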


Re: Avoid Product Join Step

As there's a good join condition, the product join was chosen by the optimizer because it considered it the best way to get the result.

Did you check in DBQL whether the actual number of rows is really in that range?

What are the steps producing spool 26 and 27?
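
A hedged sketch of such a check, assuming step-level logging is enabled (BEGIN QUERY LOGGING WITH STEPINFO ...) and that your release exposes the estimated and actual row counts in DBC.QryLogStepsV (column names can vary by version):

   SELECT StepLev1Num,
          StepName,
          EstRowCount,   -- optimizer's estimate for the step
          RowCount       -- rows the step actually produced
   FROM DBC.QryLogStepsV
   WHERE QueryID = 123456789012345   -- placeholder QueryID of the slow request
   ORDER BY StepLev1Num;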

Re: Avoid Product Join Step

Hi Dieter,

The actual rows are not in that range, and the stats on columns CLIENT_NBR and POP_ID are up to date.

Here are the steps for spool 26 and 27.

Spool 26:

   2) We do a group-AMP RETRIEVE step from Spool 2 (Last Use) by way
      of an all-rows scan with a condition of (
      "(population_client_user.USER_ID = 'XWL5KAM') AND
      ((population_client_user.CLIENT_NBR = 5610) AND
      (population_client_user.POP_ID = 0 ))") into Spool 26
      (all_amps) (compressed columns allowed), which is duplicated
      on all AMPs.  The size of Spool 26 is estimated with no
      confidence to be 2,841,072 rows (102,278,592 bytes).  The
      estimated time for this step is 0.08 seconds.

Spool 27:

  16) We do an all-AMPs JOIN step from Spool 24 (Last Use) by way of a
      RowHash match scan, which is joined to Spool 25 (Last Use) by way
      of a RowHash match scan.  Spool 24 and Spool 25 are joined using a
      merge join, with a join condition of ("SK_HICL_CDE = SK_HICL_cde").
      The result goes into Spool 27 (all_amps) (compressed columns
      allowed), which is built locally on the AMPs.  The size of Spool
      27 is estimated with no confidence to be 110,976,868 rows (
      46,832,238,296 bytes).  The estimated time for this step is 7.10
      seconds.

Thank you.


Re: Avoid Product Join Step

Both steps show "no confidence"; are you sure that the necessary stats exist?
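
One hedged way to check which stats exist and when they were last collected (the database name is a placeholder; population_client_user is taken from the Explain above):

   -- HELP STATISTICS lists each collected statistic with its collection date and time.
   HELP STATISTICS MyDB.population_client_user;
   -- SHOW STATISTICS returns the COLLECT STATISTICS statements currently defined (TD 14.0+).
   SHOW STATISTICS ON MyDB.population_client_user;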

Re: Avoid Product Join Step

Hi Dieter,

The stats on these columns are up to date; I double-checked. What could be the issue?

Is there any way to find stale stats on a table?

Thank you.
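
One hedged way to look for stale stats, assuming Teradata 14.10 or later where DBC.StatsV is available (exact column names can differ by release):

   -- Statistics not refreshed in the last 30 days; the database name and threshold are illustrative.
   SELECT DatabaseName,
          TableName,
          ColumnName,
          LastCollectTimeStamp
   FROM DBC.StatsV
   WHERE DatabaseName = 'MyDB'
     AND LastCollectTimeStamp < CURRENT_TIMESTAMP - INTERVAL '30' DAY
   ORDER BY LastCollectTimeStamp;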


Re: Avoid Product Join Step

You need to show the actual query plus explain to find out.

Re: Avoid Product Join Step

Hi

 

I have the query below, which reads 811,299,377 records, but the final output is expected to be just 40 rows.

Row counts: DB1.TBL2 --> 811,299,377 and DB1.TBL1 --> 77.

The total time taken by this query is over 9 hours.

Please help me tune this query.

 

INSERT INTO DB1.TBL1
(
    COL1
,   COL2
,   COL3
,   COL4
,   COL5
)
WITH STG AS
(
    SELECT COALESCE(TBL2.COL1, -1) AS COL1
    ,      SYSLIB.HASH_MD5(UPPER(TRIM(COALESCE(TBL2.COL1, -1)) || '|')) AS COL2
    ,      'RIS' AS COL3
    ,      1 AS COL4
    ,      CURRENT_TIMESTAMP AS COL5
    FROM DB1.TBL2
    GROUP BY 1
)
SELECT STG.COL1
,      STG.COL2
,      STG.COL3
,      STG.COL4
,      STG.COL5
FROM STG
WHERE STG.COL2 NOT IN
(
    SELECT H.COL2
    FROM DB1.TBL1 H
    WHERE STG.COL2 = H.COL2
    GROUP BY H.COL2
)
GROUP BY 1,2,3,4,5;
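
As a hedged sketch only (assuming COL2 is not nullable on either side; verify the results and the Explain before using it), the correlated NOT IN plus the redundant GROUP BYs can be written as a NOT EXISTS, which often gives the optimizer a simpler plan:

   -- Illustrative rewrite, not a confirmed fix for this workload.
   INSERT INTO DB1.TBL1 (COL1, COL2, COL3, COL4, COL5)
   SELECT STG.COL1
   ,      STG.COL2
   ,      'RIS'
   ,      1
   ,      CURRENT_TIMESTAMP
   FROM
   (
       SELECT COALESCE(TBL2.COL1, -1) AS COL1
       ,      SYSLIB.HASH_MD5(UPPER(TRIM(COALESCE(TBL2.COL1, -1)) || '|')) AS COL2
       FROM DB1.TBL2
       GROUP BY 1, 2
   ) STG
   WHERE NOT EXISTS
   (
       SELECT 1
       FROM DB1.TBL1 H
       WHERE H.COL2 = STG.COL2
   );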

Re: Avoid Product Join Step

Hi,

 

To help tune this query, we're really going to need to see the explain plan and the table definitions.

 

Can you post those please?

 

Cheers,

Dave

 

P.S. IMHO it would also be better to start a new topic for a new query/question.

Ward Analytics Ltd - information in motion
www: http://www.ward-analytics.com