We are doing an INSERT INTO ... SELECT from a source table to a backup table. The SELECT part runs fine, whereas the INSERT is taking a long time and using high CPU. Based on the parameters and EXPLAIN below, please suggest how to tune it.
Explain INSERT INTO ARCH.GLOBAL_BKP
WHERE ( PERD_ID/100) IN (2013,2014)
This query is optimized using type 2 profile DR148674, profileid 10001.
1) First, we lock a distinct ARCH."pseudo table" for write on
a RowHash to prevent global deadlock for
2) Next, we lock ARCH.GLOBAL_BKP_X in view
ARCH.GLOBAL_BKP for write, and we lock
ODS.GLOBAL_SOURCE_X in view
ODS.GLOBAL_SOURCE for access.
3) We do an all-AMPs RETRIEVE step from ODS.GLOBAL_SOURCE_X
in view ODS.GLOBAL_SOURCE by way of an all-rows scan
with a condition of ("((ODS.GLOBAL_SOURCE_X in view
ODS.GLOBAL_SOURCE.PERD_ID / 100 )= 2013) OR
((ODS.GLOBAL_SOURCE_X in view
ODS.GLOBAL_SOURCE.PERD_ID / 100 )= 2014)") into Spool
1 (all_amps) (compressed columns allowed), which is built locally
on the AMPs. The input table will not be cached in memory, but it
is eligible for synchronized scanning. The size of Spool 1 is
estimated with no confidence to be 43,523,644 rows (
189,023,185,892 bytes). The estimated time for this step is 9
minutes and 37 seconds.
4) We do an all-AMPs MERGE into ARCH.GLOBAL_BKP_X in view
ARCH.GLOBAL_BKP from Spool 1 (Last Use).
The size is estimated with no confidence to be 43,523,644 rows.
The estimated time for this step is 5 hours and 55 minutes.
5) We spoil the parser's dictionary cache for the table.
6) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> No rows are returned to the user as the result of statement 1.
Of course you don't need any statistics for a simple INSERT/SELECT, but if you want more accurate estimates you must rewrite the WHERE condition, e.g.
WHERE PERD_ID BETWEEN 201300 AND 201499
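For an integer PERD_ID, that range selects exactly the same rows as (PERD_ID/100) IN (2013,2014), but because it compares the column directly it lets the optimizer use collected statistics and, on a table partitioned by PERD_ID, do partition elimination. A sketch of the rewritten statement (assuming a plain SELECT *; adjust the column list to the real DDL):

```sql
-- Rewritten INSERT/SELECT; BETWEEN 201300 AND 201499 is equivalent to
-- (PERD_ID/100) IN (2013,2014) for integer PERD_ID values
-- (201300/100 = 2013 ... 201499/100 = 2014), but is sargable:
INSERT INTO ARCH.GLOBAL_BKP
SELECT *
FROM ODS.GLOBAL_SOURCE
WHERE PERD_ID BETWEEN 201300 AND 201499;
```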
If both tables are MULTISET, the high CPU can't be due to duplicate-row checks, but the target table might be compressed (BLC, or lots of ALC/MVC columns).
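One way to check for multi-value compression on the target is the data dictionary; a sketch, assuming the standard DBC views (BLC would have to be checked separately, e.g. via Ferret SHOWBLOCKS or the table's DDL):

```sql
-- List target-table columns that carry a value-compression (MVC) list;
-- compressing rows during the MERGE step costs CPU:
SELECT ColumnName, CompressValueList
FROM DBC.ColumnsV
WHERE DatabaseName = 'ARCH'
  AND TableName    = 'GLOBAL_BKP_X'
  AND CompressValueList IS NOT NULL;
```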
What's the actual number of rows and CPU?
Record counts:
ARCH.GLOBAL_BKP: 284,167,734
ODS.GLOBAL_SOURCE (with (PERD_ID/100) IN (2013,2014)): 205,975,537 (rows that will be inserted)
After using PERD_ID BETWEEN 201300 AND 201499, the EXPLAIN shows an accurate estimate with high confidence. However, the estimated time for the MERGE step has increased:
The estimated time for this step is 27 hours and 59 minutes.
(I have no access to run the insert to check the actual time it now takes.)
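Even without rerunning the insert, the actual CPU and elapsed time of the earlier run can be pulled from DBQL, assuming query logging was enabled; a sketch using the DBC.QryLogV view:

```sql
-- Actual resource usage of the original INSERT/SELECT from DBQL
-- (QueryText in QryLogV is truncated, but enough to match the prefix):
SELECT QueryID, StartTime, FirstRespTime, AMPCPUTime, TotalIOCount
FROM DBC.QryLogV
WHERE QueryText LIKE 'INSERT INTO ARCH.GLOBAL_BKP%'
ORDER BY StartTime DESC;
```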
3) We do an all-AMPs RETRIEVE step from 6171 partitions of
ODS.GLOBAL_SOURCE_X in view
ODS.GLOBAL_SOURCE with a condition of (
"(ODS.GLOBAL_SOURCE_X in view
ODS.GLOBAL_SOURCE.PERD_ID <= 201499) AND
(ODS.GLOBAL_SOURCE_X in view
ODS.GLOBAL_SOURCE.PERD_ID >= 201300)") into Spool 1
(all_amps) (compressed columns allowed), which is built locally on
the AMPs. The input table will not be cached in memory, but it is
eligible for synchronized scanning. The result spool file will
not be cached in memory. The size of Spool 1 is estimated with
high confidence to be 205,975,537 rows (894,551,757,191 bytes).
The estimated time for this step is 32 minutes and 10 seconds.
4) We do an all-AMPs MERGE into ARCH.GLOBAL_BKP_X in view
ARCH.GLOBAL_BKP from Spool 1 (Last Use).
The size is estimated with high confidence to be 205,975,537 rows.
The estimated time for this step is 27 hours and 59 minutes.