Merge vs Insert-Select

Database
Enthusiast

Merge vs Insert-Select

I am trying to insert into Table A from B & C all having same PI.Please let me know difference in 2 approaches .Generally when PI are same it shud be all AMP merge with no use of spool,but here spool was used.Also how merge operation doesn't use any spool????

INSERT INTO A
SELECT
*
FROM
B /* THE DAILY TABLE */

;INSERT INTO A
SELECT *
FROM
C

Explanation

5) We do an all-AMPs RETRIEVE step from
B by way of an all-rows scan
with no residual conditions into Spool 1 (all_amps), which is
built locally on the AMPs. The size of Spool 1 is estimated with
high confidence to be 992,142 rows (127,986,318 bytes). The
estimated time for this step is 0.05 seconds.
6) We do an all-AMPs RETRIEVE step from
C by way of an all-rows
scan with no residual conditions into Spool 1 (all_amps), which is
built locally on the AMPs. The size of Spool 1 is estimated with
high confidence to be 682,647,519 rows (88,061,529,951 bytes).
The estimated time for this step is 18.36 seconds.
7) We do a SORT to order Spool 1 by row hash.
8) We do an all-AMPs MERGE into
A from Spool 1 (Last
Use). The size is estimated with high confidence to be
682,647,519 rows. The estimated time for this step is 2 hours and
21 minutes.
9) We spoil the parser's dictionary cache for the table.
10) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> No rows are returned to the user as the result of statement 1.
No rows are returned to the user as the result of statement 2.

Modified Query

merge into A a
using B b
on(
a.EVENT_ID=b.EVENT_ID and
a.WTN_NPA_CD=b.WTN_NPA_CD and
a.WTN_NXX_CD=b.WTN_NXX_CD and
a.WTN_LINE_NBR=b.WTN_LINE_NBR and
a.CIRCUIT_NBR=b.CIRCUIT_NBR
)
when not matched then insert
(
EVENT_ID
,PRODUCT_ID
,WTN_NPA_CD
,WTN_NXX_CD
,WTN_LINE_NBR
,CIRCUIT_NBR
,USOC
,PRODUCT_INDICATOR_CD
,SUBCLASS_OF_SERVICE_CD
,INWARD_OUTWARD_CD
,PRODUCT_ACTION_CD
,DATA_TRANSMIT_SPEED_TXT
,ORIGINATING_PRODUCT_SALES_CD
,SEQUENCE_NBR
,ITEM_QTY
,CONTRACT_ID
,CONTRACT_START_DT
,CONTRACT_EXPIRATION_DT
,CONTRACT_TERM_DUR
,ACTION_NBR
,DATA_SOURCE_CD
,PRG_UNIT_FLAG_IND /* ADDED 4/2006 */
,DW_LOAD_DT
,DW_LOAD_TM
)
values
(
b.EVENT_ID
,b.PRODUCT_ID
,b.WTN_NPA_CD
,b.WTN_NXX_CD
,b.WTN_LINE_NBR
,b.CIRCUIT_NBR
,b.USOC
,b.PRODUCT_INDICATOR_CD
,b.SUBCLASS_OF_SERVICE_CD
,b.INWARD_OUTWARD_CD
,b.PRODUCT_ACTION_CD
,b.DATA_TRANSMIT_SPEED_TXT
,b.ORIGINATING_PRODUCT_SALES_CD
,b.SEQUENCE_NBR
,b.ITEM_QTY
,b.CONTRACT_ID
,b.CONTRACT_START_DT
,b.CONTRACT_EXPIRATION_DT
,b.CONTRACT_TERM_DUR
,b.ACTION_NBR
,b.DATA_SOURCE_CD
,b.PRG_UNIT_FLAG_IND /* ADDED 4/2006 */
,DATE
,TIME )

; merge into A a
using C b
on
(
a.EVENT_ID=b.EVENT_ID and
a.WTN_NPA_CD=b.WTN_NPA_CD and
a.WTN_NXX_CD=b.WTN_NXX_CD and
a.WTN_LINE_NBR=b.WTN_LINE_NBR and
a.CIRCUIT_NBR=b.CIRCUIT_NBR
)
when not matched then insert
(
EVENT_ID
,PRODUCT_ID
,WTN_NPA_CD
,WTN_NXX_CD
,WTN_LINE_NBR
,CIRCUIT_NBR
,USOC
,PRODUCT_INDICATOR_CD
,SUBCLASS_OF_SERVICE_CD
,INWARD_OUTWARD_CD
,PRODUCT_ACTION_CD
,DATA_TRANSMIT_SPEED_TXT
,ORIGINATING_PRODUCT_SALES_CD
,SEQUENCE_NBR
,ITEM_QTY
,CONTRACT_ID
,CONTRACT_START_DT
,CONTRACT_EXPIRATION_DT
,CONTRACT_TERM_DUR
,ACTION_NBR
,DATA_SOURCE_CD
,PRG_UNIT_FLAG_IND /* ADDED 4/2006 */
,DW_LOAD_DT
,DW_LOAD_TM
)
values
(
b.EVENT_ID
,b.PRODUCT_ID
,b.WTN_NPA_CD
,b.WTN_NXX_CD
,b.WTN_LINE_NBR
,b.CIRCUIT_NBR
,b.USOC
,b.PRODUCT_INDICATOR_CD
,b.SUBCLASS_OF_SERVICE_CD
,b.INWARD_OUTWARD_CD
,b.PRODUCT_ACTION_CD
,b.DATA_TRANSMIT_SPEED_TXT
,b.ORIGINATING_PRODUCT_SALES_CD
,b.SEQUENCE_NBR
,b.ITEM_QTY
,b.CONTRACT_ID
,b.CONTRACT_START_DT
,b.CONTRACT_EXPIRATION_DT
,b.CONTRACT_TERM_DUR
,b.ACTION_NBR
,b.DATA_SOURCE_CD
,b.PRG_UNIT_FLAG_IND /* ADDED 4/2006 */
,DATE
,TIME )

Explanation
1) First, we lock a distinct NEMO_WORKING."pseudo table" for read on
a RowHash to prevent global deadlock for
B.
2) Next, we lock a distinct SERVICE_ORDER."pseudo table" for write on
a RowHash to prevent global deadlock for
A.
3) We lock a distinct SERVICE_ORDER."pseudo table" for read on a
RowHash to prevent global deadlock for
C.
4) We lock B for read, we lock
A for write, and we
lock C for read.
5) We do an all-AMPs merge with unmatched inserts into
A from
B with a condition of (
"(A.EVENT_ID =
B.EVENT_ID) AND
((A.WTN_NPA_CD =
B.WTN_NPA_CD) AND
((A.WTN_NXX_CD =
B.WTN_NXX_CD) AND
((A.WTN_LINE_NBR =
B.WTN_LINE_NBR) AND
(A.CIRCUIT_NBR =
B.CIRCUIT_NBR ))))"). The
number of rows merged is estimated with low confidence to be
992,142 rows.
6) We do an all-AMPs merge with unmatched inserts into
A from
C with a condition of (
"(A.EVENT_ID =
C.EVENT_ID) AND
((A.WTN_NPA_CD =
C.WTN_NPA_CD) AND
((A.WTN_NXX_CD =
C.WTN_NXX_CD) AND
((A.WTN_LINE_NBR =
C.WTN_LINE_NBR) AND
(A.CIRCUIT_NBR =
C.CIRCUIT_NBR ))))").
The number of rows merged is estimated with low confidence to be
682,647,519 rows.
7) We spoil the parser's dictionary cache for the table.
8) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> No rows are returned to the user as the result of statement 1.
No rows are returned to the user as the result of statement
1 REPLY
Enthusiast

Re: Merge vs Insert-Select

Currently, the only case where an insert-select is executed without using an intermediate spool is when the source and target tables have EXACTLY the same column types. This is due to the fact that the bulk insert step can only work with pre-built rows, so if any reformatting or conversion of the source row is necessary, it must be done by a separate step. The merge statement, on the other hand, uses the new muti-row-merge step, which was introduced in Teradata 12.0 specifically to support the merge statement. The new step always builds new rows by assigning values column-wise from the source, with appropriate implicit casts.