Need to increase the query performance

Database

Need to increase the query performance

Hello All,

I am facing problem to load 5million records into my database table.

Source data comming through flat file, then load into stagging table using Mload. This process completed in less than 2mins.

Then I am trying to load stagging data into Target table, the job take days to complete.

So I splitted the files into 50,000 rows and then tried to load, Now each file (50,000 rows) take more than 5hrs to complete.

The target table already having 1million records. 

My table structure is like this

CREATE SET TABLE PAMKTB.CUSTOMER_POS ,FALLBACK ,

     NO BEFORE JOURNAL,

     NO AFTER JOURNAL,

     CHECKSUM = DEFAULT,

     DEFAULT MERGEBLOCKRATIO

     (

      CUSTOMER_NAME CHAR(4) CHARACTER SET UNICODE NOT CASESPECIFIC NOT NULL,

      VENDOR_CODE VARCHAR(10) CHARACTER SET UNICODE NOT CASESPECIFIC NOT NULL,

      PRODUCT_LINE CHAR(3) CHARACTER SET UNICODE NOT CASESPECIFIC NOT NULL,

      DATE_TRANSMISSION DATE FORMAT 'YYYY-MM-DD',

      WEEK_FISCAL SMALLINT NOT NULL,

      PERIOD_FISCAL SMALLINT NOT NULL,

      YEAR_FISCAL VARCHAR(4) CHARACTER SET UNICODE NOT CASESPECIFIC NOT NULL,

      QTR_FISCAL SMALLINT NOT NULL,

      CHANNEL VARCHAR(10) CHARACTER SET UNICODE NOT CASESPECIFIC,

      PART_NUMBER_RETAILER VARCHAR(10) CHARACTER SET UNICODE NOT CASESPECIFIC NOT NULL,

      PART_NUMBER_VENDOR VARCHAR(10) CHARACTER SET UNICODE NOT CASESPECIFIC,

      PRODUCT_DESCRIPTION VARCHAR(50) CHARACTER SET UNICODE NOT CASESPECIFIC,

      STOCKING_LOCATION_CODE CHAR(3) CHARACTER SET UNICODE NOT CASESPECIFIC,

      DISCONTINUED_FLAG CHAR(1) CHARACTER SET UNICODE NOT CASESPECIFIC,

      LOCATION_NUMBER VARCHAR(10) CHARACTER SET UNICODE NOT CASESPECIFIC NOT NULL,

      LOCATION_TYPE VARCHAR(10) CHARACTER SET UNICODE NOT CASESPECIFIC NOT NULL,

      TRANSACTION_TYPE VARCHAR(10) CHARACTER SET UNICODE NOT CASESPECIFIC,

      SALES_DOLLARS_GROSS DECIMAL(6,2),

      SALES_UNIT_GROSS INTEGER,

      SALES_TRANS_PROD_COUNT SMALLINT,

      RETURN_DOLLARS DECIMAL(6,2),

      RETURN_TRANS_PROD_COUNT SMALLINT,

      RETURN_UNITS SMALLINT,

      POS_TRAN_PRODUCT_COUNT SMALLINT,

      SALE_DOLLARS_NET DECIMAL(6,2),

      SALES_UNITS_NET INTEGER)

PRIMARY INDEX ( CUSTOMER_NAME );

I am using below bteq script to load the data into CUSTOMER_POS table

.RUN FILE $LOGON;

.IF ERRORCODE <> 0 THEN .EXIT ERRORCODE;

INSERT INTO $TG_DB_NAME.CUSTOMER_POS

(CUSTOMER_NAME

,VENDOR_CODE

,PRODUCT_LINE

,DATE_TRANSMISSION

,WEEK_FISCAL

,PERIOD_FISCAL

,YEAR_FISCAL

,QTR_FISCAL

,CHANNEL

,TRANSACTION_TYPE

,PART_NUMBER_RETAILER

,PART_NUMBER_VENDOR

,PRODUCT_DESCRIPTION

,STOCKING_LOCATION_CODE

,DISCONTINUED_FLAG

,LOCATION_NUMBER

,LOCATION_TYPE

,SALES_DOLLARS_GROSS

,SALES_UNIT_GROSS

,SALES_TRANS_PROD_COUNT

,RETURN_UNITS

,RETURN_DOLLARS

,RETURN_TRANS_PROD_COUNT

,POS_TRAN_PRODUCT_COUNT

,SALE_DOLLARS_NET

,SALES_UNITS_NET

)

SELECT

'ADVC' AS CUSTOMER_NAME

,'492' AS VENDOR_CODE

,'EC' AS PRODUCT_LINE

,CAST(CURRENT_DATE AS DATE FORMAT 'YYYY-MM-DD') AS DATE_TRANSMISSION

,WEEK_FISCAL

,PERIOD_FISCAL

,YEAR_FISCAL

,QTR_FISCAL

,CHANNEL

,' ' AS TRANSACTION_TYPE

,PART_NUMBER_RETAILER

,PART_NUMBER_VENDOR

,PRODUCT_DESCRIPTION

,STOCKING_LOCATION_CODE

,DISCONTINUED_FLAG

,LOCATION_NUMBER

,'CORP STORE' AS LOCATION_TYPE

,SALES_DOLLARS_GROSS

,SALES_UNIT_GROSS

,SALES_TRANS_PROD_COUNT

,RETURN_UNITS

,RETURN_DOLLARS

,RETURN_TRANS_PROD_COUNT

,POS_TRAN_PRODUCT_COUNT

,CAST(CAST(SALES_DOLLARS_GROSS AS DECIMAL(10,2))-CAST(RETURN_DOLLARS AS DECIMAL(10,2)) AS DECIMAL(10,2)) AS SALE_DOLLARS_NET

,CAST(CAST(SALES_UNIT_GROSS AS DECIMAL(10,0))-CAST(RETURN_UNITS AS DECIMAL (10,0)) AS DECIMAL(10,0)) AS SALES_UNITS_NET

FROM $ST_DB_NAME.CUSTOMER_POS_STG;

.IF ERRORCODE <> 0 THEN .EXIT ERRORCODE;

DELETE FROM $ST_DB_NAME.CUSTOMER_POS_STG;

.IF ERRORCODE <> 0 THEN .EXIT ERRORCODE;

COLLECT STATISTICS $TG_DB_NAME.CUSTOMER_POS INDEX (CUSTOMER_NAME);

.IF ERRORCODE <> 0 THEN .EXIT ERRORCODE;

.LOGOFF;

Can some one please help me in this problem.

Thanks in Advance...

Regards,

Sekhar 

2 REPLIES
Junior Contributor

Re: Need to increase the query performance

Hi Sekhar,

you're inserting 5 million rows with the same PI to a SET table, that's worst case.

Changing the table to be MULTISET would help, but you better change the PI to something usefull. 

Enthusiast

Re: Need to increase the query performance

 As suggested by doneth, Change the target table(PAMKTB.CUSTOMER_POS) as a MULTISET TABLE. Then in your select statement add group by condition to remove duplicate records getting inserted to the table. This will load your table in maximum of 10-15 mins.

Your query is taking time because of the SET table and customer_name is not unique primary index, causing the inserted rows to be checked multiple times for same customer name.