Keep Duplicate Rows

Database
Enthusiast

Keep Duplicate Rows

Hello,

We are trying to load a table based on a huge flat-file (225 million rows) and there are several rows in there that are duplicates.  However, we need to keep every one of them.  We have tried using a Multiload but it took an hour to get through one million records.  So we either need a method of using FastLoad and keeping the duplicates or some method of greatly increasing the speed of our Multiload. 

I am not the database admin, so I do not have certain info readily available, but I should be able to find answers to any questions you may have that would help me solve this problem.

Any and all suggestions are greatly appreciated.

Tags (1)
4 REPLIES
Junior Contributor

Re: Keep Duplicate Rows

If it's a staging table and you're on TD13 you could use a No Primary Index (NoPI) table, FastLoad is capable of loading duplicates in this case.

Otherwise i wouldn't expect MLoad to be that slow, what's the size of the input records and the table DDL?

Dieter

Enthusiast

Re: Keep Duplicate Rows

The size of the input records is 106

The DDL is :

CREATE MULTISET TABLE DMODOR_STAGE.TB_MITS_ADJ ,NO FALLBACK ,

     NO BEFORE JOURNAL,

     NO AFTER JOURNAL,

     CHECKSUM = DEFAULT,

     DEFAULT MERGEBLOCKRATIO

     (

      NO_TAX_ID CHAR(8) CHARACTER SET LATIN NOT CASESPECIFIC,

      DA_PRD_FILE CHAR(6) CHARACTER SET LATIN NOT CASESPECIFIC,

      NO_DLN_BAS CHAR(15) CHARACTER SET LATIN NOT CASESPECIFIC,

      NO_DLN_PAY CHAR(15) CHARACTER SET LATIN NOT CASESPECIFIC,

      ADR_CITY_CODE_PAY CHAR(5) CHARACTER SET LATIN NOT CASESPECIFIC,

      ADR_COUNTY_PAY CHAR(3) CHARACTER SET LATIN NOT CASESPECIFIC,

      CD_SITE_PAY CHAR(4) CHARACTER SET LATIN NOT CASESPECIFIC,

      CD_TAX_ITEM_PAY CHAR(4) CHARACTER SET LATIN NOT CASESPECIFIC,

      DA_PRD_FILE_ADJ CHAR(6) CHARACTER SET LATIN NOT CASESPECIFIC,

      NO_DLN_ADJ CHAR(15) CHARACTER SET LATIN NOT CASESPECIFIC,

      CD_TYPE_ADJ CHAR(4) CHARACTER SET LATIN NOT CASESPECIFIC,

      AMT_FIELD_ADJ DECIMAL(12,2),

      CD_TYPE_TAX_ADJ CHAR(3) CHARACTER SET LATIN NOT CASESPECIFIC,

      DA_LAST_UPD_ADJ CHAR(8) CHARACTER SET LATIN NOT CASESPECIFIC)

PRIMARY INDEX ( NO_TAX_ID );

Senior Supporter

Re: Keep Duplicate Rows

Is NO_TAX_ID skewed?

How many vprocs do you have?

And regarding "it took an hour to get through one million records"

I guess this is referring to the acquisition phase - correct? In case yes, do you have enough bandwidth from your ETL server to the TD system?

Enthusiast

Re: Keep Duplicate Rows

The issue has been resolved using the NoPI option.  Thanks Dieter!