Difficulty Inserting records into Large table

UDA
Teradata Employee

Difficulty Inserting records into Large table

Hello....

I am trying to insert records (72 million+) from one table into a larger "master" table. The master table will be the final table after everything is complete. This table has 330 million rows in it currently. The insert is a simple bteq script:
Insert into "master table"
Select * from "subset table"

I have broken the subset table into smaller table segments (4-10 million) thinking that would help ease the loading times. Even these smaller segments seem to take hours and in some cases days to load.

The master table is defined as a SET table, with a Primary Index, and is partitioned by year. I have collected statistics on the key and the partitions, but this does not seem to have helped the load times.

I was thinking about the SET vs MultiSet definition of the table. Will this help at all?

Every year I will receive another 100 million records that need to be loaded to this table, so improving the processing is important.

Any suggestions you have would be greatly appreciated.

Thanks!

Paul
12 REPLIES
Enthusiast

Re: Difficulty Inserting records into Large table

If the table has a NUPI with lots of records with the same PI value (eg an account number for transactions), then Multiset will definitely help. It will bypass the duplicate row check, which is almost certainly causing the long load times. If you do go to a multiset table though, you become responsible for ensuring there are no duplicate records!
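On most Teradata releases you cannot simply ALTER a populated SET table to MULTISET, so the usual route is create-new / insert-select / rename. A sketch, using names from this thread (verify column list and options against your actual DDL):

```sql
-- Sketch: recreate the master table as MULTISET. Table/column names
-- are taken from this thread; the column list is abbreviated.
CREATE MULTISET TABLE master_new
     ( Irmf_Pyee_Id  INTEGER
     , Tax_Yr_Dte    INTEGER
     -- ... remaining columns as in the current DDL
     )
PRIMARY INDEX XPKINFORMATION_RETURN ( Irmf_Pyee_Id )
PARTITION BY RANGE_N(Tax_Yr_Dte BETWEEN 1980 AND 2100 EACH 1, UNKNOWN);

INSERT INTO master_new SELECT * FROM master_table;

RENAME TABLE master_table TO master_old;
RENAME TABLE master_new   TO master_table;
```

The rename step keeps downstream queries working against the original table name once the copy has been validated.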

You should take a look at your partitioning scheme.
I presume that all (or the vast majority) of the records you load are for the most recent period.
You do not say whether you load the 100M records once per year, or as approx. 9M per month.
If you load 100M in January, 90M of these will go into last year's partition, adding to the 9-10M you loaded for last January. 10M will go into the new partition.
The 10M will go in much quicker because they are going into an empty partition.

Try changing to monthly partitions instead of annual. This will limit you to 65K months, or 5000 years, so probably not a problem.
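A monthly RANGE_N partition would look roughly like this — assuming the table had a true DATE column (Load_Dte below is illustrative; the thread later notes this table only carries a tax year, so this applies only if a date is available):

```sql
-- Sketch: monthly instead of annual partitioning, on a hypothetical
-- DATE column. Column names other than the PI are illustrative.
CREATE MULTISET TABLE master_monthly
     ( Irmf_Pyee_Id  INTEGER
     , Load_Dte      DATE
     -- ... remaining columns
     )
PRIMARY INDEX ( Irmf_Pyee_Id )
PARTITION BY RANGE_N(Load_Dte BETWEEN DATE '1980-01-01'
                                  AND DATE '2100-12-31'
                              EACH INTERVAL '1' MONTH,
                     UNKNOWN);
```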
Senior Apprentice

Re: Difficulty Inserting records into Large table

Hi Paul,
do a HELP STATS on the PI-columns of that table and look for the Mode Frequency to check for duplicates.

Could you post that plus PI + partitioning?

Dieter
Teradata Employee

Re: Difficulty Inserting records into Large table

Hi Dieter,

Here is what I get when I do a Help Statistics on the main table:

Date      Time      Unique Values  Column Names
09/07/23  01:50:20  10,468,831     Irmf_Pyee_Id
09/07/23  01:50:24  6              PARTITION

Is this the info you were asking for?

Jim, the table is partitioned by YEAR. This is tax data I am dealing with, so I will get a new file once a year to load into this table. Unfortunately, I do not have a month that I could use for partitioning, only Tax Year.

Here are the key fields as defined in the DDL now:

PRIMARY INDEX XPKINFORMATION_RETURN ( Irmf_Pyee_Id )
PARTITION BY RANGE_N(CAST((Tax_Yr_Dte ) AS INTEGER) BETWEEN 1980 AND 2100 EACH 1 ,
UNKNOWN)
INDEX Pyertest1 ( Pyer_Adrs_Id )
INDEX Pyeetest1 ( Pyee_Adrs_Id )
INDEX PYerSec2 ( Irmf_Pyer_Id );

Should I collect statistics on all of these index fields, in addition to the PI?

Thanks for all the feedback!

Paul
Senior Apprentice

Re: Difficulty Inserting records into Large table

Hi Paul,
this is approx. 33 rows per value on average, which should be ok (especially if they're spread across different years).
But the maximum might be totally different, so please do a help stats only for the PI-column:
HELP STATS tablename COLUMN Irmf_Pyee_Id;

If QueryLog is enabled on your site you might also check it for CPU-usage of those ins/sel queries.
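For example, assuming DBQL is enabled and you can read the DBC.QryLogV view, something along these lines would surface the heavy statements:

```sql
-- Sketch: pull CPU and I/O for recent INSERT ... SELECT statements
-- from the query log (standard DBQL columns assumed).
SELECT StartTime
     , AMPCPUTime          -- total CPU across all AMPs
     , MaxAMPCPUTime       -- one hot AMP here points at PI skew
     , TotalIOCount
     , QueryText
FROM   DBC.QryLogV
WHERE  QueryText LIKE 'INSERT%'
ORDER  BY AMPCPUTime DESC;
```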

Dieter
Teradata Employee

Re: Difficulty Inserting records into Large table

Hi Dieter,

I am going to ask a dumb question. When you say Help Statistics on the PI, what do you mean? In my post above I showed what I get when I do a Help Statistics on the table, and that field listed is the PI. I'm not sure what to do to get the Statistics on the PI.

Or are you asking about the skewness? That is 12.7

Thanks!

Paul
Senior Apprentice

Re: Difficulty Inserting records into Large table

Hi Paul,
i edited my previous reply, i don't know why it didn't show that HELP STATS statement, just tried to use a bold font...

That skew sounds bad, it's probably causing that bad performance.
Changing to MULTISET will probably help, but let's see the output of that HELP STATS.

Dieter
Teradata Employee

Re: Difficulty Inserting records into Large table

Hi Dieter,

When I run a Help Statistics on the table I am inserting into, here is what I get:

Date      Time      Unique Values  Column Names
09/07/23  13:02:26  10,473,786     Irmf_Pyee_Id
09/07/23  13:02:32  6              PARTITION

A 4.5 million row insert just completed, and that took 10 hrs! Something can't be right about how this table is set up.

Thanks!

paul
Senior Apprentice

Re: Difficulty Inserting records into Large table

Hi Paul,
do a help stats for the PI-column
HELP STATS tablename COLUMN Irmf_Pyee_Id;
to get the details for the PI.

4.5m rows in 10 hours is ridiculous, i can do that in a virtual machine on my laptop.

You could also do a:
select Irmf_Pyee_Id, Tax_Yr_Dte, count(*)
from tab
group by 1,2
qualify rank() over (order by count(*) desc) <= 1

to check for duplicates.

Dieter
Teradata Employee

Re: Difficulty Inserting records into Large table

Hi Dieter,

I agree...10 hrs for 4.5 million rows is just unacceptable! That is why I am asking the forum for help.

I ran the Help Stats statement you provided (thanks), it produced a pretty long output record. I am not sure what all you need to see, but I took a guess. Here it is:

Number of Rows..........370692047
Number of Nulls...........0
Number of Intervals......100
Number of Uniques........10473786
Numeric......................N
Min Value ...................0
Mode Value.................300027069
Mode Frequency..........5698859
Max Value..................300027069
Mode Value................141688470
Mode Frequency.........4787856
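Taken with the earlier skew figure, the Mode Frequency here (a single Irmf_Pyee_Id with ~5.7M rows) would explain the slow SET-table inserts: every new row with that value must be checked against every existing row sharing its row hash, all on one AMP. A hash-distribution probe like the following (table name is a placeholder) shows how such hot PI values pile onto single AMPs:

```sql
-- Sketch: row counts per AMP for this PI. Hot values such as the
-- 5.7M-row mode will show up as one or two badly overloaded AMPs.
SELECT HASHAMP(HASHBUCKET(HASHROW(Irmf_Pyee_Id))) AS amp_no
     , COUNT(*) AS row_cnt
FROM   master_table
GROUP  BY 1
ORDER  BY row_cnt DESC;
```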

I hope this helps.

Thanks!

Paul