Primary Index Reducing Space

General

Primary Index Reducing Space

Hi,

I did the following

1) First, I fastloaded a flat file [Record count: 184368033 ] in to a table [Table column length: 231] that does not have primary index. After loading 1/6th of the file, it showed: No Room in Database error. [Appx: 685 GB]

2) Then, I added primary index to the table. Not only did fastloading complete, but it took only 20 GB for loading the entire file.

Can someone please explain me in simple terms why it took such a huge space while loading a file in to a table w/o primary index, and how does adding primary index decrease the used perm space.
3 REPLIES
Junior Contributor

Re: Primary Index Reducing Space

Was it really a NOPI-table in TD13 defined with NO PRIMARY INDEX or did you just omit the PI definition?
What does a SHOW TABLE return?

If it actually was a NOPI table then there migth have been a lot of dupliacte rows, whch are not removed when the table has no PI.

But if you just didn't specify the PI then the first column was used automagically as PI and this column had a really bad distribution. As each AMP has exactly the same Perm Space (685 GB / number of AMPs) one AMP reached it's limit and threw that "no more room in database" error.
To check for distribution simple query dbc.tablesize without sum/group by to see each AMP's value.

Dieter

Re: Primary Index Reducing Space

Thanks dnoeth for your reply. We didn't specify PI, so you may be right, first column would have been take as PI. Could you please explain little more about AMP distribution with the help of an example and how it is increasing the perm space?
Enthusiast

Re: Primary Index Reducing Space

The 685GB of space in the database is divided evenly among the number of AMPS on your system. As soon as a single AMP on the system reaches its limit the database is considered full. As Dieter mentioned, check the dbc.tablesize ordering by the VPROC column descending.

To get an understanding of how your PI is distributed to the amps:

SELECT HASHAMP(HASHBUCKET(HASHROW())) AS VProc
, COUNT(*)
FROM .GROUP BY 1;