Problem getting sample records using "top"


We were trying to pull a sample of 200K records. Initially we tested whether we could get that sample using TOP, but it threw a spool space error.
Later on we got the sample using RANK (as one of several methods).

What we want to know is: why did TOP not work in this case? Is there any limit/threshold for it?
Would the same thing happen if we used SAMPLE?
How exactly does each of them work?
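For context, the two approaches looked roughly like this; the table and column names below are placeholders for illustration, not our actual schema:

```sql
-- What failed: TOP threw the spool space error
SELECT TOP 200000 *
FROM my_big_table;

-- The workaround that succeeded: RANK with Teradata's QUALIFY clause
SELECT *
FROM my_big_table
QUALIFY RANK() OVER (ORDER BY some_key) <= 200000;
```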


Re: Problem getting sample records using "top"


I think your spool space needs to be increased.

Teradata strongly recommends that you give sufficient space to DBC to be used as spool.
There are two ways to do this: set aside permanent space to remain unused or create an empty
database to hold the space.

The advantage of creating a database to hold spool space is that the
space will always be set aside and available for spool usage, and will not be mixed with the
space allocated to DBC.

Reserving a specific amount of your total space in DBC helps ensure the entire
system runs smoothly, because if DBC runs out of spool space you may run
into problems with performance, unsuccessful queries, or user logons.
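A minimal sketch of the second option (creating an empty database purely to hold the space); the database name and size here are illustrative, not a recommendation for any particular system:

```sql
-- Empty database that reserves space from DBC; no tables are
-- ever created in it, so the space stays available for spool.
CREATE DATABASE Spool_Reserve FROM DBC
AS PERM = 100000000000;  -- ~100 GB held back, adjust to your system
```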

Re: Problem getting sample records using "top"

Thanks for the reply.
But what if we don't have the privilege to increase the spool space? The DBAs are reluctant to do so.
Is there any work around?

Re: Problem getting sample records using "top"

Have you tried SAMPLE?

While TOP is designed for speed, it may be overloading a single AMP (or a couple of AMPs), causing your spool to skew, because it goes to only a few AMPs to get the data. SAMPLE would almost certainly be an all-AMP operation, so you may find a more even spool distribution, but you will sacrifice some speed due to the statistical nature of SAMPLE.

ROW_NUMBER() would also work, but as an OLAP function with an ORDER BY it will probably be fairly CPU-intensive.
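Sketches of both alternatives, with a placeholder table and ordering column:

```sql
-- SAMPLE: all-AMP, statistically random, no ordering guarantee
SELECT *
FROM my_big_table
SAMPLE 200000;

-- ROW_NUMBER: deterministic, but the ORDER BY makes it CPU-heavy
SELECT *
FROM my_big_table
QUALIFY ROW_NUMBER() OVER (ORDER BY some_key) <= 200000;
```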

Re: Problem getting sample records using "top"

Thanks! I guessed as much.
We were apprehensive about using SAMPLE, considering that our table has around 20M records.