select first N records

Database
Enthusiast

select first N records

In Informix, I could do SELECT FIRST 10 * FROM T1 , to get the top 10 records.

In Sql Server or Access I could write SELECT TOP 10 * FROM T1.

Any way to do it in Teradata?
6 REPLIES
Enthusiast

Re: select first N records

Are you trying to get just 10 records?

if this is the case use sample (BUT THIS WILL GIVE ANY 10 records)
select emp_name
from employee
sample 10;

Are you trying to get the first 10 or last 10 based on some order

then you need to use a rank function on the fields that you want to order/rank by
and then use the qualify option to filter the answer set.

select rep_name, sales_count, rank(sales_count) rep_rank
from rep_sales
qualify rep_rank <= 10;

this will give the top 10 reps based on the sales_count

I dont think there is anyway to get the 10 rows based on row_id.
This is because of Teradata architecture.

Junior Contributor

Re: select first N records

Non-Standard TOP syntax is supported since V2R6...

It can be replaced by Standard SQL RANK/PERCENT_RANK/ROW_NUMBER in most cases.

It's only useful for a "select top ... * from table" without any order/where/join instead of SAMPLE on large systems...

Dieter
Enthusiast

Re: select first N records

Hello,

"SELECT TOP n from table" is very fast in some case, even on large tables, and is showing an odd behaviour on other tables,
with a three steps explain .. to get 10 records in one minute ! (V2R6)

How to explain the first all-rows scan ?

3) We do an all-AMPs RETRIEVE step from "tablename" by way of an
all-rows scan with no residual conditions into Spool 2 (all_amps)
(compressed columns allowed), which is built locally on the AMPs.
The input table will not be cached in memory, but it is eligible
for synchronized scanning. The result spool file will not be
cached in memory. The size of Spool 2 is estimated with high
confidence to be 204,849,278 rows. The estimated time for this
step is 1 minute and 24 seconds.
4) We do an all-AMPs STAT FUNCTION step from Spool 2 by way of an
all-rows scan into Spool 5, which is built locally on the AMPs.
The result rows are put into Spool 1 (group_amps), which is built
locally on the AMPs. This step is used to retrieve the TOP 10
rows. Single AMP optimization is used. If this step retrieves
less than 10 rows, then execute step 5.
5) We do an all-AMPs STAT FUNCTION step from Spool 2 (Last Use) by
way of an all-rows scan into Spool 5 (Last Use), which is built
locally on the AMPs. The result rows are put into Spool 1
(group_amps), which is built locally on the AMPs. This step is
used to retrieve the TOP 10 rows.
6) Finally, we send out an END TRANSACTION step to all AMPs involved ...

Re: select first N records

very good information..Thanks for sharing!!
Enthusiast

Re: select first N records

Pierre, I'm looking at that issue too. It seems unfortunate that TD13.0 can't just create one spool and steam it direct to the user? 2+ spools seems to be redundant work. 

Does anybody know of a way to make the select create only one spool of 50 rows??

Explain SELECT TOP 50 *
FROM VW_DB.TBL

1) First, we lock DATA_DB.TBL in view
VW_DB.TBL for access.
2) Next, we do an all-AMPs RETRIEVE step from
DATA_DB.TBL in view
VW_DB.TBL by way of an all-rows scan with a
condition of ("(DATA_DB.TBL in view
VW_DB.TBL.Migr_Ind = 1) AND
((DATA_DB.TBL in view
VW_DB.TBL.Pblcn_End_Dt >= DATE '2012-02-01')
AND (DATA_DB.TBL in view
VW_DB.TBL.Pblcn_Start_Dt <= DATE
'2012-02-01'))") into Spool 6 (all_amps) (compressed columns
allowed), which is built locally on the AMPs. The size of Spool 6
is estimated with low confidence to be 42,294,338 rows (
7,655,275,178 bytes). The estimated time for this step is 5.21
seconds.
3) We do an all-AMPs STAT FUNCTION step from Spool 6 by way of an
all-rows scan into Spool 9, which is redistributed by hash code to
all AMPs. The result rows are put into Spool 5 (group_amps),
which is built locally on the AMPs. This step is used to retrieve
the TOP 50 rows. One AMP is randomly selected to retrieve 50
rows. If this step retrieves less than 50 rows, then execute step
4. The size is estimated with low confidence to be 50 rows (
9,800 bytes).
4) We do an all-AMPs STAT FUNCTION step from Spool 6 (Last Use) by
way of an all-rows scan into Spool 9 (Last Use), which is
redistributed by hash code to all AMPs. The result rows are put
into Spool 5 (group_amps), which is built locally on the AMPs.
This step is used to retrieve the TOP 50 rows. The size is
estimated with low confidence to be 50 rows (9,800 bytes).
5) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 5 are sent back to the user as the result of
statement 1.

Junior Contributor

Re: select first N records

There's no way, you're accessing a view with a WHERE condition, and this is how it currently works. It's actually a bit stupid :-)

But you could create an Enhancement Request, to skip the rest of the rows exceeding in step 2 similar to the "Exit this retrieve step after the first row is found" when explaining

SELECT 'table is not empty"

WHERE EXISTS (SELECT * FROM tab)

Dieter