Random Identify Column creation

General
Fan

Random Identify Column creation

We are having issues with IDENTIFY key creation.

Script:

CREATE SET TABLE dedw.ETL_RUN_LOG_POC ,NO FALLBACK ,

NO BEFORE JOURNAL,

NO AFTER JOURNAL,

CHECKSUM = DEFAULT,

DEFAULT MERGEBLOCKRATIO

(

ETL_Run_Log_ID INTEGER NOT NULL GENERATED ALWAYS AS IDENTITY

(START WITH 1

INCREMENT BY 1

MINVALUE -2147483647

MAXVALUE 2147483647

NO CYCLE),

------ Other Columns-----

CONSTRAINT PK8_2 PRIMARY KEY ( ETL_Run_Log_ID ))

However the generation is random, and not increasing with time.

Col Id        Time

6000789   5.30 PM

15             5.35 PM

We need this to be fixed.

3 REPLIES
Junior Contributor

Re: Random Identify Column creation

This behaviour is documented, it's not a SEQUENCE, but an IDENTITY :-)

According to Standard SQL IDENTITY doesn't need to be sequential and does not have to be assigned in chronological order.

In fact in Teradata (as a parallel system) an IDENTITY is not assigned by a single process (which would be a bottleneck), but in parallel by each PE/AMP and each reserves a batch of values.

Why do you think you need a chronologically assigned value? 

What if multiple rows are inserted at exactly the same time?

Teradata Employee

Re: Random Identify Column creation

The INCREMENT BY 1 clause in the CREATE TABLE command can be misleading to people, because it implies that the values will be assigned sequentially with no gaps, but that's not how the Teradata Database works.

As Dieter noted, the Teradata Database's behavior is documented. Specifically, the behavior is documented in the "Teradata Database Reference / SQL Data Definition Language / Detailed Topics" book, in Chapter 5 "CREATE TABLE", in the section "Process for Generating Identity Column Numbers".

Here are a couple of relevant excerpts:

"Teradata Database uses a batch numbering scheme to generate identity column numbers. When the initial batch of rows for a bulk insert arrives at a PE or AMP, the system reserves a range of numbers before it begins to process the rows."

...

"This process explains the following apparent numbering anomalies. Because the Teradata architecture is highly parallel, generated identity column numbers do not necessarily reflect the chronological order in which rows are inserted."

Fan

Re: Random Identify Column creation

Thanks for confirming this. We need an ETL job that connects to teradata in real time, to produce sequence keys.