Space required for Null values

Database
Enthusiast

Space required for Null values

Ladies and Gentlemen,

I've done some research with the google tool and cannot find a satisfactory or non-conflicting answer to this question.

 

--scenario: A business user wants to create a table that has historically been two tables. Only about 40% of the columns from the original two columns overlap. This will lead to at least 30% of the columns being null for any 1 row record. Gut (and educated) reaction is that this is a bad idea.

Business may listen to a well thought out argument including how much space will be wasted by having so many NULLs. Later I will be investigating how much of the primary indexes overlap. I am very aware the scenario is a bad design choice. What i need is the proof.

I'm trying to find a definitive answer on how much space NULLs take per data type. I would like to include ANY overhead for the data type like '2 bytes for varchar size definition' or '2 bytes for varchar variable offset' (not sure on that one).

I found this article that provided some clarity but doesn't give me a definitive answer as I think it's a little bit older and may not be 100% accurate to TD 14 that we are using.

 

--Question: How many bytes does a null take per data type? Including any overhead for the column due to the data type. We will absolutely be using compression, but I'll take answers, references, or formulas that don't account for it. We're using TD 14.

 

Thank you very much for any assistance you can give.


Accepted Solutions
Teradata Employee

Re: Space required for Null values

Perhaps I was to stingy with my words:-)

 

No datatypes will require any space for NULL if COMPRESS is used. The variable length types( varchar, varbyte, Number,... ) will not require the length bytes. 

 

There is a section named Base Table Row Format->General Row Structure which specifically states that no space is required for any compressed values and there is a section on compression in the database design manual. There is a whole manual on data types which describes datatypes in excruciating detail.  

 

There are lots of reasons to avoid a design with lots of nulls but space is not one of them as long as COMPRESS is used.

 

If there are a lot of new columns without values for past records, sometimes it is just unavoidable to have nulls for those columns. One alternative is to use DEFAULT values when adding those columns, filling them with a value instead (eg zero). And again if COMPRESS is used with the default value, no space will be required (except the presence bits).

1 ACCEPTED SOLUTION
8 REPLIES
Teradata Employee

Re: Space required for Null values

If COMPRESS is specified on the nullablle column definitions (with or without any specific values to compress) then NULL values in those columns will require zero space except for one bit in each row header. 

 

Without COMPRESS, the NULL value will require the same space as the datatype would with a value in it. And it will still require the bit in the row header to identify the NULL value.

Enthusiast

Re: Space required for Null values

I understand the concept there, but I don't think that fully addresses the question. A varchar data type will still require the 2 bytes (?) for the length even with a null compressed bit.  But a char data type would not?

I'm sorry but that answer scratches the surface of the issue and is the same thing I found from my google searches. Are there references on exactly how each data type is stored in a record at the bit level?

I'm talking about 1-15 million records minimum in each table so with 2-30 million records, the potential wasted space could be significant.

Enthusiast

Re: Space required for Null values

In regards to my last answer, is this what I am looking for with bit level information?  http://www.info.teradata.com/HTMLPubs/DB_TTU_13_10/index.html#page/Database_Management/B035_1094_109...

 

and if it is, then does a compressed bit just exist or is there the presence bit always to tell it that there is a compressed value? I'm not sure how a compressed value is identified.  Smiley Sad

 

Teradata Employee

Re: Space required for Null values

Perhaps I was to stingy with my words:-)

 

No datatypes will require any space for NULL if COMPRESS is used. The variable length types( varchar, varbyte, Number,... ) will not require the length bytes. 

 

There is a section named Base Table Row Format->General Row Structure which specifically states that no space is required for any compressed values and there is a section on compression in the database design manual. There is a whole manual on data types which describes datatypes in excruciating detail.  

 

There are lots of reasons to avoid a design with lots of nulls but space is not one of them as long as COMPRESS is used.

 

If there are a lot of new columns without values for past records, sometimes it is just unavoidable to have nulls for those columns. One alternative is to use DEFAULT values when adding those columns, filling them with a value instead (eg zero). And again if COMPRESS is used with the default value, no space will be required (except the presence bits).

Enthusiast

Re: Space required for Null values

Hmm... Okay. I think I see what you are saying.

 

So, for Varchar nullable compress (NULL, 'on-time') with a null value would be something like 

Row header

  1  Varchar presence bit

  1  Compression bit for selecting null from table header because there is another compressable value (?)

Value in row

  2 bytes offset for varchar

  literally no bits assigned to value

 

And for an Integer nullable compress with a null value would be something like

Row header 

  1 presence bit

Value in row

  literally no bits given

 

and etc...  What I'm reading implies to me that Varchar is the only one with the 2 byte offset?

 

Teradata Employee

Re: Space required for Null values

Correct except that the variable length types do not require that the length bytes be stored. So varchar does not require two bytes for length if NULL is provided as the value of the column. Likewise if compressing a value, the length of the value is not stored in the row, only in the compressed value list in the table header - no space required in the row.

Enthusiast

Re: Space required for Null values

Excellent.  Thank you. This is very much appreciated.

Enthusiast

Re: Space required for Null values

additional documentation that also talks about this in depth.

http://www.info.teradata.com/htmlpubs/DB_TTU_14_00/index.html#page/Database_Management/B035_1094_111...

 

'Base Table Row Format from the Teradata' from Teradata Database, Tools and Utilities Release 14.00