Period data type support in Teradata MultiLoad

Tools
Tools covers the tools and utilities you use to work with Teradata and its supporting ecosystem. You'll find information on everything from the Teradata Eclipse plug-in to load/extract tools.
Teradata Employee

Period data type support in Teradata MultiLoad

The Period data type is supported by Teradata Multiload as of TTU 13.0. A period represents an interval of time. It indicates when some particular event starts and when it ends. It has a beginning bound and an (optional) ending bound (the period is open if there is no ending bound). The beginning bound is defined by the value of a beginning element. The ending bound is defined by the value of an ending element. Those two elements of a period must be of the same type that is one of the three DateTime data types: DATE, TIME, or TIMESTAMP.

A period is a value of a Period data type. Period data types are implemented internally as UDTs. However, the syntax and functions of Period data types closely follow the ANSI proposal and Period data types appear to the user as system-defined data types.

The five new Period data types that were introduced to TTU 13.0 MultiLoad are:

  • PERIOD(DATE)
  • PERIOD(TIME[(n)])
  • PERIOD(TIME[(n)] WITH TIME ZONE)
  • PERIOD(TIMESTAMP[(n)])
  • PERIOD(TIMESTAMP[(n)] WITH TIME ZONE) 

How support for the PERIOD data type is implemented in MultiLoad

The FIELD command specifies a field of the input record to be sent to the Teradata Database. The FILLER command describes a named or unnamed field as filler, which is not sent to the Teradata Database. Prior to TTU13.0, if the user defined a Period data type on the .FIELD command or .FILLER command, MultiLoad rejected the data type with an error message such: UTY0005 Bad data in the FIELD command at position 12, the name beginning with "PERIOD(TIME(2))" is not a valid data descriptor.

Starting from TTU13.0, the new PERIOD data types with nested parentheses are supported on the .FIELD and .FILLER commands.  

The TABLE command identifies a table whose column names and data descriptions are used as the names and data descriptions of fields of the input records.

When the TABLE command is used and the table contains Period data types, the Teradata Database returns the type names as below:

PERIOD(DATE) PD
PERIOD(TIME) PT
PERIOD(TIME WITH TIME ZONE) PZ
PERIOD(TIMESTAMP) PS
PERIOD(TIMESTAMP WITH TIME ZONE) PM

MultiLoad must recognize those new PERIOD data types when obtaining the table schema from the Teradata Database in response to a .TABLE command. Then it can generate the correct FIELD commands to define the input records.

MultiLoad recognizes new data descriptor type codes and generates the correct data type phrases when building the USING modifier.


Internal representation

The max size of the PERIOD data is fixed:

PERIOD Data Type Field Size in Bytes
PERIOD(DATE) 8
PERIOD(TIME(n)) 12
PERIOD(TIME(n) WITH TIME ZONE) 16
PERIOD(TIMESTAMP(n)) 20
PERIOD(TIMESTAMP(n) WITH TIME ZONE) 24

The DBS storage size in bytes are different between 32-bit DBS and 64-bit DBS server, but the actual field size returned to the client will be same on both 32-bit and 64-bit platforms. PERIOD(DATE), PERIOD(TIME(n) and PERIOD(TIME(n) WITH TIME ZONE) are fixed length data types. However, PERIOD(TIMESTAMP[(n)]) and PERIOD(TIMESTAMP[(n)] WITH TIME ZONE) are variable length data types. That means a 2-byte length indicator must precede the data. To support variable data types PERIOD(TIMESTAMP[(n)]) and PERIOD(TIMESTAMP[(n)] WITH TIME ZONE), MultiLoad processes FastLoad, Binary and Unformat data formats by looking at the 2-byte length indicator preceding the data.

The internal representation of PERIOD data does not consist of two strings. The detailed representation for each PERIOD data type is described as the following:

PERIOD(DATE) contains two DATE elements:
DATE type: 4 bytes signed integer
Total: 4*2 = 8 bytes
PERIOD(TIME(n)) contains two TIME elements:
TIME type:
Second: 4 bytes signed integer
Hour: 1 unsigned byte
Minute: 1 unsigned byte
Total: (4+1+1) * 2 = 12 bytes
PERIOD(TIME(n) WITH TIME ZONE) contains two TIME ZONE elements:
TIME ZONE type:
Second: 4 bytes signed integer
Hour: 1 unsigned byte
Minute: 1 unsigned byte
Time Zone Hour: 1 unsigned byte
Time Zone Minute: 1 unsigned byte
Total: (4+1+1+1+1) * 2 = 16 bytes
PERIOD(TIMESTAMP(n)) contains two TIMESTAMP elements:
TIMESTAMP type:
Second: 4 bytes signed integer
Year: 2 bytes signed short integer
Month: 1 unsigned byte
Day: 1 unsigned byte
Hour: 1 unsigned byte
Minute: 1 unsigned byte
Total: 2 + (4+2+1+1+1+1)*2 = 22 bytes
PERIOD(TIMESTAMP(n) WITH TIME ZONE) contains two TIMESTAMP WITH TIME ZONE elements:
TIMESTAMP WITH TIME ZONE type:
Second: 4 bytes signed integer
Year: 2 bytes signed short integer
Month: 1 unsigned byte
Day: 1 unsigned byte
Hour: 1 unsigned byte
Minute: 1 unsigned byte
Time Zone Hour: 1 unsigned byte
Time Zone Minute: 1 unsigned byte
Total: 2 + (4+2+1+1+1+1+1+1)*2 = 26 bytes

Restrictions

PERIOD data is always exported as two consecutive integer values in all response modes, other than a field mode response. So MultiLoad processes PERIOD data as binary structure if  the user provides data as PERIOD data types by defining them as such in the LAYOUT. They can not be processed as TEXT format. It is recommanded to specify TEXT format for character data only. It doesn't make sense to process any binary data as TEXT format.

While the user can supply and define data record as CHAR type, the Teradata Database will cast from CHAR type to the appropriate PERIOD data types. For such case, MultiLoad processes the data as pre-existing CHAR data type. And no special handling is needed at client side. Refer to the sample job script below for loading character data to PERIOD data columns.

Sample job scripts

Specifying UNFORMAT input records on .FIELD command

  1. Run a FastExport job to generate period data file using UNFORMAT:
    .dateform ansidate;

    .logtable PD_fe_log;

    .logon tdpid/xxxx,xxxx;

    drop table datatbl;

    create table datatbl, fallback (

    c1 integer,

    c2 period(timestamp with time zone));

    insert into datatbl(1, period(timestamp '2005-02-03 11:11:11-08:00', timestamp '2005-02-03 13:12:12-08:00'));

    insert into datatbl(2, period(timestamp '2006-02-03 13:12:12-08:00', timestamp '2006-02-03 14:12:12-08:00'));

    insert into datatbl(3, period(timestamp '2007-02-03 14:12:12-08:00', timestamp '2007-02-03 15:12:12-08:00'));

    .begin export;

    sel * from datatbl;

    .export outfile PM_data format unformat mode record;

    .end export;

    .logoff;

  2. Run a MultiLoad job to load PERIOD data to the Teradata Database:
    .dateform ansidate;

    .logtable testlog;

    .logon tdpid/xxxx,xxxx;

    DROP TABLE test_TABLE;

    DROP TABLE wt_test_TABLE;

    DROP TABLE et_test_TABLE;

    DROP TABLE uv_test_TABLE;

    CREATE TABLE test_TABLE, fallback(

    FIELD1 INTEGER,

    FIELD2 period(timestamp with time zone))

    UNIQUE PRIMARY INDEX (Field1);

    .BEGIN import mload tables test_TABLE

    worktables wt_test_TABLE

    errortables et_test_TABLE uv_test_TABLE;

    .layout lay1;

    .FIELD f1 * integer;

    .FIELD f2 * period(timestamp with time zone);

    .dml label label1;

    INSERT into test_TABLE VALUES (:f1,:f2);

    .import infile PM_data layout lay1 apply label1;

    .end mload;

    .logoff;

Using .TABLE command to generate data layout

  1. Run a BTEQ job to generate PERIOD data:
    .logon tdpid/xxxx,xxxx;

    drop table datatbl;

    create table datatbl, fallback (

    c1 integer,

    c2 period(date) );

    insert into datatbl(1, period(date '2005-02-03', date '2006-02-04'));

    insert into datatbl(2, period(date '2006-02-03', date '2007-02-04'));

    insert into datatbl(3, period(date '2007-02-03', date '2008-02-04'));

    .export data file=PD_data;

    sel * from datatbl;

    .export reset;

    .logoff;

  2. Run a MultiLoad job to load PERIOD data:
    .logtable testlog;

    .logon tdpid/xxxx,xxxx;

    DROP TABLE test_TABLE;

    DROP TABLE wt_test_TABLE;

    DROP TABLE et_test_TABLE;

    DROP TABLE uv_test_TABLE;

    CREATE TABLE test_TABLE, fallback(

    FIELD1 INTEGER,

    FIELD2 period(date))

    UNIQUE PRIMARY INDEX (Field1);

    .BEGIN import mload tables test_TABLE

    worktables wt_test_TABLE

    errortables et_test_TABLE uv_test_TABLE;

    .layout lay1;

    .table test_TABLE;

    .dml label label1;

    INSERT into test_TABLE.*;

    .import infile PD_data layout lay1 apply label1;

    .end mload;

    .logoff;

Load CHARACTER data record to columns defined as Period data type

  1. Run a BTEQ job to generate character data file:
    .logon tdpid/xxxx,xxxx;

    drop table datatbl;

    drop table datatbl1;

    create table datatbl, fallback (

    c1 integer,

    c2 char(36),

    c3 char(28),

    c4 char(32),

    c5 char(62));

    insert into datatbl
    (1,
    '(''12:12:12.123'', ''13:12:12.123'')',
    '(''12:12:12'', ''13:12:12'')',
    '(''2005-02-03'', ''2006-02-04'')',
    '(''2005-02-03 11:11:11-08:00'', ''2005-02-03 13:12:12-08:00'')');

    insert into datatbl
    (2,
    '(''13:12:12.123'', ''14:12:12.123'')',
    '(''13:12:12'', ''14:12:12'')',
    '(''2006-02-03'', ''2007-02-04'')',
    '(''2006-02-03 11:11:11-08:00'', ''2006-02-03 13:12:12-08:00'')');

    insert into datatbl
    (3,
    '(''14:12:12.123'', ''15:12:12.123'')',
    '(''14:12:12'', ''15:12:12'')',
    '(''2007-02-03'', ''2008-02-04'')',
    '(''2007-02-03 11:11:11-08:00'', ''2007-02-03 13:12:12-08:00'')');

    .export data file=Period_char_data;

    sel * from datatbl;

    .export reset;

    .logoff;

  2. Run a MultiLoad job to load character data to PERIOD type columns, the Teradata Database will cast from CHAR type to the appropriate PERIOD data types:
    .logtable testlog;

    .logon tdpid/xxxx,xxxx;

    DROP TABLE test_TABLE;

    DROP TABLE wt_test_TABLE;

    DROP TABLE et_test_TABLE;

    DROP TABLE uv_test_TABLE;

    CREATE TABLE test_TABLE, fallback(

    FIELD1 INTEGER,

    FIELD2 period(time(3)),

    FIELD3 period(time(0)),

    FIELD4 period(date),

    FIELD5 period(timestamp with time zone))

    UNIQUE PRIMARY INDEX (Field1);

    .BEGIN import mload tables test_TABLE

    worktables wt_test_TABLE

    errortables et_test_TABLE uv_test_TABLE;

    .layout lay1;

    .FIELD f1 * integer;

    .FIELD f2 * char(36);

    .FIELD f3 * char(28);

    .FIELD f4 * char(32);

    .FIELD f5 * char(62) nullif f4='(''2006-02-03'', ''2007-02-04'')';

    .dml label label1;

    INSERT into test_TABLE VALUES (:f1,:f2,:f3,:f4, :f5);

    .import infile Period_char_data layout lay1 apply label1;

    .end mload;

    .logoff;


Summary

  • Define PERIOD data types on .FIELD command and .FILLER command.
  • Period data types support for TABLE command
  • Generate correct USING CLAUSE for PERIOD data types
  • Handle PERIOD data types