we are planning to migrate our warehouse from greenplum to TD, so our DBA suggested to create the table DDL as below:
Sourece DB(in GP)
create table XXX(
col1 bigint NOT NULL,
col2 character varying(1),
col3 character varying(200)
in target DB (TD) our DBA changed the data types like below
create table XXX(
col1 INTEGER FORMAT '-(19)9' NOT NULL,
col2 CHAR(1) CHARACTER SET UNICODE NOT CASESPECIFIC COMPRESS 'Y',
col3 CHAR(32) CHARACTER SET UNICODE NOT CASESPECIFIC FORMAT 'X(200)'
i understand that the max length of col3 value in GP is 32, so he is trying to create with Char(32) datatype in TD.
but i really dont understand why he is using FORMAT 'X(200)' in TD? when the max length of col3 is 32? shouldn;t it be Format X(32) ?
and i'm confused with bigint data type changing to INTEGER FORMAT '-(19)9' for col1.
i need to understand the significance of FORMAT in table DDL, what happens if we dont give any format ?
If you don't specify a FORMAT there's always a default.
col1: If BIGINT is changed to INT you will get 'numeric overflow' messages when you got values outside of the INT range (regardless of the FORMAT). So better keep BIGINT and FORMAT '-(19)9' is the default for it, so no reason to specify it.
col2: Changing VarChar(1) to CHAR(1) is ok, as Teradata needs two bytes overhead for every VarChar. But this looks like a Y/N flag, so why using Unicode for it? CHARACTER SET LATIN might be better.
col3: Changing VarChar(200) to CHAR(32) is odd. Maybe they checked the actual maximum length and changed that accordingly. But the FORMAT 'X(200)' is useless, this should be FORMAT 'X(32)' which is the default again. And a CHAR should only be used if the average length is at least a few bytes less than defined, otherwise keep VarChar.