Error 2803 Secondary index uniqueness violation error utf-8 impact?

Database
jld
Enthusiast

Error 2803 Secondary index uniqueness violation error utf-8 impact?

Hi there,

I assume I have a pretty naughty case .

I have a dimension table with a secondary unique index (COUNTRY_ISO_A2 ,CITY_TXT ,ZIP_CODE) from columns with datatype

VARCHAR() CHARACTER SET UNICODE NOT CASESPECIFIC

and for a while this table was fed with not properly converted utf-8 string since the feeding apps driver did not support utf-8, now the  feeding apps driver is utf-8 compatible and I get (on first run) the error above when loading my dimension table.

I have like a first guess that the the index representation might not be the same as the "restituted" representation of the column´s value and that  in CITY_TXT the earlier value  ???????? (on Target) and the new ΠΕΙΡΑΙΑΣ (InFLow)  when comparing on COUNTRY_ISO_A2 , ZIP_CODE  would have the same index representation

How can I check that?

Any other suggestion welcome since I do not see after "functional" control not other reason

thanks

Rgds

JL D

Tags (1)
4 REPLIES
Junior Contributor

Re: Error 2803 Secondary index uniqueness violation error utf-8 impact?

Are you shure the duplicate is not in your staging table?

sel COUNTRY_ISO_A2 ,CITY_TXT ,ZIP_CODE 
from staging
group by 1
having count(*) > 1

And to find out which rows might cause that error:

sel COUNTRY_ISO_A2 ,CITY_TXT ,ZIP_CODE from staging
intersect
sel COUNTRY_ISO_A2 ,CITY_TXT ,ZIP_CODE from target

If those queries don't return any rows, it might be due to some rules for comparing strings in UTF?

To find out which rows caused the error you could create a copy of the target table with a UPI on those columns, do a CREATE ERROR TABLE and then use a MERGE or INS/SEL with LOGGING ALL ERRORS. Thus there's no unspecific "a row caused an error", but the rows are recorded within the error table.

Dieter

jld
Enthusiast

Re: Error 2803 Secondary index uniqueness violation error utf-8 impact?

Hi Daniel,

thanks

both queries returned no occurrence, so I am left with the next control.

rgds

JL D

jld
Enthusiast

Re: Error 2803 Secondary index uniqueness violation error utf-8 impact?

Sorry Dieter for the wrong name

JL

jld
Enthusiast

Re: Error 2803 Secondary index uniqueness violation error utf-8 impact?

Hi there,

probably this explains the problem

CREATE MULTISET TABLE test.geo_dim2 ,NO FALLBACK ,

     NO BEFORE JOURNAL,

     NO AFTER JOURNAL,

     CHECKSUM = DEFAULT,

     DEFAULT MERGEBLOCKRATIO

     (

      GEO_SK BIGINT GENERATED ALWAYS AS IDENTITY

           (START WITH 1

            INCREMENT BY 1

            MINVALUE 1

            MAXVALUE 2147483647

            NO CYCLE),

      COUNTRY_ISO_A2 CHAR(2) CHARACTER SET UNICODE NOT CASESPECIFIC NOT NULL,

      CITY_TXT VARCHAR(100) CHARACTER SET UNICODE NOT CASESPECIFIC NOT NULL,

      ZIP_CODE VARCHAR(30) CHARACTER SET UNICODE NOT CASESPECIFIC NOT NULL)

UNIQUE PRIMARY INDEX ( GEO_SK )

UNIQUE INDEX ( COUNTRY_ISO_A2 ,CITY_TXT ,ZIP_CODE );

insert  into test.geo_dim2 (country_iso_a2, city_txt, zip_code) values ('GR','ΚΩΣ' ,'85300');

  OK

insert  into test.geo_dim2 (country_iso_a2, city_txt, zip_code) values ('GR','Κως' ,'85300');

  Error: [Teradata Database] [TeraJDBC 14.00.00.30] [Error 2803] [SQLState 23000] Secondary index uniqueness violation in EP_DEV1.geo_dim2

insert  into test.geo_dim2 (country_iso_a2, city_txt, zip_code) values ('BE','ΚORTRIJK' ,'8500');

  OK

insert  into test.geo_dim2 (country_iso_a2, city_txt, zip_code) values ('BE','Κortrijk' ,'8500');

  Error: [Teradata Database] [TeraJDBC 14.00.00.30] [Error 2803] [SQLState 23000] Secondary index uniqueness violation in EP_DEV1.geo_dim2

select geo_sk, country_iso_a2, city_txt, zip_code from test.geo_dim2 GEO where GEO.COUNTRY_ISO_A2='GR' and GEO.ZIP_CODE='85300' AND GEO.CITY_TXT='ΚΩΣ'

  returns 1 row

select geo_sk, country_iso_a2, city_txt, zip_code from test.geo_dim2 GEO where GEO.COUNTRY_ISO_A2='GR' and GEO.ZIP_CODE='85300' AND GEO.CITY_TXT='Κως'

  returns 0 row   <<<<===================

  the process discussed earlier is doing an

  insert into target

  select from staging

  where staging column part of the key not in (select column part of the key from target table)

select geo_sk, country_iso_a2, city_txt, zip_code from test.geo_dim2 GEO where GEO.COUNTRY_ISO_A2='BE' and GEO.ZIP_CODE='8500' AND GEO.CITY_TXT='ΚORTRIJK'

  returns 1 row

select geo_sk, country_iso_a2, city_txt, zip_code from test.geo_dim2 GEO where GEO.COUNTRY_ISO_A2='BE' and GEO.ZIP_CODE='8500' AND GEO.CITY_TXT='Κortrijk'

  returns 1 row

Nevertheless, I still do not know why tests on another Teradata box returned 1 row in all case

and 'help session' cmd on both machines return the same setup

?

rgds

JL D