Question on HASHROW over cast date to VARCHAR

Database
Highlighted
Enthusiast

Question on HASHROW over cast date to VARCHAR

I have a column X that has TIMESTAMP datatype in table A and another table B with column Y with varchar(10) datatype

 

I am comparing these 2 rows using HASHROW to match so i can skip inserting the record in TableA.

 

Table A

Ria 2017-01-01

HASHROW(A.BIRTH_DATE)=0x50cf076d

 

Table B

Ria 2017-07-01 00:00:00

HASHROW(cast(cast(B.PA_BIRTH_DATE as FORMAT' YYYY-MM-DD') AS VARCHAR(10)))=0x3d5ec813

 

Can you help me understand why the values are different even after casting to same datatype?

 

 

 

 

 

 


Accepted Solutions
Junior Contributor

Re: Question on HASHROW over cast date to VARCHAR


Possibly a way to achieve what you want is to convert the TS column to VARCHAR(10) and then HASHROW the VC value.

Hi Dave, this exactly what she did (just confusing the table names) and as soon as you use the same values the hashes match :-)

 

 

Table A

Ria 2017-01-01

Table B

Ria 2017-07-01 00:00:00

 

 

 

SELECT
   HASHROW('2017-01-01')
  ,HASHROW(cast(cast(timestamp '2017-07-01 00:00:00' as FORMAT' YYYY-MM-DD') AS VARCHAR(10)))
1 ACCEPTED SOLUTION
8 REPLIES
Apprentice

Re: Question on HASHROW over cast date to VARCHAR

Hi Nina,

 

They generate different HASHROW outputs because they are different data types - and different data values.

 

In one table you have a TIMESTAMP column and in the other you have a VARCHAR(10). These are represented by different byte values and the hashing algorithm reads the byte values.

 

Also remember that a Teradata TIMESTAMP column (in Table A) ALWAYS contains a time value, even if it is '0' or if the tool does not display it.

 

In your example below, Table A contains '2017-01-01 00:00:00' (assuming a TS(0) column).

In your SQL you cast to a date value ('2017-01-01').

 

Does that help?

 

Cheers,

Dave

In your

Ward Analytics Ltd - information in motion
www: http://www.ward-analytics.com
Apprentice

Re: Question on HASHROW over cast date to VARCHAR

Sorry, I should also have said...

 

Possibly a way to achieve what you want is to convert the TS column to VARCHAR(10) and then HASHROW the VC value.

 

Why do you want to use HASHROW anyway?

 

Why not just convert one column to the data type of the other and then compare them?

 

Cheers,

Dave

 

Ward Analytics Ltd - information in motion
www: http://www.ward-analytics.com
Junior Contributor

Re: Question on HASHROW over cast date to VARCHAR


Possibly a way to achieve what you want is to convert the TS column to VARCHAR(10) and then HASHROW the VC value.

Hi Dave, this exactly what she did (just confusing the table names) and as soon as you use the same values the hashes match :-)

 

 

Table A

Ria 2017-01-01

Table B

Ria 2017-07-01 00:00:00

 

 

 

SELECT
   HASHROW('2017-01-01')
  ,HASHROW(cast(cast(timestamp '2017-07-01 00:00:00' as FORMAT' YYYY-MM-DD') AS VARCHAR(10)))
Enthusiast

Re: Question on HASHROW over cast date to VARCHAR

Hi, i am using HASHROW to compare 2 records in 2 different tables to determine to skip or insert the record

Enthusiast

Re: Question on HASHROW over cast date to VARCHAR

Thanks @dnoeth it worked.

Junior Contributor

Re: Question on HASHROW over cast date to VARCHAR

Caution: When you run HashRow over multiple columns different rows might hash the same:

 

HashRow(1,2,'foo','bar') = HashRow(2,1,'bar','foo')

Enthusiast

Re: Question on HASHROW over cast date to VARCHAR

Do you suggest comparing column by column instead of ROWHASH?

 

i am thinking of using, the following 

 

COALESCE(a.columnA,'X')||a.COALESCEcolumnB,'X')||a.COALESCEcolumnC,'X')||a.COALESCEcolumnD,'X')

<>

COALESCE(b.columnA,'X')||COALESCE(b.columnB,'X')||COALESCE(b.columnC,'X')||COALESCE(b.columnD,'X')

 

But i am having issue with COALESCE with columnA, which is TIMESTAMP(6) in Table B and varchar(26) in Table A (that column can be nullable).

 

I tried casting and coalesce in different ways and i am dead in the water!

select *

from

table A innser join table B on (a....=b....)

where

COALESCE(A.columnA,'1111-11-11')

=

COALESCE(cast(B.columnB AS VARCHAR(26)),'1111-11-11')

Error:

(Executed as Single statement.  Failed [3800 : HY000] Datatype Mismatch in THEN/ELSE expression.)

 

 

Apprentice

Re: Question on HASHROW over cast date to VARCHAR

Hi Nina,

 

At some point you will have to compare data values, you cannot rely on HASHROW function alone because it is possible for two different data values to generate the same row hash (output from HASHROW function) value.

 

I can't see an obvious reason why you're getting the 'data type mismatch' error.

Do you get the error on the SELECT processing or the INSERT processing?

What release of TD are you using?

 

You might want to try the following, it seems to do the right thing:

-- target table
CREATE TABLE tt1
(picol1 INTEGER NOT NULL
,picol2 INTEGER NOT NULL
,datacol1 TIMESTAMP(6)
,datacol2 CHAR(10))
PRIMARY INDEX(picol1,picol2);

-- source table
CREATE TABLE st1
(picol1 INTEGER NOT NULL
,picol2 INTEGER NOT NULL
,datacol1 VARCHAR(26)
,datacol2 CHAR(10))
PRIMARY INDEX(picol1,picol2);

-- Insert/select code
INSERT INTO tt1
SELECT s.picol1
   ,s.picol2
   ,CAST(s.datacol1 AS TIMESTAMP(6))
   ,s.datacol2
FROM st1 AS s
WHERE NOT EXISTS (SELECT 1
                  FROM tt1
		 WHERE tt1.picol1 = s.picol1
		   AND tt1.picol2 = s.picol2
		 AND COALESCE(CAST(tt1.datacol1 AS VARCHAR(26)),'1900-01-01 00:00:00.000001')
		            = COALESCE(s.datacol1,'1900-01-01 00:00:00.000001')
		AND COALESCE(tt1.datacol2,'fakevalue') = COALESCE(s.datacol2,'fakevalue'));

Does that do what you want?

 

Cheers,

Dave

 

 

 

Ward Analytics Ltd - information in motion
www: http://www.ward-analytics.com