INTEGER/BYTEINT accumulation and presentation

Comrades!

I ran this query...

sel aaa.Catgy
,sum(case when bbb.Catgy = 'splat0' then 0 else 1 end) as SL

from ptemp.table_1 aaa
left outer join
ptemp.table2 bbb

on aaa.Catgy = bbb.Catgy
and bbb.direct between 1082 and 1812

group by 1;

....and expected to get these results...
Catgy SL
A1111 84
B2222 19
C3333 39
D4444 136
E5555 21
F6666 47

Instead, I got...

Catgy SL
A1111 84
B2222 19
C3333 39
D4444 -120
E5555 21
F6666 47

A whole 256 has been knocked off the result for category D!
I ran a similar query later and, again, 256 was knocked off the result, but this time the result stayed positive.
Looks like an overflow-related issue to me, but no overflow error was generated.
On its own, the CASE expression has a TYPE of BYTEINT, whilst the SUM has a TYPE of INTEGER.
It looks like Teradata is accumulating the result correctly (as INTEGER), but then just using the 8 right-most bits of the integer to populate a BYTEINT output.
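
That behaviour can be reproduced outside Teradata: keep only the 8 right-most bits of the INTEGER total and read them as a signed (two's-complement) BYTEINT, and 136 becomes -120. A minimal Python sketch of that interpretation (the function name is mine, not anything from Teradata):

```python
def as_byteint(total):
    """Interpret the low 8 bits of an integer as a signed
    two's-complement BYTEINT, as described in the post above."""
    low = total & 0xFF              # keep only the 8 right-most bits
    return low - 256 if low >= 128 else low

# Category D: the correct INTEGER total is 136, but read
# back as a BYTEINT it wraps to 136 - 256 = -120.
print(as_byteint(136))   # -120

# Totals that already fit in -128..127 pass through unchanged,
# which is why the other categories looked fine.
print(as_byteint(84))    # 84
```

This also explains the second observation above: knocking off 256 only flips the sign when the low 8 bits happen to land in the 128..255 range; otherwise the wrapped result stays positive.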

If I change the SUM line to this (forcing the case to INTEGER)...
,sum(case when bbb.Catgy = 'splat0' then 0 else 1 end (INTEGER)) as SL

...then the problem disappears. Interestingly, the EXPLAIN output is ALMOST exactly the same though - the only difference being that STEP 6 uses grouping identifier in field 2 instead of field 1.

Anyone had a similar experience? Is this a bug, or is Teradata simply working as spec'd?

Any thoughts, ideas, favourite recipes gratefully accepted.

Here's the EXPLAIN...
1) First, we lock a distinct TABLE2."pseudo table" for read on a
RowHash to prevent global deadlock for TABLE2.bbb.
2) Next, we lock a distinct ptemp."pseudo table" for read on a
RowHash to prevent global deadlock for ptemp.aaa.
3) We lock TABLE2.bbb for read, and we lock ptemp.aaa for read.
4) We do an all-AMPs RETRIEVE step from TABLE2.bbb by way of an
all-rows scan with a condition of ("(TABLE2.bbb.DIRECT >=
1082) AND ((TABLE2.bbb.DIRECT <= 1812) AND (NOT
(TABLE2.bbb.CATGY IS NULL )))") into Spool 4 (all_amps), which
is redistributed by hash code to all AMPs. Then we do a
SORT/GROUP to order Spool 4 by row hash and non-aggregate fields
grouping duplicate rows. The size of Spool 4 is estimated with no
confidence to be 11,155 rows. The estimated time for this step is
0.03 seconds.
5) We do an all-AMPs JOIN step from ptemp.aaa by way of a RowHash
match scan with no residual conditions, which is joined to Spool 4
(Last Use) by way of a RowHash match scan. ptemp.aaa and Spool 4
are left outer joined using a merge join, with a join condition of
("ptemp.aaa.CATGY = CATGY"). The result goes into Spool 3
(all_amps), which is built locally on the AMPs. The size of Spool
3 is estimated with index join confidence to be 2,436,252 rows.
The estimated time for this step is 0.25 seconds.
6) We do an all-AMPs SUM step to aggregate from Spool 3 (Last Use) by
way of an all-rows scan, and the grouping identifier in field 1.
Aggregate Intermediate Results are computed locally, then placed
in Spool 6. The size of Spool 6 is estimated with index join
confidence to be 1 row. The estimated time for this step is 0.13
seconds.
7) We do an all-AMPs RETRIEVE step from Spool 6 (Last Use) by way of
an all-rows scan into Spool 1 (group_amps), which is built locally
on the AMPs. The size of Spool 1 is estimated with index join
confidence to be 1 row. The estimated time for this step is 0.03
seconds.
8) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1.

1 REPLY

Re: INTEGER/BYTEINT accumulation and presentation

Hi,

Are there any changes in the TD13 release? We faced a similar issue, where a value of 960 was converted to -64 (wrapping in multiples of 256) in TTU12; but when it was migrated to TTU13, the load of the same value of 960 failed with a numeric overflow error, since BYTEINT only allows values from -128 to 127.
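
The 960 to -64 case is the same modular wrap described in the original post: 960 = 3 × 256 + 192, and 192 read as a signed byte is -64. A rough Python sketch contrasting the two behaviours reported above (silent wrap versus overflow error; the function names are mine, not Teradata's):

```python
def wrap_byteint(value):
    """TTU12-style behaviour described above: silently keep
    the low 8 bits and read them as a signed byte."""
    low = value & 0xFF
    return low - 256 if low >= 128 else low

def checked_byteint(value):
    """TTU13-style behaviour described above: reject values
    outside the BYTEINT range -128..127 instead of wrapping."""
    if not -128 <= value <= 127:
        raise OverflowError(f"{value} is out of BYTEINT range")
    return value

print(wrap_byteint(960))   # -64  (960 mod 256 = 192, signed = -64)
```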

Regards

Shobhit