Caculate how many bytes of string

Database

Caculate how many bytes of string

oh... I was looking for this button for a long time... why not put it on the top of the page.

Dears,

I actually have 2 questions:

  1. I want to cacculate how many bytes(not characters) a string contains. Does Teradata provide the built-in function for this? I know there is the CHARACTER_LENGTH function for computing characters.
  2. How Teradata compares two characters? Does it depends on the character encoding of the database?

    For example: I want to check if '£' is greater than '¥' in the database. I think I need to know the character encoding of the DB and then check the corresponding code of the two characters to see db behave correctly, rght? If so, is there a SQL to fetch the encoding setting of this?

Thanks,

RR

Tags (1)
5 REPLIES

Re: Caculate how many bytes of string

Ricky:

1. OCTET_LENGTH():

 BTEQ -- Enter your SQL request or BTEQ command:

SELECT OCTET_LENGTH('En un lugar de La Mancha...') LENGTH_DEFAULT,

       OCTET_LENGTH('En un lugar de La Mancha...', ASCII) LENGTH_ASCII,

       OCTET_LENGTH('En un lugar de La Mancha...', UTF8) LENGTH_UTF8,

       OCTET_LENGTH('En un lugar de La Mancha...', UTF16) LENGTH_UTF16;

 *** Query completed. One row found. 4 columns returned.

 *** Total elapsed time was 1 second.

LENGTH_DEFAULT  LENGTH_ASCII  LENGTH_UTF8  LENGTH_UTF16

--------------  ------------  -----------  ------------

            27            27           27            54

2. Collations (sequences) control character ordering and comparison operations during Teradata Database sessions.

HTH.

Cheers.

Carlos.

Re: Caculate how many bytes of string

Thanks Carlos.

For the second question, I agree with you that the collations control the ordering. But how could i know the setting in current session.

For example, If the encoding is ASCII, 'a'(97) is greater than '3'(51) while 'a'(129) is less than '3'(243) if the encoding is EBCDIC. Well, I will do a little researching about this for Teradata.

Thanks,

RR

Re: Caculate how many bytes of string

Setting for current session can be displayed with help session

Here if you need to do comparision then you can set the collation according to your need and then compare. So if you say requirement is to compare EBCDIC values, then you do

SET SESSION COLLATION EBCDIC;

then you can compare values and it will be in that collation sequence.

Hope it helps

Harpreet

Re: Caculate how many bytes of string

Thanks Harpreet.

And I got another question related.

If the collation is ASCII and I want to compare two characters which are not in the scope of ASCII, how the comparation does? The character will be converted to unicode(assuming the character set is utf8) and then do the comparation? right?

For example(the collation is ascii and character set is UTF8):

SELECT (CASE WHEN ('¥'<= '£') THEN 1 ELSE 0 END) AS "newField"

0

SELECT (CASE WHEN ('¥'<= 'A') THEN 1 ELSE 0 END) AS "newField"

0

SELECT (CASE WHEN ('丼'<= 'A') THEN 1 ELSE 0 END) AS "newField"

1

How to explain the results from the above 3 commands?

Thanks

RR

Re: Caculate how many bytes of string

Dears,

I think i am clear with this problem.

Because the collation is ASCII. So, characters out of ascii scope will be considered as '?' in ascii.  And 63(HEX 3F) is the ascii code for '?'.

The following test results is from java code.

Test ASCII encoding....

丼: Decimal: 63, Hex: 3F

¥: Decimal: 63, Hex: 3F

£: Decimal: 63, Hex: 3F

?: Decimal: 63, Hex: 3F

A: Decimal: 65, Hex: 41

So, for the three charaters, the ascii of them are all less than 'A'(I had a typo in the last thread for the result of the second sql. The result should be 1).

But what I did not understand is that why ''<='¥' returns false in the first sql?(they have the same ascii code, 1 should be returned if my explaination for the encoding/decoding is right)

Also, I want to change the session collation to UTF8 with SET SESSION COLLATION UTF8/UTF-8 but failed because of Error: [Teradata Database] [TeraJDBC 14.00.00.14] [Error 3706] [SQLState 42000] Syntax error: Expected a collation value. thoughts?

Thanks,

RR