Using SQL to Query for NaN

Database

Using SQL to Query for NaN

I have a table (I'll call this DTM) which has a column of the type FLOAT (I'll call this VAL).  This table currently includes rows in which the value of that column is NaN (not a number).  We are inserting these values through DataStage / Java, which is taking the values of "NaN", "+Infinity" and "-Infinity" (all legit IEEE 754 floats) from a text file and is (somehow) effectively inserting those into the database.

I would like to run a query which calculates the sum of the squared values of VAL, while excluding the values which are NaN or whose absolute values exceed 1e+21 (which also happens to exclude +Infinity and -Infinity).  For instance:

    SELECT SUM(VAL**2) FROM DTM WHERE ABS(VAL) < 1e+21 AND NOT VAL = CAST ('NaN' AS FLOAT)

The above fails with error 2620, "The format or data contains a bad character.", specifically on the CAST.  I've tried simply "... AND NOT VAL = 'NaN'", which also fails for a similar reason (3535, "A character string failed conversion to a numeric value.").  I cannot seem to figure out how to represent NaN within the SQL statement.

Even if I could represent NaN successfully in an SQL statement, I would be concerned that the comparison would fail.  According to the IEEE 754 spec, NaN = NaN should evaluate to false. What I really seem to need is an IsNaN() function.  Yet that function does not seem to exist.  Of course, if I *could* represent NaN within a Teradata SQL statement, and NaN = NaN happened to evaluate to true, I would not complain.

I would appreciate any help you could provide!

Tags (4)
1 REPLY

Re: Using SQL to Query for NaN

I have figured out a work-around, and I'll share that with those of you searching for a solution.

First of all, it's obvious to me that Teradata's handling of NaN floating point values is incomplete and any behavior I've stumbled into figuring out is likely unintentional and inconsistent across different versions.  So I offer the following free advice, with no guarantees, promises, or liabilities of any kind.  Caveat emptor.

I'll start with the following desire:

SELECT
  SUM(1) AS CNT_VAL,
SUM(VAL) AS SUM_VAL,
SUM(VAL**2) AS SUM_VAL_SQR
FROM
  DTM

Since my DTM table has values which are NaN, I get error [2622], "Bad argument for ** operator."  After drilling down into the data, I find that if I CAST the FLOAT value to a VARCHAR(50), the NaN values come out as a string of 22 asterisks ("**********************").  I can cast to a VARCHAR(1) instead, and the NaN comes out as a single asterisk "*".  That comparison isn't all that bad.

WHERE
  NOT CAST (VAL AS VARCHAR(1)) = '*'

Adding the WHERE clause above does the trick for screening out the NaNs.  But I also have some large bogus values I would like to screen out as well, so I alter the WHERE clause some more:

WHERE
  NOT CAST (VAL AS VARCHAR(1)) = '*' AND
  NOT ABS(VAL) > 1.0e+21

Here, I have a problem again.  I get error [2651], "Operation Error computing expression involving _____."  After much investigation, I found the problem here is in the equality portion of the floating point comparison.  What "equality portion," you ask?  What seems to be happening here is that the "NOT >" is transformed to "<=" under the covers.  I get the same [2651] error if I use "ABS(VAL) <= 1.0e+21".  "NOT ABS(VAL) >= 1.0e+21" succeeds, but I decided this is more straight-forward:

WHERE
  NOT CAST (VAL AS VARCHAR(1)) = '*' AND
  ABS(VAL) < 1.0e+21

Equality with the exact floating point value of 1.0e+21 is irrelevant, so functionally this works and accomplishes what I'm trying to do.

The "ABS(VAL) < 1.0e+21" also screens out the NaN values, and thus it seems like I could remove the CAST and comparison to '*'.  But this seems to work only intermittently.  I cannot reproduce this now, but I have in my history examples of "ABS(VAL) <= 1.0e+21" both working and not working for screening out NaN.  Now, I can't get "<=" to work at all.   So, I'm leaving the CAST in place, knowing there's a performance hit, but I'm only doing this for a short time as we have stopped the loading of NaN into the table from this point forward.

For the search engines, while investigating this I also ran into errors [2650], "Numeric processor operand error."; [2621], "Bad character in format or data of _____."; and [2623], "Bad argument involving _____ for ** operator." all related to the NaN floating point values.  Hopefully this helps future searchers.