Accidental Cartesian Join Bug

Database

Accidental Cartesian Join Bug

I've recently noticed that the Teradata parsing engine doesn't pick up on / given an error for 'accidental cartesians' or, to use another term to make it more clear why I think this is a significant bug, 'unreferenced / unqualified tables'.

For example, Teradata allows me to execute a query selecting customers with addresses that are valid as at now:

SELECT
Customers.CustomerID
FROM
Customers
INNER JOIN Addresses
ON Customers.CustomerID = Addresses.CustomerID
WHERE
Addresses.End_Date > CURRENT_TIMESTAMP

All well and good.

If I then write another query, for example, to identify customers that have a valid phone number, I might choose to copy and paste the above code and adjust it accordingly:

SELECT
Customers.CustomerID
FROM
Customers
INNER JOIN Phone_Numbers
ON Customers.CustomerID = Phone_Numbers.CustomerID
WHERE
Addresses.End_Date > CURRENT_TIMESTAMP

NOT all well and good. I missed the reference to "Addresses" in the WHERE condition and in doing so have created an accidental cartesian join and it's returning every customer that has had a phone number recorded (at some point) repeated for every address entry that is valid regardless of whether that address relates to the CustomerID or not!

I'm amazed that Teradata doesn't notify me of the massive error I've made. I should absolutely be getting a notification that I've used a table that hasn't been referenced in a FROM clause. When would the current behaviour ever be desirable?

This bug isn't limited to WHERE conditions, it will occur if you reference a table in the SELECT portion. It will also occur if you use an alias in the FROM clause but fail to use it elsewhere.

I could only find a couple of other references to it:

http://www.scribd.com/doc/78503320/43/Answer (page 93)

http://itpro420.blogspot.com.au/2010/07/teradata-optimizer-anomaly-to-beware-of.html

Has anyone else on these forums encountered this issue? How does one go about requesting that it be fixed?

2 REPLIES
Senior Apprentice

Re: Accidental Cartesian Join Bug

This is a well known artifact, it's usually encoutered when you forget to use an alias name.

Teradata was implemented before there was Standard SQL, the initial query language was called (AFAIK) TEQUEL (TEradata QUEry Language), whose syntax didn't require to list tables within FROM.

A simple "RETRIEVE Adresses.EndDate" carried enough information for the Parser to resolve tablename and columnname (RETRIEVE is still allowed in BTEQ, only SQL Assistant complains about "invalid query" and refuses to submit it).

When they switched to Standard SQL this old syntax was not removed, IMHO there should be a switch to disable it.

Dieter

Enthusiast

Re: Accidental Cartesian Join Bug

Are there any plans to fix this "feature" or provide a way to disable it as Dieter suggests?