1. Ensure statistics are collected on join columns - This is especially imporatant if the column you are joining on are not unique. If your join column only has a few hundred unique values, TD might redistribute the table on those columns, and duplicate it on all AMPs. This could take an extremely long time when you have a large table. If it's a small table, then a product join is usually the least expensive.
2. Make sure you are referencing the correct alias (if you have one). The query below will result in a product join (because you are joining a to a, and nothing to b)
select a.column1 from mytable1 a inner join mytable2 b on a.column1 = a.column1
Also, if you have an alias, you must always reference it instead of a fully qualified tablename. The query below will result in a product join:
select mytable1.column1 from mytable1 a inner join mytable2 b on a.column1 = a.column1 ;
3. Sometimes product joins happen for a good reason. Joining a small table (100 rows) to a large table (1 million rows) a product join does make sense.
If you post your query, then we might be able to pick it apart, but without knowing the underlying data it could prove difficult.....still it wouldn't hurt to have another pair of eyes on it.
I you follow the explain plan, you will see that Teradata is reading the small table first. This table is then duplicated (copied) to all amps. Now the big table and the small (duplicated) table are joined with a product join. Eg. A small table with 100 rows (on 50 amps). The result after duplication are 50 tables with 100 rows each (50 * 100 = 5000 rows). This result (spool) is joined with your big table by using a product join.
This kind of ‘product join’ is very fast and different from product joins when statistics are not up to date or when the correct join condition is not added (well) in your query.