I've just started with Teradata, and this is my first post here.
I need to create a table and load data into it from a file. After that, multiple SELECT statements will be run against it using OLAP functions with PARTITION BY on the first four columns. Would it be better to partition the table data on those first four columns?
My guess is that if an OLAP function with PARTITION BY is used and the records of one partition are spread across multiple AMPs, inter-AMP communication is required, which could be avoided if all records of the group were on one AMP.
Does this assumption make sense?
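Each partition in a PPI table is still distributed across all AMPs, so partitioning the table does not put a group's rows on one AMP; only the Primary Index (PI) decides which AMP a row lands on. But newer releases of Teradata will do the STAT steps with a spool "built locally" when the PI columns are part of the PARTITION BY.
So simply try a PI on the first four columns and check the EXPLAIN output.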
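A minimal sketch of that test, in case it helps. The table name `olap_demo`, the column names `col1` to `col4` and `amount`, and the data types are all made up for illustration; substitute your own definitions.

```sql
-- Hypothetical table: names and types are placeholders, not from the original post.
CREATE MULTISET TABLE olap_demo (
    col1   INTEGER NOT NULL,
    col2   INTEGER NOT NULL,
    col3   INTEGER NOT NULL,
    col4   INTEGER NOT NULL,
    amount DECIMAL(18,2)
)
PRIMARY INDEX (col1, col2, col3, col4);  -- rows with equal values in all four columns hash to the same AMP

-- Ask the optimizer for the plan of an OLAP query that partitions by the PI columns.
EXPLAIN
SELECT col1, col2, col3, col4, amount,
       SUM(amount) OVER (PARTITION BY col1, col2, col3, col4
                         ORDER BY amount
                         ROWS UNBOUNDED PRECEDING) AS running_total
FROM olap_demo;
```

In the plan, look at the step feeding the STAT FUNCTION: if its spool is described as built locally on the AMPs, no redistribution takes place; if it is redistributed by hash code, the rows are being moved between AMPs. The exact wording of the EXPLAIN text may differ between releases.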