THIS IS A CUSTOM SQL-MR FUNCTION AND IS NOT SUPPORTED BY TERADATA ENGINEERING, CLIENT SUPPORT, OR THE FIELD. IF YOU FIND A BUG, PLEASE LET ME KNOW AND I WILL TRY TO FIX IT AS SOON AS POSSIBLE.
For an Aster PoV at a cable TV customer, I developed several custom SQL-MR functions to help handle some of the datasets. Some of them were written to be reusable, though each serves a single purpose. We have since found them very useful in several other cases, so I have decided to share a few.
Attached is the BooleanPivotMatrix SQL-MR function (a partition function, similar to a reduce). It is similar to the custom CumulativePivotMatrix SQL-MR function: it pivots detail-level data, given by partition and category, into a table that forms a matrix of all the partitions (e.g. customers, users) and shows whether each category appears (1) or not (0) for that partition. The function allows up to 1597 columns.
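To make the pivot concrete, here is a minimal Python sketch of the same idea. It is not the function's implementation and does not use its SQL-MR syntax. The name `boolean_pivot`, the sample data, the alphabetical column order and the error raised above the column limit are all assumptions made for illustration; only the 0/1 semantics and the 1597-column limit come from the description above.

```python
from collections import defaultdict

MAX_COLUMNS = 1597  # column limit stated in the post


def boolean_pivot(rows, max_columns=MAX_COLUMNS):
    """Pivot (partition_key, category) detail rows into a 0/1 matrix.

    Hypothetical helper: returns the ordered category list and a dict
    mapping each partition key to a list of 0/1 flags, one per category.
    """
    rows = list(rows)

    # One output column per distinct category (sorted order is an assumption).
    categories = sorted({category for _, category in rows})
    if len(categories) > max_columns:
        # Assumed behaviour; the post only says the limit exists.
        raise ValueError(f"{len(categories)} categories exceed {max_columns} columns")

    # Collect the set of categories seen for each partition key.
    seen = defaultdict(set)
    for key, category in rows:
        seen[key].add(category)

    # 1 if the category appears for the partition, 0 otherwise.
    matrix = {
        key: [1 if category in cats else 0 for category in categories]
        for key, cats in seen.items()
    }
    return categories, matrix


# Made-up detail rows: (customer, channel category). Duplicates are harmless.
detail = [
    ("cust_1", "news"),
    ("cust_1", "sports"),
    ("cust_2", "movies"),
    ("cust_1", "news"),
    ("cust_3", "sports"),
]

categories, matrix = boolean_pivot(detail)
print(categories)
for customer in sorted(matrix):
    print(customer, matrix[customer])
```

Each customer ends up with a single row of flags, so repeated detail rows for the same customer and category do not change the result.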
The attached .rar file includes:
Function jar file
Step-by-step first demo
The original goal of the function was to get a matrix of TV hours watched by customer and by channel or category.
The demo includes a dataset to help you understand how the function works. We have found it very useful for data mining, segmentation, and minhash reverse engineering.
If you need further information or find a bug, let me know.