SQL-MR: CumulativePivotMatrix

Aster Field Strong
Teradata Employee

THIS IS A CUSTOM SQL MR FUNCTION AND IS NOT SUPPORTED BY TERADATA ENGINEERING, CLIENT SUPPORT, OR THE FIELD. IF YOU FIND A BUG, PLEASE LET ME KNOW AND I WILL TRY TO FIX IT AS SOON AS POSSIBLE.

For an Aster PoV at a cable TV customer, I developed some custom SQL-MR functions to help handle certain datasets. Some of these functions were written to be reusable, though each with a single purpose in mind. We have since found them very useful in several other cases, so I have decided to share some of them.

Attached is the CumulativePivotMatrix SQL-MR function (a partition function, similar to a reduce). It pivots detail-level data, given as partition, category and value, into a table holding a matrix of all the partitions (e.g. customers, users) against the categories, with the values summed per category. The function supports up to 1597 columns.
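To make the pivot concrete, here is a minimal Python sketch of the same idea. It is not the function's code and does not use its SQL-MR syntax. The name cumulative_pivot, the sample rows, and the choice to fill missing partition/category combinations with 0.0 are all assumptions made for illustration, and the 1597-column limit is not modelled.

```python
from collections import defaultdict


def cumulative_pivot(rows):
    """Pivot (partition, category, value) rows into one summed row per partition.

    Hypothetical helper for illustration only; not the SQL-MR function itself.
    """
    totals = defaultdict(lambda: defaultdict(float))
    categories = set()
    for partition, category, value in rows:
        # Accumulate the value for this partition/category pair.
        totals[partition][category] += value
        categories.add(category)

    # One output column per distinct category, in a stable order.
    columns = sorted(categories)
    matrix = {
        # Assumption: combinations with no detail rows are reported as 0.0.
        partition: [by_category.get(category, 0.0) for category in columns]
        for partition, by_category in totals.items()
    }
    return columns, matrix


# Made-up detail-level data: (customer, channel, hours watched).
rows = [
    ("cust_1", "news", 1.5),
    ("cust_1", "sports", 2.0),
    ("cust_1", "news", 0.5),
    ("cust_2", "movies", 3.0),
    ("cust_2", "sports", 1.0),
]

columns, matrix = cumulative_pivot(rows)
print(columns)
for customer in sorted(matrix):
    print(customer, matrix[customer])
```

Each input row is one detail record. The output has one row per customer and one column per channel, with repeated customer/channel pairs added together.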

The attached .rar file includes:

  • Function jar file
  • User manual
  • Summary slide
  • Step-by-step first demo

The original goal of the function was to get a matrix of TV hours watched by customer and by channel or category:

summary.JPG

The demo includes a dataset to help you understand how it works. We have found the function very useful for data mining, segmentation, text analytics combined with n-grams, and so on.

If you need further information or find a bug, let me know.

Regards,

Ignacio