Data Science - Assoc. Mining & Circle of Similarity - Part 1

Learn Aster
Teradata Employee

Amazon popularized associative mining many years ago with features on its website:

  • Customers who bought A also bought B
  • Customers who viewed this product also viewed that product

The underlying technique is to measure co-occurrence and/or co-viewing - to find out how customers view or buy things together. This in turn can be used to build recommendation engines. Associative mining is a powerful way to find insights into consumer behavior from historical data.
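As a rough illustration of measuring co-occurrence, here is a minimal Python sketch. The baskets, the product names and the helper functions `co_occurrence` and `also_bought` are all made up for this example; this is not how Amazon or Aster implement it, just the basic counting idea.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets: each set is one customer's order.
baskets = [
    {"chips", "salsa"},
    {"chips", "salsa", "soda"},
    {"chips", "guacamole"},
    {"soda", "pretzels"},
]

def co_occurrence(baskets):
    """Count how many baskets contain each unordered pair of products."""
    pairs = Counter()
    for basket in baskets:
        # sorted() gives every pair a canonical (a, b) order with a < b
        for a, b in combinations(sorted(basket), 2):
            pairs[(a, b)] += 1
    return pairs

def also_bought(product, pairs, top_n=3):
    """Rank products by how often they share a basket with `product`."""
    scores = Counter()
    for (a, b), n in pairs.items():
        if a == product:
            scores[b] += n
        elif b == product:
            scores[a] += n
    return scores.most_common(top_n)

pairs = co_occurrence(baskets)
print(also_bought("chips", pairs))
```

In this toy data, salsa shares two baskets with chips, so it comes out on top of the "also bought" list for chips.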

SCENARIO #1: While it's great to find out that "People who buy chips often buy salsa", one can see why it becomes problematic when a website recommends salsa to every user who buys chips. The purchase history generated will dominate the recommendations - which can turn into a self-fulfilling prophecy, reinforcing past behavior into future buys. Not good for other products that we want to sell with chips!

One workaround is to take the #2, #3, #4, ... products sold with chips and rotate through them round-robin, using some 'weighted sum' method or a better formula. Now the recommendations are not stale, but there is still the problem of 'personalization'! If 8 different products were sold most often with chips in the last few months, which one should a website recommend first, or use to create the next best offer? How can the website know for sure that some other product would not be a better candidate than the Top N products sold with chips?
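One possible way to rotate the Top N products is a weighted random pick, sketched below in Python. This is only one reading of the 'weighted sum' idea; the counts, the product names and the function `pick_recommendation` are assumptions made up for illustration.

```python
import random

# Hypothetical co-purchase counts for products sold with chips.
sold_with_chips = {"salsa": 120, "guacamole": 45, "soda": 30, "queso": 5}

def pick_recommendation(counts, rng):
    """Pick one product, with probability proportional to its count."""
    products = list(counts)
    weights = [counts[p] for p in products]
    return rng.choices(products, weights=weights, k=1)[0]

# A seeded generator keeps the example repeatable.
rng = random.Random(42)
shown = [pick_recommendation(sold_with_chips, rng) for _ in range(1000)]
print({p: shown.count(p) for p in sold_with_chips})
```

Salsa still shows up most often, but the other products get some exposure too. Note that every shopper still draws from the same global list, which is the personalization gap described above.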

SCENARIO #2: This is based on an idea that came out of Stanford/Twitter - 'Who to follow'. It is an algorithm that gives a Twitter user a 'People you may want to follow' list based on whom their friends are following. Let's say we are able to find out how similar users are to each other with some algorithm. We find out that user1 is *MOST* similar to user2, user3, user4 and user5 because all of them have bought products in the same way (not just chips) or do a bunch of things in common. User1 just put chips in the shopping cart on the website. If we are able to find out what user2, user3, user4 and user5 (who are similar to user1) bought along with chips, then it's likely that we may get a better answer than salsa or other globally rated products! Maybe a 'fancy guac dip' which people like user1 buy often? The hypothesis is that recommendations made by consulting the 'circle of similarity' are more personalized and relevant, and lead to better CTR and eventually conversions. Read the continuation of this concept in Part 2 of the blog post.
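To make the 'circle of similarity' idea concrete, here is a small Python sketch. It uses plain Jaccard similarity on purchase sets as a stand-in for "some algorithm"; it is not the Twitter or Aster method. The purchase histories and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical purchase histories (user -> set of products bought).
history = {
    "user1": {"chips", "beer", "limes"},
    "user2": {"chips", "beer", "limes", "guac dip"},
    "user3": {"chips", "beer", "guac dip"},
    "user4": {"chips", "salsa", "diapers"},
    "user5": {"salsa", "diapers", "milk"},
}

def jaccard(a, b):
    """Shared items divided by all items across the two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def circle_of_similarity(user, history, k=2):
    """Return the k users whose purchases overlap most with `user`."""
    scores = {
        other: jaccard(history[user], items)
        for other, items in history.items()
        if other != user
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

def personalized(user, trigger, history, k=2):
    """Count what similar users bought along with `trigger`,
    skipping anything `user` already has."""
    votes = Counter()
    for peer in circle_of_similarity(user, history, k):
        if trigger in history[peer]:
            votes.update(history[peer] - history[user] - {trigger})
    return votes.most_common()

print(circle_of_similarity("user1", history))
print(personalized("user1", "chips", history))
```

Here user1's circle is user2 and user3, and both of them bought 'guac dip' along with chips, so that is what gets suggested instead of the globally popular salsa.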

    • For Global recommendations using a Collaborative Filter algorithm, see John Thuma's video tutorial that explains this succinctly ...

    • Personalized recommendations using Personalized SALSA (no pun intended). SALSA stands for Stochastic Approach for Link-Structure Analysis. PSALSA is implemented in Aster by leveraging the Graph Engine and part of the Aster foundation. The algorithm finds the circle of similarity for each user by looking at common purchases and builds what is known as a 'soft cluster' among users ...

So which algorithm to use? Personalized recommendations are good if the recommendation engine can find similar users with confidence. If not, one can always fall back to Global recommendations!
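The fallback rule could look something like the Python sketch below. The thresholds `min_similarity` and `min_peers`, and the function `recommend` itself, are assumptions for illustration; what counts as "with confidence" would have to be tuned on real data.

```python
def recommend(peer_scores, personalized_recs, global_recs,
              min_similarity=0.5, min_peers=2):
    """Use personalized picks only when enough similar users exist.

    peer_scores maps each candidate peer to a similarity in [0, 1].
    Both thresholds are hypothetical defaults, not tuned values.
    """
    confident = [u for u, s in peer_scores.items() if s >= min_similarity]
    if len(confident) >= min_peers and personalized_recs:
        return personalized_recs
    return global_recs

# A shopper with a solid circle of similar users gets personalized picks.
print(recommend({"user2": 0.75, "user3": 0.5}, ["guac dip"], ["salsa"]))

# A new shopper with weak matches falls back to the global list.
print(recommend({"user4": 0.2}, ["diapers"], ["salsa"]))
```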

Global and personalized recommendations don't have to be limited to retail data. One can create recommendations around financial, communications, manufacturing or health care data and provide interesting suggestions to users for navigating products and services. The algorithms are also helpful for gaining insight into customer purchasing behavior or product usage, which can increase profitability!

Disclaimer: I don't mean to trivialize the design of a recommendation engine with just the two algorithms above. There are 100+ ways to recommend products beyond affinity or circle of similarity methods. As a marketer, one could be using margins, inventory, seasonality, search keywords, competition, promotions, past buys etc. That said, these algorithms provide sound models from a purely data driven approach, which one can always tweak with creativity - not to mention super scalability across billions of rows of transactions and millions of customers ...