Data Science - Hidden Markov & the Elusive Genie

Teradata Employee

Algorithm Intuition

I work out of my home when I don't travel. That makes me one of the few dads who get to watch their kids come back from school for an entire week. I also have the privilege of noticing whether they are sad, upset, happy, excited, tired or angry when they come home. If your kids are middle schoolers, they probably won't tell you a thing if you ask 'How was school?' or 'What happened today?'. You would probably get the standard "Nothing!". Experienced parents generally know there are a few dominant reasons why middle schoolers exhibit one of those moods when they are back home. Here we go (really, thanks to my wife ...) - the most common ones are below:

- Funny teacher, PT class (pretty intense), Going to a common class with a friend, New Math homework, Argument with a frenemy, ...

For simplicity, assume only one "dominant" situation from the list above is allowed each day. There is also some pattern behind how these "dominant" situations change from day to day: maybe a funny teacher one day, followed by new Math homework the next day, and so on. To make it more interesting, assume each of the situations can generate any of the moods, in different ratios. Example(s):

  • A Funny Teacher produces an excited mood 50% of the time, a happy mood 49% of the time, and one of the remaining moods 1% of the time.
  • Going to a common class with a friend gets your kid excited 90% of the time and happy 10% of the time.
  • PT class gets your kid tired 30% of the time and happy 70% of the time.
  • Math Homework -> upset 70% of the time and tired 30% of the time!
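The ratios in these examples are what HMM literature calls emission probabilities: for each hidden situation, the chance of each observable mood. Here is a minimal sketch of how they could be written down in Python. The identifier names are made up for illustration, and splitting the funny teacher's "1% rest" evenly over the other four moods is an assumption, not something the examples specify.

```python
# Observable moods, as listed at the start of the post.
MOODS = ["sad", "upset", "happy", "excited", "tired", "angry"]

# P(mood | hidden situation), copied from the example ratios.
# Assumption: the funny teacher's remaining 1% is split evenly
# over the four moods that are not named (0.25% each).
EMISSION = {
    "funny_teacher": {
        "excited": 0.50, "happy": 0.49,
        "sad": 0.0025, "upset": 0.0025, "tired": 0.0025, "angry": 0.0025,
    },
    "class_with_friend": {"excited": 0.90, "happy": 0.10},
    "pt_class": {"tired": 0.30, "happy": 0.70},
    "math_homework": {"upset": 0.70, "tired": 0.30},
}


def emission_row(situation):
    """Full mood distribution for one situation, 0.0 for moods not listed."""
    row = EMISSION[situation]
    return [row.get(mood, 0.0) for mood in MOODS]


# Each situation must spread 100% of its probability over the moods.
for situation in EMISSION:
    assert abs(sum(emission_row(situation)) - 1.0) < 1e-9, situation
```

Each row adds up to 1, which is a quick sanity check worth keeping around whenever the ratios are edited by hand.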


Remember, your kid will NOT tell you what's going on - you ONLY get to observe their moods over time. Sample questions that I want to ask with observable data:


  • Can I infer what mood my kid will exhibit today or tomorrow when the door flings open, given the moods I have seen over the last X days?
  • What are the chances that the kid will be happy 2 days in a row, or sad 2 days in a row? How about sad, happy, excited in that order?
  • How many dominant factors exist that I don't see, and how do those latent factors change from day to day? How does each of those latent factors contribute to the mood?
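The first two questions can be answered with the standard HMM forward algorithm once the model's numbers are known. The sketch below is only an illustration under stated assumptions: the emission ratios follow the examples in this post (with the funny teacher's "1% rest" split evenly), while the starting probabilities and the day-to-day transition table are invented, since the post gives no values for them.

```python
MOODS = ["sad", "upset", "happy", "excited", "tired", "angry"]
STATES = ["funny_teacher", "class_with_friend", "pt_class", "math_homework"]

# P(mood | situation): from the post's examples (1% "rest" split evenly: assumption).
EMIT = {
    "funny_teacher": {"excited": 0.50, "happy": 0.49, "sad": 0.0025,
                      "upset": 0.0025, "tired": 0.0025, "angry": 0.0025},
    "class_with_friend": {"excited": 0.90, "happy": 0.10},
    "pt_class": {"tired": 0.30, "happy": 0.70},
    "math_homework": {"upset": 0.70, "tired": 0.30},
}

# Hypothetical numbers: every situation equally likely on day one, and a
# made-up pattern for how the dominant situation changes overnight.
START = {s: 0.25 for s in STATES}
TRANS = {
    "funny_teacher":     {"funny_teacher": 0.1, "class_with_friend": 0.2, "pt_class": 0.2, "math_homework": 0.5},
    "class_with_friend": {"funny_teacher": 0.3, "class_with_friend": 0.3, "pt_class": 0.2, "math_homework": 0.2},
    "pt_class":          {"funny_teacher": 0.3, "class_with_friend": 0.2, "pt_class": 0.1, "math_homework": 0.4},
    "math_homework":     {"funny_teacher": 0.4, "class_with_friend": 0.3, "pt_class": 0.2, "math_homework": 0.1},
}


def forward(moods):
    """alpha[s] = P(all observed moods, hidden situation on the last day = s)."""
    alpha = {s: START[s] * EMIT[s].get(moods[0], 0.0) for s in STATES}
    for mood in moods[1:]:
        alpha = {
            s: EMIT[s].get(mood, 0.0) * sum(alpha[p] * TRANS[p][s] for p in STATES)
            for s in STATES
        }
    return alpha


def sequence_probability(moods):
    """Chance of seeing exactly this mood sequence, summed over all hidden paths."""
    return sum(forward(moods).values())


def next_mood_distribution(moods):
    """P(tomorrow's mood | moods seen so far); assumes the sequence is possible."""
    alpha = forward(moods)
    total = sum(alpha.values())
    # Belief about tomorrow's hidden situation: today's posterior pushed through TRANS.
    tomorrow = {s: sum(alpha[p] / total * TRANS[p][s] for p in STATES) for s in STATES}
    dist = {}
    for s in STATES:
        for mood, p in EMIT[s].items():
            dist[mood] = dist.get(mood, 0.0) + tomorrow[s] * p
    return dist


print("P(happy, happy)        =", sequence_probability(["happy", "happy"]))
print("P(sad, happy, excited) =", sequence_probability(["sad", "happy", "excited"]))
print("Next mood after (happy, tired):", next_mood_distribution(["happy", "tired"]))
```

The third question, finding the latent factors and their ratios from moods alone, is the harder parameter-estimation problem and is not covered by this sketch.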

Business & Scientific use cases

One can think of a lot of situations & questions like this in business & science, where an Analytic Professional or Data Scientist needs to make decisions or predictions based ONLY on observable events, which is basically incomplete data - you never see the hidden events that drive the observable ones! Some examples:

  • Churn/Loyalty/Spending models by observing browsing behavior
  • Product "stickiness" based on usage
  • Shopping cart checkout prediction based on path taken on the website
  • If a customer went to page1, page2 and page3, where would they go next?
  • DNA sequence prediction (or gene finding), finding secondary sequences, RNA folding etc. in computational biology
  • Speech & image recognition

In all the above cases, one rarely gets to directly see or understand the LATENT or HIDDEN reasons that drive the observable events, yet still needs to provide "sound & logical" recommendations, sometimes by looking at ALL of BIG DATA! If you got this far, you've already understood what a HIDDEN MARKOV MODEL or HMM solves. The theory goes back to work by Baum and colleagues in the 1960s; the classic introduction is Rabiner's tutorial paper, which uses speech recognition as the application:

http://www.cs.ubc.ca/~murphyk/Bayes/rabiner.pdf

Today HMMs are increasingly used in business use cases that involve time series data.

Teradata recently announced Hidden Markov Model capability with the Aster Discovery Platform, as part of the Connection Analytics Suite.

The HMM algorithm is implemented on Aster's native Parallel/Scalable Graph Engine. This graph processing implementation is very different from the Apache Mahout version (Hadoop), which uses Map/Reduce. One can now build churn/loyalty/spending models on a ridiculous billion+ time series event data set from weblogs and the like, and start doing cool sequence prediction and HMM parameter estimation in Aster! You can build HMM models around customers, sessions or whatever else, and start scoring event sequences for churn or conversion likelihood, loyalty etc.

Interesting Factoid - Side bar on HMM: It's easy to jump to the conclusion that the human brain somehow uses HMM to predict events or event sequences based on observations and to guess the unseen. However, untangling an HMM is one of those cases where machines do much better than the brain (quoting this from a white paper). The brain uses a very different technique than HMM to solve sequence prediction, speech recognition etc. (you probably guessed it - neural ...), so I wouldn't want to try to do HMM in my head.

So what's with the Elusive Genie?

The traditional explanation of HMM (originally from Rabiner's paper) is a closed room with X urns. Each urn holds a mix of red, blue, green etc. balls. A Genie picks an urn according to some random process, takes a ball from it and puts the ball on a conveyor belt, then picks the next urn, and so on. Each urn can have a completely different distribution of colored balls: some may have more red, others more blue etc. As an observer you only get to see the colored balls come off the belt one after another. You have no idea how many urns there are in the closed room or which urn each colored ball came from! A Hidden Markov Model is all about inferring the # of urns in the closed room, the distribution of colored balls in each of the urns, and the pattern in which the Genie changes its mind about the urn each time it picks a ball ... - all just by looking at the conveyor belt for a while and applying HMM algorithms .... - Wow
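To get a feel for what the observer is up against, the closed room can be simulated. This is a rough sketch, not anything from Rabiner's paper: the number of urns, their contents and the Genie's habits are all invented, and `run_genie` is a made-up helper name. The function returns both the hidden urn sequence and the belt, but a real observer would only ever see the belt.

```python
import random

# Hypothetical room: three urns with different mixes of colored balls.
BALLS_IN_URN = {
    "urn_1": {"red": 0.7, "blue": 0.2, "green": 0.1},
    "urn_2": {"red": 0.1, "blue": 0.8, "green": 0.1},
    "urn_3": {"red": 0.2, "blue": 0.2, "green": 0.6},
}

# Hypothetical Genie habits: chance of moving to each urn, given the current one.
GENIE_MOVES = {
    "urn_1": {"urn_1": 0.6, "urn_2": 0.3, "urn_3": 0.1},
    "urn_2": {"urn_1": 0.2, "urn_2": 0.5, "urn_3": 0.3},
    "urn_3": {"urn_1": 0.3, "urn_2": 0.3, "urn_3": 0.4},
}


def weighted_pick(rng, table):
    """Draw one key from a {key: probability} table."""
    keys = list(table)
    return rng.choices(keys, weights=[table[k] for k in keys])[0]


def run_genie(n_balls, seed=0):
    """Return (hidden urn sequence, observed ball colors on the belt)."""
    rng = random.Random(seed)
    urn = rng.choice(list(BALLS_IN_URN))  # the Genie starts at a random urn
    hidden, belt = [], []
    for _ in range(n_balls):
        hidden.append(urn)
        belt.append(weighted_pick(rng, BALLS_IN_URN[urn]))  # ball goes on the belt
        urn = weighted_pick(rng, GENIE_MOVES[urn])          # Genie changes its mind
    return hidden, belt


hidden, belt = run_genie(15, seed=42)
print("What you see on the belt:", belt)
print("What stays in the room:  ", hidden)
```

Everything an HMM algorithm has to work with is the first printed line; the two tables and the second line are exactly what it tries to recover.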

Links to "Fire hose Math" for the inquisitive Data Scientist/Explorer:

Wikipedia:

http://en.wikipedia.org/wiki/Hidden_Markov_model

From Quora:

http://www.quora.com/What-are-some-good-resources-for-learning-about-Hidden-Markov-Models

and Khan Academy:

https://www.khanacademy.org/computing/computer-science/informationtheory/moderninfotheory/v/markov_c...


Reposted from Linkedin