GraphGen Redux: Moving Beyond Modularity for Node Color

Aster Field Strong
Teradata Employee

GraphGen was put on a shelf to collect dust when the "official" Visualizer function was released. Those of us who believed in it continued to use it. Since it is now clear that development and support for innovative technology initiatives like this must often come from the field, I have resurrected GraphGen to add a new feature.

Others and I have clamored for years for the ability to color nodes based on values other than modularity class. While there is still some work to be done, I feel comfortable releasing this new version of GraphGen, which adds this capability to the Sigma graph. There may still be some kinks to iron out, and I welcome feedback.


Please allow me to provide a couple of examples of where this new capability will be useful:


1. Market Basket Analysis


The Aster AppCenter "Retail - Market Basket & Product Recommender" app runs a collaborative-filter-based market basket analysis to figure out which items are most frequently purchased with other items, i.e., product affinity. The resulting graph visualization looks like this:



In this visualization, the nodes are colored according to "modularity class". GraphGen has used the Gephi library to run the modularity clustering algorithm and assign a modularity class value to each node. This is incredibly useful as we seek to understand how the grocery items are clustered, but other associated (dimension) values are likely interesting as well.


For example, now that we have seen which items are clustered together based on how frequently they are bought together, how about contrasting that with where they are located in the store? In other words, is there an immediately noticeable correlation between which items are most frequently purchased together and the aisle in which they are stocked?



You can see in the screenshot above that we have selected "aisle" from the Node Color drop-down menu, and we can now see which aisle is associated with each item. Additionally, we can flip over to the "Layout" tab to see the legend, which shows which color is associated with which aisle number.



This type of analysis can help us decide whether it makes sense to arrange items differently in the store, either to help shoppers complete their trips faster or to drive shoppers to different parts of the store.
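
The aisle-versus-cluster comparison can also be checked with numbers once the eye has spotted a pattern. What follows is only a minimal Python sketch, not part of GraphGen: it assumes the node list has been exported with one modularity class and one aisle per item, and the item names, class numbers and aisle numbers are made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical export: (item, modularity_class, aisle). All values are invented.
nodes = [
    ("milk", 0, 1),
    ("butter", 0, 1),
    ("bread", 0, 4),
    ("chips", 1, 7),
    ("salsa", 1, 7),
    ("soda", 1, 9),
]

def aisle_mix_by_cluster(rows):
    """Count how many items from each aisle fall into each modularity class."""
    mix = defaultdict(Counter)
    for _item, cluster, aisle in rows:
        mix[cluster][aisle] += 1
    return mix

def dominant_aisle_share(rows):
    """For each cluster, return (most common aisle, share of the cluster in it)."""
    result = {}
    for cluster, counts in aisle_mix_by_cluster(rows).items():
        aisle, n = counts.most_common(1)[0]
        result[cluster] = (aisle, n / sum(counts.values()))
    return result

shares = dominant_aisle_share(nodes)
for cluster, (aisle, share) in sorted(shares.items()):
    print(f"cluster {cluster}: aisle {aisle} holds {share:.0%} of its items")
```

A cluster whose items mostly sit in one aisle is already stocked together. A cluster spread across many aisles is the more interesting candidate for a layout change.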


2. Telco Call Networks


The Aster AppCenter "Telco - Network Finder" app visualizes which callers are most frequently speaking with one another. By constructing these networks, we hope to gain a better understanding of social influencers and of how trends move through our customer population. The resulting graph visualization looks like this:



In this visualization, the nodes are again colored according to "modularity class." We are able to see which customers are speaking with one another, but again, there are other associated variables that might be interesting and revealing as we drill down further into these clusters. For example, do all of the customers in a specific cluster live in the same state? Are they similar in age? Here we select phone "make" to see whether customers in a given cluster are using the same make of phone.



And with the legend for clarification:



Perhaps customers who own Apple phones are more likely to speak with one another. Or perhaps it is those who own Nokia phones who are more tightly connected. Or perhaps it makes no difference at all... It is the ability to test this hypothesis in seconds rather than minutes or hours that makes this feature a valuable add-on to the GraphGen Sigma graph visualization.
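
If the colors suggest a pattern, a quick count can back it up. The Python sketch below is an illustration under stated assumptions rather than anything GraphGen does: it assumes the call edges and a phone make per caller are available as simple Python structures, and the caller ids and makes are invented.

```python
# Hypothetical data: phone make per caller and undirected call edges.
make = {"a": "Apple", "b": "Apple", "c": "Nokia", "d": "Nokia", "e": "Apple"}
calls = [("a", "b"), ("a", "e"), ("c", "d"), ("b", "c")]

def same_attribute_share(edges, attr):
    """Fraction of edges whose two endpoints share the same attribute value."""
    if not edges:
        return 0.0
    same = sum(1 for u, v in edges if attr[u] == attr[v])
    return same / len(edges)

share = same_attribute_share(calls, make)
print(f"{share:.0%} of calls connect callers with the same phone make")
```

A high share on its own proves little when one make dominates the customer base, so it is worth comparing the figure against what random pairing of the same callers would produce.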



How does it work?

To use this new node coloring functionality, simply use the multi-input capability of GraphGen, which is described in the GraphGen documentation. Any field that is included in the dimension table will be available in the Node Color drop-down menu.
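
Conceptually, coloring by a dimension field means giving each distinct value of that field its own color and reusing the same mapping for the legend. The Python sketch below illustrates only that idea. It is not GraphGen's implementation, and the node ids, field names and palette choice are all hypothetical.

```python
import colorsys

# Hypothetical dimension rows keyed by node id; field names are illustrative.
dimension = {
    "milk": {"aisle": 1, "department": "dairy"},
    "butter": {"aisle": 1, "department": "dairy"},
    "chips": {"aisle": 7, "department": "snacks"},
}

def build_legend(rows, field):
    """Assign one evenly spaced hue to each distinct value of `field`."""
    values = sorted({row[field] for row in rows.values()}, key=str)
    legend = {}
    for i, value in enumerate(values):
        r, g, b = colorsys.hsv_to_rgb(i / len(values), 0.65, 0.9)
        legend[value] = "#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)
        )
    return legend

def color_nodes(rows, field):
    """Return (node -> color, legend) for the chosen dimension field."""
    legend = build_legend(rows, field)
    return {node: legend[row[field]] for node, row in rows.items()}, legend

colors, legend = color_nodes(dimension, "aisle")
for value, color in legend.items():
    print(f"aisle {value}: {color}")
```

Calling color_nodes(dimension, "department") instead would recolor the same nodes by department, which mirrors what picking a different entry in the drop-down does.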



Where do I get the latest version?

Please speak to your friendly, neighborhood Teradata person.