5 Reasons Why Aster is Great for Machine Learning

Teradata Employee

Happy New Year and welcome to 2016!

Many of you have read Karthik Guruswamy's articles about Machine Learning in the Aster Community, and they are among the best!  I would like to record what I have learned from his articles and from my own experience, and to highlight why Aster is a great platform for Machine Learning.  I am going to give you five reasons.  I am sure there are more, and I welcome your input.  Machine learning is a very powerful predictive methodology that can be supervised (trained on labeled examples) or unsupervised.  It is used for predicting outcomes and for classification.  Link to Karthik's blogs: (KARTHIK)

1.  Use More Data/Use More Sources of Data:  With more data I don't have to sample.  When I sample, I can make huge mistakes if the sample does not represent the actual data population.  Aster allows you to scale to very large data sets across many different channels or sources.  I can look at data from traditional systems such as sales and CRM - transactions.  I can also look at behavioral data such as clickstream, IVR, and call center notes - interactions.  Having more data from a variety of sources at scale means I don't have to make assumptions due to sampling, which increases the likelihood of success.
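
To make the sampling risk concrete, here is a minimal Python sketch. It is not Aster code: the population size, the 0.5% churn rate, and the function names are all assumptions chosen for illustration. It compares the churn rate measured on the full population with the rate estimated from repeated small samples.

```python
import random

def rare_event_rate(records):
    """Fraction of records flagged True (the rare event)."""
    return sum(records) / len(records)

def sample_rates(population, sample_size, trials, seed=0):
    """Estimate the rare-event rate from several independent random samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        rare_event_rate(rng.sample(population, sample_size))
        for _ in range(trials)
    ]

# Hypothetical population: 100,000 customers, 500 of whom churned (0.5%).
population = [True] * 500 + [False] * 99_500

full_rate = rare_event_rate(population)
rates = sample_rates(population, sample_size=200, trials=20)

print(f"rate on full data:      {full_rate:.4f}")
print(f"lowest sample estimate:  {min(rates):.4f}")
print(f"highest sample estimate: {max(rates):.4f}")
```

With only 200 rows per sample, a single churner more or less moves the estimate by half a percentage point, so the sample estimates scatter around the true rate. Working on the full data set removes that source of error.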

2.  Feature Creation/Normalization/Selection:  Aster helps you improve accuracy by explaining the variance in a data set.  Once I can explain the variance in my data set, I can see which variables may be correlated and use that to create new features.  I can also quickly normalize my variables.  I have the full power of ANSI SQL and Aster SQL-MR/GR at my disposal.  There is an excellent blog post by Greg Bethardy on GLM and Feature Selection.  Link to Greg's blogs: (GREG)
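
As a rough illustration of normalization and correlation checks, the sketch below uses plain Python rather than Aster SQL or SQL-MR. The column names and values are made up, and z-score scaling is just one common normalization choice, not necessarily the one you would use in Aster.

```python
from statistics import mean, pstdev

def z_score(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no variance; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def pearson(xs, ys):
    """Pearson correlation, computed as the mean product of z-scores."""
    zx, zy = z_score(xs), z_score(ys)
    return mean(a * b for a, b in zip(zx, zy))

# Hypothetical customer variables (illustrative values only).
monthly_spend = [20, 40, 60, 80, 100]
support_calls = [5, 4, 3, 2, 1]

normalized_spend = z_score(monthly_spend)
r = pearson(monthly_spend, support_calls)

print([round(z, 3) for z in normalized_spend])
print(f"correlation between spend and calls: {r:.2f}")
```

A strong correlation like the one in this toy data suggests the two variables carry overlapping information, which is a hint for either dropping one or combining them into a new feature.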

3.  Multi-Genre Analytic Algorithms:  Aster comes with many types of analytic algorithms.  They cover data transformation, graph, statistics, classification, regression, dimension reduction, clustering, text, machine learning, and many more.  The form factor for these genres is consistent and easy to learn.  We have removed the syntax and technology barriers to implementing these algorithms.  With Aster I don't have to move my data to a different tool to take advantage of these methods, and I don't have to learn a new form factor or syntax.  For more information about this please check out: (Aster Multi-Genre Analytic 101 videos)

4.  Tool Bagging:  With Aster I can use data from a variety of other tools including SAS, R, SPSS, and others.  Aster allows you to take outputs from other statistical and analytic packages and combine, test, and tune those results.  This lets me test the accuracy of the models I created in Aster, and it also lets me combine them with the others to form a better answer.  For instance, did my Aster model find all of the potential churners that my SAS model found?  What new churners did I find in Aster that I was not able to find in SAS?  With AsterR you are able to get beyond the limitations of traditional R with respect to scale and performance.  For more information about AsterR please check out some great blogs by Roger Fried: (ROGER)
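
The churner questions above come down to set comparisons. The sketch below is plain Python, not an Aster or SAS API. It assumes each tool's output has already been exported as a list of customer IDs, and the IDs and the function name are hypothetical.

```python
def compare_models(ids_a, ids_b):
    """Compare the customers flagged by two models."""
    a, b = set(ids_a), set(ids_b)
    return {
        "both": a & b,      # flagged by both models
        "only_a": a - b,    # flagged by model A but missed by model B
        "only_b": b - a,    # flagged by model B but missed by model A
        "either": a | b,    # combined list from both models
    }

# Hypothetical customer IDs flagged as likely churners by each tool.
sas_churners = [101, 102, 103, 104]
aster_churners = [103, 104, 105]

result = compare_models(sas_churners, aster_churners)

print("found by both:      ", sorted(result["both"]))
print("found only by SAS:  ", sorted(result["only_a"]))
print("found only by Aster:", sorted(result["only_b"]))
print("combined list:      ", sorted(result["either"]))
```

The "either" set is the simplest way to combine two models. Whether to act on the union, the intersection, or a weighted vote depends on how costly a false positive is compared with a missed churner.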

5.  Check for Accuracy:  Aster allows you to look for false positives and false negatives, and to review model metadata to understand the accuracy and precision of models as they are built and as they decay over time.  A model is never one and done!  Models require constant tuning.  There are organic and inorganic catalysts that impact your models.  You also may not have selected the right algorithm to construct your models, so it is important to be able to test and tune for the best approach.  Aster comes with confusion matrix functions that let you test for accuracy and precision.  For more information about this please check out "Dealing with False Positives and False Negatives in Machine Learning".
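
For readers who have not worked with a confusion matrix, here is a small Python sketch of what it counts and how accuracy and precision fall out of it. This is not the Aster confusion matrix function; the labels and function names are invented for illustration, with 1 meaning "churned" and 0 meaning "stayed".

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary labels (1 or 0)."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["tp"] += 1   # predicted churn, did churn
        elif a == 0 and p == 1:
            counts["fp"] += 1   # predicted churn, did not churn (false positive)
        elif a == 1 and p == 0:
            counts["fn"] += 1   # missed a real churner (false negative)
        else:
            counts["tn"] += 1   # predicted stay, did stay
    return counts

def accuracy(c):
    """Share of all predictions that were correct."""
    total = c["tp"] + c["fp"] + c["fn"] + c["tn"]
    return (c["tp"] + c["tn"]) / total if total else 0.0

def precision(c):
    """Share of positive predictions that were actually positive."""
    flagged = c["tp"] + c["fp"]
    return c["tp"] / flagged if flagged else 0.0

# Hypothetical outcomes and model predictions for eight customers.
actual    = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

counts = confusion_counts(actual, predicted)
print(counts)                         # {'tp': 3, 'fp': 1, 'fn': 1, 'tn': 3}
print(f"accuracy:  {accuracy(counts):.2f}")   # 0.75
print(f"precision: {precision(counts):.2f}")  # 0.75
```

Rerunning the same counts on fresh data at regular intervals is one simple way to see a model decay: the false positive and false negative counts creep up even though the model itself has not changed.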

I am sure there are five more qualities that make up a good predictive model, so please feel free to contact me or provide feedback.