Aster Analytics 6.20 - What's New in this Release!

Teradata Employee

Some of you might be interested to know what is in the Aster 6.20 upgrade, so I thought I would share!

UI Enhancements

Statistics:

  • Outlier – Previously the function returned only the filtered data, with the outliers removed. It now returns both the filtered data and the outlier data. It also offers a new outlier detection method based on the median absolute deviation (MAD).
  • GLM – Now allows backward stepwise fitting, which is another method of determining the importance of input variables in predicting an outcome. We also allow intercepts to refine predictions.
  • Decision trees – Single decision trees now accept categorical input variables. Random Forest outputs now publish confidence intervals and variable importance.
  • Canopy – Previously this had a Java interface; it is now consistent with the SQL-MR function interface.
  • K-means – We now publish the points and associated centroids as part of the cluster output.
  • Gaussian Mixture Model – This is a new clustering technique in which a point can belong to multiple clusters, potentially identifying a new cluster. For example, points belonging to both “sports parent” and “empty nesters” can form a new segment called “sports grandparents”.
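
To make the MAD idea from the Outlier item concrete, here is a minimal Python sketch. It is not the Aster function or its interface. The function name `mad_outliers`, the 0.6745 scaling constant and the 3.5 cutoff are assumptions taken from the common "modified z-score" convention, and the readings are made up for illustration.

```python
from statistics import median

def mad_outliers(values, threshold=3.5):
    """Split values into (filtered, outliers) using the modified z-score.

    Assumption: 0.6745 and 3.5 follow a widely used convention; they are
    not taken from the Aster documentation.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median, so nothing can be flagged.
        return list(values), []
    filtered, outliers = [], []
    for v in values:
        score = 0.6745 * abs(v - med) / mad
        (outliers if score > threshold else filtered).append(v)
    return filtered, outliers

# Hypothetical readings with one obvious outlier.
readings = [10, 12, 11, 13, 12, 95, 11]
filtered, outliers = mad_outliers(readings)
print(filtered)  # [10, 12, 11, 13, 12, 11]
print(outliers)  # [95]
```

Returning both lists mirrors the enhancement described above: you get the cleaned data and the outliers side by side. Because the median and MAD are barely moved by the extreme value, the 95 stands out clearly.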

Enhancements to Existing Methods

Association Mining:

  • FPGrowth and Frequent Pattern Growth are two new output metrics for Collaborative filtering.

Text analytics enhancements:

  • The Naive Bayes Text function interface has been enhanced to include a model table.
  • Identity matching now leverages a machine learning based approach.
  • The text tagging function now supports advanced rules.

Graph analytics enhancements:

  • Improved performance of the Modularity algorithm
  • Gtree function creates a parent-child hierarchy of data sets via the graph engine

New analytic functions:

Interpolator – Standard time series data transformation function that interpolates and aggregates time series to different intervals. It uses a partition splitter, which can partition time series for parallel processing. The Burst function does the same for categorical values.
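
As a rough illustration of what interpolating a series onto a different interval means, here is a small Python sketch using plain linear interpolation. It is not the Aster Interpolator and says nothing about which interpolation methods that function supports. The name `interpolate`, the sample points and the step size are all assumptions, and the sketch expects at least two points with distinct timestamps.

```python
def interpolate(points, step):
    """Linearly interpolate (time, value) pairs onto a regular time grid."""
    points = sorted(points)
    result = []
    t = points[0][0]
    i = 0
    while t <= points[-1][0]:
        # Advance to the pair of observations that brackets time t.
        while points[i + 1][0] < t:
            i += 1
        (t0, v0), (t1, v1) = points[i], points[i + 1]
        result.append((t, v0 + (v1 - v0) * (t - t0) / (t1 - t0)))
        t += step
    return result

# Hypothetical irregular observations, resampled every 2 time units.
observed = [(0, 10.0), (4, 18.0), (10, 6.0)]
regular = interpolate(observed, 2)
print(regular)
# [(0, 10.0), (2, 14.0), (4, 18.0), (6, 14.0), (8, 10.0), (10, 6.0)]
```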

ARIMA – A class of time series techniques used mainly to forecast values in longitudinal data. Other techniques, such as regression models, have also been used to predict future values. The difference between regression and ARIMA is that in the former, the dependent variable’s value is predicted from a combination of other independent variables and their coefficients (e.g., the price of sugar is determined by yield per acre, rainfall amount, transportation costs, storage costs, etc.). ARIMA, however, uses the properties of the time series itself to determine what future values would look like (e.g., if the stock price has been 110, 125, 236, 179, 182, a new value of 197 can be predicted without considering other variables).

• ARIMA models are used in several retail scenarios, such as sales demand forecasting to predict customer demand over time and inventory forecasting to avoid shortfalls or surpluses. ARIMA can also be used to forecast resource needs (e.g., manpower or utilities) based on purchase patterns.

• Financial organizations use ARIMA for price forecasting of financial products or to understand individual stock price movements to predict future value.

• The manufacturing sector can forecast utilization of resources. Energy consumption can be forecast based on factors that change with time. Forecasting from historical data can now be done at scale using Aster.
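
To show what "using the series itself" looks like in practice, here is a deliberately tiny Python sketch of one member of the ARIMA family: an AR(1) model fitted by least squares to the first differences, i.e. ARIMA(1,1,0) without a constant. This is an assumption-laden toy, not how the Aster ARIMA function is implemented or called, and the name `forecast_next` is made up. Its output is illustrative only and is not expected to reproduce the figure quoted in the stock price example above.

```python
def forecast_next(series):
    """One-step-ahead forecast from an AR(1) fit on first differences."""
    # Differencing (the "I" in ARIMA) removes the level of the series.
    diffs = [b - a for a, b in zip(series, series[1:])]
    # Least-squares estimate of phi in d[t] = phi * d[t-1].
    num = sum(x * y for x, y in zip(diffs, diffs[1:]))
    den = sum(x * x for x in diffs[:-1])
    phi = num / den if den else 0.0
    # Predict the next difference and add it back to the last value.
    return series[-1] + phi * diffs[-1]

# The stock prices from the example above; no other variables are used.
prices = [110, 125, 236, 179, 182]
print(forecast_next(prices))
```

Note that the only input is the price history. A regression model would instead need the values of the independent variables for the period being predicted.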

Change point detection – Function to detect the change points in a stochastic process or time series.

• Manufacturers use change point detection to identify anomalies in readings from sensor data generated throughout the manufacturing processes. As anomalies are detected, immediate action can be taken to mitigate risk and cost that result from defective parts.

• IT departments in most organizations can use change point detection to identify anomalous patterns in server logs. This can help identify new issues that arise from unpredictable usage.

• Marketing departments can use change point detection to identify changes in customers’ online behavior due to life events or other reasons.
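
For a feel of what a change point is, here is a minimal Python sketch that looks for a single shift in the mean of a series. It is a simple least-squares split, not the algorithm used by the Aster function. The name `find_change_point` and the sensor readings are assumptions for illustration.

```python
def find_change_point(series):
    """Return the index that best splits the series into two constant-mean segments."""
    def sse(segment):
        # Sum of squared deviations from the segment's own mean.
        mean = sum(segment) / len(segment)
        return sum((x - mean) ** 2 for x in segment)

    best_index, best_cost = None, float("inf")
    for k in range(1, len(series)):
        cost = sse(series[:k]) + sse(series[k:])
        if cost < best_cost:
            best_index, best_cost = k, cost
    return best_index

# Hypothetical sensor readings that jump from about 20 to about 25.
sensor = [20.1, 19.8, 20.3, 20.0, 25.2, 24.9, 25.1, 25.0]
print(find_change_point(sensor))  # 4
```

This sketch always returns some split, even for a series with no real change. A practical method would also test whether the shift is statistically significant and handle multiple change points.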

COX-PH – Cox Proportional Hazard function to investigate the effect of several variables upon the time a specified event takes to happen.

• Failure time analysis for the biomedical device and automobile industries, such as time to failure or time to service for the robots in a manufacturing process.

• Clinical trial analysis in the pharmaceutical industry, to predict the effectiveness of a new drug based on trial data.

• Marketing attribution with survival analysis, to predict which attributes contribute to conversion based on the lifetime of clickstream events leading up to it.

• Customer lifetime value prediction with probability of conversion.
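
For readers who want to see the quantity a Cox model maximizes, here is a small Python sketch of the partial log-likelihood for a single covariate (Breslow form if event times are tied). It only evaluates the likelihood at a given coefficient and does not fit a model, and it is not the Aster COX-PH interface. The name `cox_partial_loglik` and the records are hypothetical.

```python
import math

def cox_partial_loglik(beta, records):
    """Cox partial log-likelihood for one covariate.

    records: list of (time, event, x), where event is 1 if the event was
    observed and 0 if the subject was censored.
    """
    total = 0.0
    for t_i, event, x_i in records:
        if not event:
            continue  # censored subjects contribute only through risk sets
        # Risk set: everyone still under observation at time t_i.
        risk = sum(math.exp(beta * x_j) for t_j, _, x_j in records if t_j >= t_i)
        total += beta * x_i - math.log(risk)
    return total

# Hypothetical data: subjects with x = 1 tend to fail earlier.
records = [(2, 1, 1.0), (3, 1, 1.0), (5, 1, 0.0), (7, 0, 1.0), (8, 1, 0.0)]
print(cox_partial_loglik(0.0, records))
print(cox_partial_loglik(0.5, records))
```

On this data the likelihood is higher at 0.5 than at 0, which is consistent with a positive coefficient: a higher value of the variable goes with a higher hazard and therefore a shorter time to the event.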

BENEFITS FOR YOU:

Benefits of this release can be summarized in two points: new time series features and user interface improvements.

1) New Time Series algorithms: With Teradata Aster Analytic Portfolio release 6.20, Aster provides a comprehensive set of time series analytics including forecasting, anomaly detection, survival analysis, and interpolation and extrapolation methods, in addition to the existing time series classification methods. With the new analytics, customers can enhance their analysis by incorporating time intervals and sequencing to better predict outcomes.

2) User Interface Improvements: Usability enhancements have made the Aster Analytics Portfolio easier to install and have simplified access control management. In addition, we’ve made error messages for analytics arguments consistent.

COMPATIBILITY:

Aster Analytics 6.20 is compatible with both Aster Database 6.10 and 6.00. Previous releases are not supported.

The Canopy function has a new SQL-MR-style interface that is consistent with other functions. It is not backward compatible.

The new time series functions (Interpolator, Burst, ARIMA, CoxPH and Change point detection) will be part of the Analytics Base package.