What's New in Aster Analytics 6.21 - This Is a Significant Update to Aster

Teradata Employee

Please note that the spring release of Aster Analytics 6.21 is a significant one. It introduces a large amount of new analytic functionality: a variety of new functions, changes to existing functions, and a real-time scoring SDK.

This blog post focuses on the new features and analytics in the Aster Analytics 6.21 release.

Requirements:

Aster Database 6.10.02 (AD6.10.02) or later

Aster Scoring SDK:

Aster Scoring SDK operates in your runtime environment on analytic models built in the Aster database. With Aster Scoring SDK, you can build complex analytic models using multi-genre Aster Analytics functions at massively parallel processing (MPP) scale and export them to real-time or near-real-time environments, where you can score incoming data using real-time streaming platforms or databases that support the Java Virtual Machine framework.

New Analytics:

TIME SERIES, PATH, and ATTRIBUTION ANALYSIS

Convergent Cross-Mapping

  • CCMPrepare. The CCMPrepare function adds a new partition column and partitions the data to prepare it for use with the CCM function.
  • CCM. The CCM function tests multiple causes and effects simultaneously, reporting an effect size for each cause-effect pair.

Shapelet Functions

  • UnsupervisedShapelet. The UnsupervisedShapelet function takes a set of time series and assigns them to clusters, based on the shapelets that it finds.
  • SupervisedShapeletTrainer. The SupervisedShapeletTrainer function takes a set of classified time series and outputs a model for classifying time series, based on the shapelets that it finds.
  • SupervisedShapeletClassifier. The SupervisedShapeletClassifier function takes a set of time series and assigns them to classes, based on the model output by SupervisedShapeletTrainer.
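
To make the shapelet idea more concrete, here is a minimal Python sketch of the building block these methods typically rest on: the distance between a candidate shapelet and a time series, taken as the smallest Euclidean distance over all same-length windows. This is not Aster code. The name shapelet_distance is hypothetical, and the use of plain (unnormalized) Euclidean distance is an assumption made for illustration.

```python
import math

def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any
    same-length window of the series (illustrative, not the Aster API)."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    # Slide the shapelet across the series and keep the closest match.
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = math.sqrt(sum((w - s) ** 2 for w, s in zip(window, shapelet)))
        best = min(best, dist)
    return best

series = [0.0, 0.1, 1.0, 2.0, 1.0, 0.1, 0.0]
peak = [1.0, 2.0, 1.0]
print(shapelet_distance(series, peak))  # 0.0, the peak occurs exactly in the series
```

A series that contains a shape close to the shapelet gets a small distance, so these distances can serve as features for clustering or classification.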

VARMAX. The VARMAX function extends the ARMA/ARIMA model to work with time series with multiple response variables (vector time series), as well as exogenous variables, or variables that are independent of the other variables in the system.

STATISTICAL ANALYSIS

CrossValidation. The CrossValidation function validates a model by assessing how the results of a statistical analysis will generalize to an independent data set.
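
As a rough illustration of what cross-validation does in general, the sketch below implements plain k-fold cross-validation in Python. It is an assumption-laden example, not the Aster CrossValidation function: the helper names (k_fold_indices, cross_validate, fit_mean, mse) are hypothetical, and the "model" is a toy that always predicts the mean of the training targets.

```python
def k_fold_indices(n_rows, k):
    """Split row indices 0..n_rows-1 into k contiguous folds of near-equal size."""
    if not 1 < k <= n_rows:
        raise ValueError("k must be between 2 and n_rows")
    base, extra = divmod(n_rows, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more row
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(xs, ys, k, fit, error):
    """Train on k-1 folds, measure error on the held-out fold, average over folds."""
    scores = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        scores.append(error(model, [xs[i] for i in held_out], [ys[i] for i in held_out]))
    return sum(scores) / k

# Toy model (hypothetical): always predict the mean of the training targets.
def fit_mean(xs, ys):
    return sum(ys) / len(ys)

def mse(model, xs, ys):
    return sum((y - model) ** 2 for y in ys) / len(ys)

xs = list(range(6))
ys = [2.0] * 6
print(cross_validate(xs, ys, 3, fit_mean, mse))  # 0.0, constant targets are predicted exactly
```

Because every row is held out exactly once, the averaged error is an estimate of how the model would behave on data it has not seen.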

DenseSVM Functions

  • DenseSVMTrainer. The DenseSVMTrainer function takes training data and builds a predictive linear or nonlinear model in binary format.
  • DenseSVMPredictor. The DenseSVMPredictor function uses the model to predict the class of each sample in a test data set.
  • DenseSVMModelPrinter. The DenseSVMModelPrinter function displays readable information about the model.

PCAPlot. The PCAPlot function takes the principal components output by the Principal Component Analysis (PCA) function and input data, changes the basis of the input data to the principal components, and outputs the result.
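
The change of basis itself is just a set of dot products. The Python sketch below shows the idea under stated assumptions: the function name project and the optional means argument are hypothetical, the components are assumed to be unit-length vectors, and how the Aster function handles centering or scaling is not described in this post.

```python
def project(rows, components, means=None):
    """Express each row in the basis of the given principal components.

    rows:       list of numeric rows (one list per observation)
    components: list of principal component vectors (assumed unit length)
    means:      optional per-column means to subtract first (an assumption)
    """
    if means is None:
        means = [0.0] * len(rows[0])
    projected = []
    for row in rows:
        centered = [x - m for x, m in zip(row, means)]
        # One output coordinate per component: the dot product with that component.
        projected.append([sum(c * x for c, x in zip(comp, centered))
                          for comp in components])
    return projected

# Two orthonormal components in 2-D (a rotated coordinate system).
components = [[0.6, 0.8], [-0.8, 0.6]]
rows = [[3.0, 4.0], [5.0, 0.0]]
for original, transformed in zip(rows, project(rows, components)):
    print(original, "->", transformed)
```

The first row lies almost entirely along the first component, which is the kind of structure a plot in principal-component coordinates is meant to reveal.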

RandomSample. The RandomSample function takes a data set and uses a specified sampling method to output one or more random samples. Each sample has exactly the number of rows specified. The RandomSample function is useful for generating test sets, training sets, and initial centers for clustering algorithms.
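
For readers who want to see the idea outside the database, here is a small Python sketch that draws several samples of exact sizes, with or without replacement. The function name random_samples and its parameters are hypothetical and do not mirror the Aster function's arguments.

```python
import random

def random_samples(rows, sizes, with_replacement=False, seed=None):
    """Return one sample per requested size; each has exactly that many rows."""
    rng = random.Random(seed)  # seeded generator for repeatable samples
    samples = []
    for size in sizes:
        if with_replacement:
            samples.append(rng.choices(rows, k=size))  # rows may repeat
        else:
            samples.append(rng.sample(rows, size))     # rows are distinct
    return samples

rows = list(range(100))
train, test = random_samples(rows, [70, 30], seed=42)
print(len(train), len(test))  # 70 30
```

Note that in this sketch the samples are drawn independently, so the two sets can overlap; a disjoint train/test split would need a single shuffle followed by slicing.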

CLUSTER ANALYSIS

  • KModes. The KModes function is an extension of the KMeans function that supports categorical data. KModes models are fit similarly to KMeans models. The core algorithm is an expectation-maximization algorithm that finds a locally optimal solution.
  • KModesPredict. The KModesPredict function is the prediction function that corresponds to the KModes function.
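
The k-modes idea can be sketched in a few lines of Python: distance is the number of mismatched attributes, and each cluster center is the per-attribute mode of its members. The function names k_modes and k_modes_predict, the caller-supplied initial modes, and the tie-breaking behavior are assumptions of this sketch, not a description of the Aster implementation.

```python
from collections import Counter

def hamming(a, b):
    """Number of attributes on which two categorical rows differ."""
    return sum(x != y for x, y in zip(a, b))

def k_modes_predict(row, modes):
    """Index of the closest mode (ties go to the lower index)."""
    return min(range(len(modes)), key=lambda j: hamming(row, modes[j]))

def k_modes(rows, initial_modes, max_iter=20):
    """Alternate assignment and mode-update steps until the modes stop changing."""
    modes = [tuple(m) for m in initial_modes]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each row joins its nearest mode.
        labels = [k_modes_predict(row, modes) for row in rows]
        # Update step: each mode becomes the most common value per attribute.
        new_modes = []
        for j, old in enumerate(modes):
            members = [row for row, lab in zip(rows, labels) if lab == j]
            if not members:
                new_modes.append(old)  # keep an empty cluster's mode unchanged
                continue
            new_modes.append(tuple(Counter(col).most_common(1)[0][0]
                                   for col in zip(*members)))
        if new_modes == modes:
            break
        modes = new_modes
    return modes, labels

rows = [("red", "small"), ("red", "medium"), ("blue", "large"),
        ("blue", "large"), ("red", "small")]
modes, labels = k_modes(rows, [rows[0], rows[2]])
print(modes)   # [('red', 'small'), ('blue', 'large')]
print(labels)  # [0, 0, 1, 1, 0]
```

As with k-means, the result depends on the initial centers, which is why the solution is only locally optimal.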

NEURAL NETWORKS

  • NeuralNet. The NeuralNet function uses backpropagation to train neural networks. The user provides input data and the argument settings that control training, and the function outputs the fitted weights of the neural network. The NeuralNet function is optimized for performance on very large datasets (millions of rows).
  • NeuralNetPredict. The NeuralNetPredict function predicts the output for arbitrary covariate inputs, using the weight table output by a trained neural network.
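
For anyone new to backpropagation, the following Python sketch trains a tiny network with one hidden layer on the logical OR function. It is a generic textbook illustration, not the Aster NeuralNet function: the names forward, train, and loss, the sigmoid activations, the squared-error loss, and all hyperparameters are assumptions chosen to keep the example short.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    """Each weight row is [bias, w1, w2, ...]; returns hidden activations and output."""
    hidden = [sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))
              for w in w_hidden]
    output = sigmoid(w_out[0] + sum(wi * hi for wi, hi in zip(w_out[1:], hidden)))
    return hidden, output

def train(data, n_hidden=3, epochs=2000, lr=0.5, seed=0):
    """Fit the weights with per-row gradient descent and backpropagation."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    for _ in range(epochs):
        for x, target in data:
            hidden, output = forward(x, w_hidden, w_out)
            # Error signal at the output (squared error through a sigmoid).
            delta_out = (output - target) * output * (1 - output)
            # Propagate the error back to the hidden units (uses pre-update weights).
            delta_hidden = [delta_out * w_out[j + 1] * h * (1 - h)
                            for j, h in enumerate(hidden)]
            w_out[0] -= lr * delta_out
            for j, h in enumerate(hidden):
                w_out[j + 1] -= lr * delta_out * h
            for j, d in enumerate(delta_hidden):
                w_hidden[j][0] -= lr * d
                for i, xi in enumerate(x):
                    w_hidden[j][i + 1] -= lr * d * xi
    return w_hidden, w_out

def loss(data, w_hidden, w_out):
    """Mean squared error of the network on the data."""
    return sum((forward(x, w_hidden, w_out)[1] - t) ** 2 for x, t in data) / len(data)

data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
untrained = train(data, epochs=0)   # same seed, so same starting weights
trained = train(data, epochs=2000)
print("loss before training:", loss(data, *untrained))
print("loss after training: ", loss(data, *trained))
```

The fitted weights returned by train play the role of a model table here, and forward is the prediction step applied to new inputs.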

ENSEMBLE METHODS

  • AdaBoost_Drive. The AdaBoost_Drive function takes a training data set and a single decision tree and uses adaptive boosting to produce a strong classifying model.
  • AdaBoost_Predict. The AdaBoost_Predict function applies the strong classifying model output by AdaBoost_Drive to a new data set.
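
Adaptive boosting is easier to follow with a toy example. The Python sketch below boosts one-level decision trees (stumps) on a single numeric feature with labels +1 and -1. It is a textbook-style illustration under stated assumptions: the function names best_stump, adaboost_train, and adaboost_predict are hypothetical, and the weak learner here is a stump rather than the configurable decision tree the Aster functions work with.

```python
import math

def best_stump(xs, ys, weights):
    """Find the threshold and polarity with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            preds = [polarity if x >= threshold else -polarity for x in xs]
            err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def adaboost_train(xs, ys, rounds):
    """Return a list of (alpha, threshold, polarity) weak learners."""
    n = len(xs)
    weights = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, threshold, polarity = best_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)    # guard against log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)  # vote strength of this stump
        model.append((alpha, threshold, polarity))
        preds = [polarity if x >= threshold else -polarity for x in xs]
        # Increase the weight of misclassified rows, decrease the rest, renormalize.
        weights = [w * math.exp(-alpha * y * p) for w, y, p in zip(weights, ys, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model

def adaboost_predict(model, x):
    """Weighted vote of all stumps."""
    score = sum(alpha * (polarity if x >= threshold else -polarity)
                for alpha, threshold, polarity in model)
    return 1 if score >= 0 else -1

# No single threshold separates these labels, but three boosted stumps do.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
model = adaboost_train(xs, ys, rounds=3)
print([adaboost_predict(model, x) for x in xs])  # [1, 1, -1, -1, 1, 1]
```

Each round concentrates on the rows the previous stumps got wrong, which is how a set of weak classifiers combines into a strong one.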

ASSOCIATION ANALYSIS

KNNRecommenderTrain and KNNRecommenderPredict. The KNNRecommenderTrain and KNNRecommenderPredict functions take a similar approach to WSRecommender, but attempt to increase prediction accuracy by adjusting for systematic biases and replacing heuristic calculations of similarity coefficients with a global optimization that simultaneously estimates all weights.

Aster Scoring SDK

• AMLGenerator. The AMLGenerator function transforms model data from Aster to an XML-based AML (Aster Model Language) format that is compatible with the real-time functionality.

• Scorer. The Scorer function provides a software framework to score input queries based on a given model and predictor.

The following real-time functions are currently supported by the SDK:

• Aster Scoring SDK CoxPH

• Aster Scoring SDK Extract Sentiment

• Aster Scoring SDK Generalized Linear Model

• Aster Scoring SDK LDAInference

• Aster Scoring SDK Naïve Bayes

• Aster Scoring SDK Naïve Bayes Text Classifier

• Aster Scoring SDK Random Forest

• Aster Scoring SDK Single Decision Tree

• Aster Scoring SDK SparseSVM

• Aster Scoring SDK Text Parser

• Aster Scoring SDK Text Tagging

• Aster Scoring SDK Text Tokenizer

Updates to Existing Functions

• ApproximatePercentile (accepts multiple target columns as input)

• ConfusionMatrix (includes the functionality that ConfusionMatrixPlot provided in earlier releases, and ConfusionMatrixPlot is deprecated)

• Correlation (new GroupByColumns argument)

• CoxPH (new Categorical_Columns argument, new summary table column category)

• CoxSurvFit (accepts categorical models)

• CoxPredict (accepts categorical models)

• FMeasure (new Classes argument)

• Forest_Drive (added seed control and ability to control the number of splits)

• Forest_Predict (specifying predictor variables is now optional, because the function can get them from the model; new Accumulate argument)

• GLMPredict (Family argument is now optional)

• GMMPredict (new Accumulate argument)

• Histogram (single function replaces Histogram_Map, Histogram_Reduce and the Enhanced Histogram functions Hist_Map, Hist_Reduce, and Hist_Prep)

• OutlierFilter (filters multiple columns, using either the same method for all columns or a specified method for each column)

• Principal Component Analysis (PCA) (new output columns var_proportion and cumulative_var)

• Single_Tree_Drive (new Weighted and WeightColumn arguments, new optional weight column in response table, new alternate single-table format)

• Unpack (new ColumnLength argument)

Other Mentions

• The installation instructions in the Aster Analytics Foundation User Guide have been updated.

• Most of the function examples in the Aster Analytics Foundation User Guide have been updated.

• The Logistic Predict function is deprecated; use the GLMPredict function.

• The Logistic Regression function is deprecated; use the GLM function.

• The SAX function is deprecated; use the SAX2 function.

• The Shapelet functions ShapeletMasker, ShapeletFrequencyFinder, ShapeletStrengthFinder, ShapeletFinder, and ShapeletClassifier are deprecated; use the new Shapelet functions SupervisedShapeletTrainer and SupervisedShapeletClassifier.

• The Visualization functions cFilterviz, nPathviz, and Visualizer are deprecated. However, this functionality is available with the Aster AppCenter product. For information about obtaining Aster AppCenter, contact your Teradata Aster Account Representative.