Getting to Know Teradata Warehouse Miner

Once you've installed the Evaluation Copy of Teradata Warehouse Miner (described in the article "Getting Started Installing and Configuring Teradata Warehouse Miner"), there are two ways to get to know it.  First, you can get started quickly by loading and viewing a supplied tutorial project.  Second, you can learn how to build a project from scratch, with specific examples of the following:

  • Create a Histogram Analysis
  • Create a Scatter Plot Diagram
  • Create a Decision Tree Model
  • Create a Decision Tree Score Table

Step Through an Existing Project

  • Click on the Add Existing Project icon on the toolbar or select File >> Add Existing Project… from the main menu.
  • Select one of the supplied Help Tutorials projects or Supplemental Tutorials projects by double-clicking on it or by highlighting it and clicking on the OK button.
  • Once the selected project has been added to the Project Work Space at the right side of the application form, you may view an analysis by double-clicking on it or by right-clicking and selecting the View option at the top of the right-click menu.
  • To execute an individual analysis, select the analysis in the Project Work Space and click on the Run button on the toolbar or right-click on the analysis and select the Run option on the right-click menu. (You may also simply press the F5 key on your keyboard after selecting the analysis to execute.)
  • To execute the entire project, select the project in the Project Work Space and click on the Run button on the toolbar or right-click on the project and select the Run option on the right-click menu. (You may also simply press the F5 key on your keyboard after selecting the project to execute.)
  • To view the results of an individual analysis, view the analysis and click on the RESULTS menu option near the top of the analysis window. Depending on the type of analysis, you may then select data, graph or sql on the sub-menu. 

Create a New Project

To create a new project and step through the demo below, click on the Add New Project icon on the toolbar or select File >> Add New Project on the main menu.

Create a Histogram Analysis:

A Histogram analysis is designed to study the distribution of continuous numeric values in columns. It’s often referred to as “binning” because it counts the occurrences of values in a series of numeric ranges called “bins”.
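To make the idea of binning concrete, here is a minimal Python sketch of equal-width binning. It is only an illustration of the concept, not how Teradata Warehouse Miner computes its bins (the product runs SQL in the database); the `histogram` function and the sample income values are hypothetical.

```python
def histogram(values, num_bins):
    """Count values in equal-width bins spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when every value is identical.
    width = (hi - lo) / num_bins or 1.0
    counts = [0] * num_bins
    for v in values:
        # The maximum value is placed in the last bin instead of overflowing.
        index = min(int((v - lo) / width), num_bins - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(num_bins + 1)]
    return edges, counts


# Hypothetical income values, for illustration only.
incomes = [12000, 18000, 25000, 31000, 47000, 52000, 52000, 88000, 90000, 112000]
edges, counts = histogram(incomes, 4)
for left, right, n in zip(edges, edges[1:], counts):
    print(f"{left:>9.0f} to {right:>9.0f}: {n}")
```

Each printed row corresponds to one bar of the histogram: a numeric range and the number of values that fall inside it.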

  • Click on the Add New Analysis icon in the toolbar, or from the menu item Project select “Add New Analysis” and then select category “Descriptive Statistics”.
  • Next click on the Histogram icon in the Add New Analysis window. Then click OK.
  • In the “Histogram 1 – Histogram” window click the pull-down select button to the right of “Available Tables” and select “twm_customer_analysis”.
  • In “Available Columns” select “income” and add it (using the “right-arrow”) to the “Histogram Columns”.
  • Select the “city_name” and add it to the “Overlay Columns”.
  • Select “age” and add it to the “Statistics Columns”.

  • Select the “expert options” tab. Type state_code in ('CA', 'NV') under “Optional WHERE clause text:”.
  • Now run the analysis by hitting “F5” or by clicking the Run button on the toolbar.
  • After a short time it will run the query and put the results under the “Results” tab.

  • Now click on the “Results” tab to expand the results.

  • In the results window click on the “graph” item. Then click on the “graph options” tab.  Note: Maximize the window and then select the “Show Overlay Counts” and “3D Graph” buttons.  Make sure that all data elements in the table are highlighted; otherwise the graph will show only the selected records.
  • Click back on the “graph” tab to see the graph in 3D.
  • You can rotate the 3D graph by dragging the scroll bars.
  • Click on the “SQL” tab under RESULTS to see the SQL generated for the histogram analysis.
  • Close the Histogram analysis window.  (To close an analysis when maximized, click on the X just under the X in the upper right corner of the application window, being careful not to close the application!)
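The analysis above combines a histogram column (income), an overlay column (city_name), a statistics column (age) and a WHERE filter on state_code. As a rough sketch of what such a query does, the following Python example runs a comparable aggregation in an in-memory SQLite database. This is an assumption-laden illustration: the SQL that Teradata Warehouse Miner actually generates (visible on the “SQL” tab) will differ, the bin width of 25000 is arbitrary, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE twm_customer_analysis "
    "(income INTEGER, age INTEGER, city_name TEXT, state_code TEXT)"
)
# Hypothetical rows, for illustration only.
rows = [
    (20000, 30, "Los Angeles", "CA"),
    (24000, 40, "Los Angeles", "CA"),
    (30000, 50, "Las Vegas", "NV"),
    (60000, 35, "Los Angeles", "CA"),
    (70000, 45, "New York", "NY"),  # excluded by the WHERE clause
    (10000, 25, "Las Vegas", "NV"),
]
conn.executemany("INSERT INTO twm_customer_analysis VALUES (?, ?, ?, ?)", rows)

BIN_WIDTH = 25000  # arbitrary bin width chosen for this sketch

query = """
SELECT CAST(income / ? AS INTEGER) AS bin,   -- histogram column, binned
       city_name,                            -- overlay column
       COUNT(*) AS n,                        -- count per bin and overlay value
       AVG(age) AS avg_age                   -- statistics column
FROM twm_customer_analysis
WHERE state_code IN ('CA', 'NV')             -- optional WHERE clause text
GROUP BY bin, city_name
ORDER BY bin, city_name
"""
result = conn.execute(query, (BIN_WIDTH,)).fetchall()
for bin_index, city, n, avg_age in result:
    print(bin_index, city, n, avg_age)
```

Note that an IN list separates its values with commas; each row of the output is one bin and overlay value pair, which is what the overlay counts in the graph represent.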

Create a Scatter Plot Diagram: 

Scatter plots are useful for identifying relationships and outliers across combinations of two or three variables.
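A scatter plot shows a relationship visually; one common numeric summary of a linear relationship between two variables is the Pearson correlation coefficient. The sketch below computes it in plain Python. It is offered only as background: the Scatter Plot analysis itself is not described as computing this value, and the `pearson` function and the sample balances are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical average checking and savings balances, for illustration only.
avg_ck_bal = [1, 2, 3, 4, 5]
avg_sv_bal = [2, 4, 5, 4, 5]
r = pearson(avg_ck_bal, avg_sv_bal)
print(f"correlation: {r:.3f}")
```

A value near 1 or -1 suggests points that fall close to a straight line in the plot, while a value near 0 suggests no linear pattern.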

  • Click on the Add New Analysis icon in the toolbar, or from the menu item Project select “Add New Analysis” and then select category “Descriptive Statistics”.
  • Next click on the “Scatter Plot” icon in the Add New Analysis window, and then click “OK”.
  • In the analysis window “Scatter Plot1 - Scatter Plot”, click the pull-down button next to “Available Tables” and select the “twm_customer_analysis” table.
  • Select the three average balance variables “avg_cc_bal”, “avg_ck_bal” and “avg_sv_bal” and add them, using the “right-arrow”, to the “Selected Columns” list.
  • Now run the analysis by hitting “F5” or by clicking the Run button on the toolbar. 

  • When available, click on the “Results” tab to expand the results.
  • In the results window click on the “graph options” tab. Select the third variable and move it into the “Selected Columns” list. Click back on the “graph” tab to see the graph in 3D.

  • You can rotate the 3D graph by dragging the scroll bars.
  • Close the Scatter Plot analysis window.

Create a Decision Tree Model:

Teradata Warehouse Miner provides decision trees for classification and regression models to predict an outcome (dependent variable) based on many predictors (independent variables). In this example, we’re going to use a decision tree analysis to predict credit card ownership.

  • From the menu item Project select “Add New Analysis” and then select category “Analytics”.
  • Next click on the “Decision Tree” icon in the Add New Analysis window and then click “OK”.
  • On the “Decision Tree1 - Decision Tree” input form, select the “twm_customer_analysis” table within the “Available Tables” list.
  • Select “ccacct” as the “Dependent Column”.
  • Select all except the customer ID and credit card related variables as independent variables. (That is, do not select: “cust_id”, “ccacct”, “avg_cc_bal”, “avg_cc_tran_amt”, “avg_cc_tran_cnt” or “cc_rev” as Independent Columns.) 

  • Select the “analysis parameters” tab. Select “Gain Ratio” for both the “Splitting Method” and “Pruning Method” options. Also set “Minimum Split Count”: 2, “Maximum Nodes”: 1000, “Maximum Depth”: 10, and “Bin Numeric Variables”: Disabled.  (Note that these are all default values except Maximum Depth, so it should be the only value that needs to be changed.)

  • Now run the analysis by hitting “F5” or by clicking the Run button on the toolbar.
  • You can follow the progress of the algorithm by watching the messages change in the Execution Status window at the bottom of the application window.  When the Status is "Complete", the RESULTS tab will be enabled.
  • Now click on the “RESULTS” tab to expand the results.
  • Next click on the “graphs” tab under the “RESULTS” tab.  You have the option of viewing the Tree Browser or Text Tree.  Click on both and view the results.
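For readers curious about the “Gain Ratio” splitting method chosen in the steps above, the following Python sketch shows the textbook (C4.5-style) definition: information gain divided by the split information of the candidate column. This is an assumption about the general technique, not a description of Teradata Warehouse Miner's internal implementation; the function names, the predictor columns `has_savings` and `owns_home`, and the sample rows are all hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Information gain of splitting on `attribute`, divided by split information."""
    labels = [row[target] for row in rows]
    # Group the target labels by each distinct value of the attribute.
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attribute], []).append(row[target])
    total = len(rows)
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy([row[attribute] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical rows; only "ccacct" is a column name taken from the tutorial.
rows = [
    {"has_savings": "Y", "owns_home": "Y", "ccacct": 1},
    {"has_savings": "Y", "owns_home": "N", "ccacct": 1},
    {"has_savings": "N", "owns_home": "Y", "ccacct": 0},
    {"has_savings": "N", "owns_home": "N", "ccacct": 0},
]
ratios = {col: gain_ratio(rows, col, "ccacct") for col in ("has_savings", "owns_home")}
best = max(ratios, key=ratios.get)
print(ratios, "-> split on", best)
```

In this toy data the `has_savings` column separates the two classes perfectly, so it receives the higher gain ratio and would be chosen for the split.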

Create a Decision Tree Score Table:

After building a Decision Tree model with Teradata Warehouse Miner, you can create a score table based on that model.

  • From the menu item Project select “Add New Analysis” and then select category “Scoring”. 
  • Click on “Tree Scoring” and then on “OK”.
  • Under “Available Tables”, select “twm_customer_analysis”. The cust_id column will automatically show up in Index Columns.
  • Select “Decision Tree1” under “Select Model Analysis” using the pull-down.
  • Select the “OUTPUT” tab and enter “dt_table” as the “Results Table Name”.

  • Now run the analysis by hitting “F5” or by clicking the Run button on the toolbar.
  • After a short time it will run the query and put the results under the “RESULTS” tab.
  • Now click on the “RESULTS” tab to expand the results.
  • Select “data” and then click “Load” to view results.
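Conceptually, scoring means walking each row down the tree until it reaches a leaf and recording the predicted value next to the row's key. The Python sketch below illustrates that idea with a tiny hand-written tree. It is not the model built in this tutorial and not how Teradata Warehouse Miner scores (the product generates SQL that runs in the database); the tree structure, thresholds and customer rows are hypothetical.

```python
# Hypothetical tree: internal nodes test one column against a threshold,
# leaves hold the predicted class for "ccacct".
tree = {
    "column": "income",
    "threshold": 30000,
    "left": {"predict": 0},
    "right": {
        "column": "age",
        "threshold": 25,
        "left": {"predict": 0},
        "right": {"predict": 1},
    },
}


def score(node, row):
    """Follow the tree from the root to a leaf and return its prediction."""
    while "predict" not in node:
        branch = "left" if row[node["column"]] < node["threshold"] else "right"
        node = node[branch]
    return node["predict"]


# Hypothetical customers, for illustration only.
customers = [
    {"cust_id": 1, "income": 20000, "age": 40},
    {"cust_id": 2, "income": 50000, "age": 22},
    {"cust_id": 3, "income": 50000, "age": 40},
]

# One output row per customer, keyed by cust_id like the index column above.
score_table = [(c["cust_id"], score(tree, c)) for c in customers]
for cust_id, prediction in score_table:
    print(cust_id, prediction)
```

The resulting list plays the role of the score table: one row per cust_id with the predicted value of the dependent column.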

To view the scoring SQL, perform the following steps:

  • Click the “OUTPUT” tab to return to that window.
  • Now select “Generate the SQL for this analysis, but do not execute it.”
  • Next select the “INPUT” tab and the “analysis parameters” tab under that, and then select “Score” as the Scoring Method.
  • Now run the analysis again by hitting “F5” or by clicking the Run button on the toolbar.
  • When execution is complete, view the generated SQL in the results window by selecting “RESULTS” and then the “SQL” tab.

  • To exit the Teradata Warehouse Miner application, select File >> Exit from the main menu.
  • Click “No” on the “Save Changes” dialog.
  • You’re Done!