Aster Environments, Part 1: Development - Test - Production: When Do I Need Separate Aster Environments?

Teradata Employee

Many clients have asked me over the years, "John, when do I need different Aster environments?  I started with a Discovery environment, and then one of our projects became repeatable and subject to service level agreements.  What do I do?"

These are great questions and great problems to have.  What it means is that an organization has realized value from its Aster environment and has operationalized an analytic!  Do you need three separate environments?  Maybe you do.  It depends on your organization's standards and practices, of course.  I have attempted to answer some of the process questions in another blog post.  You can read that by clicking here:  Aster Environments Whitepaper.   What I didn't answer there was which technology solutions are required.

This blog post will focus on the pros and cons of having one, two, or three Aster environments for development, test, and production.

First things first.  Let's discuss the difference between discovery and production.  From my simple point of view there are two types of projects:


  • Exploratory Discovery: a one-time project that may or may not be repeated or put into production in the future.
  • Perpetual Discovery: a repeatable analytic project with service level agreements.  This means that analytic models are constantly tweaked, improved, or redesigned based on model decay.


How are these projects different?


Exploratory:

  • Project Governance:  Rapid Analytic Development project that may not evolve into a repeatable production project.
  • Service Level Agreements:  Service level agreements are relaxed or nonexistent.
  • Infrastructure and IT Ops Process:  Discovery platform with relaxed standards and processes.


Perpetual:

  • Project Governance:  PMO-driven project with tight standards and practices that the enterprise relies on for a business purpose.
  • Service Level Agreements:  There are defined service level agreements for inbound and outbound data/analytic products.
  • Infrastructure and IT Ops Process:  Infrastructure and process are robust to provide adherence to service level agreements.
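
To make the distinction concrete, here is a minimal sketch of how I might tag a project as exploratory or perpetual.  The `Project` class, its fields, and the `classify` function are hypothetical names made up for this illustration; they are not part of Aster.  The rule simply restates the definitions above: a project is perpetual once it is both repeatable and under service level agreements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    """Hypothetical description of an analytic project."""
    name: str
    repeatable: bool  # will the analytic be run again on a regular basis?
    has_sla: bool     # is it under service level agreements?


def classify(project: Project) -> str:
    """Return 'perpetual' or 'exploratory' per the definitions above."""
    if project.repeatable and project.has_sla:
        return "perpetual"
    return "exploratory"


if __name__ == "__main__":
    # Project names are invented for the example.
    print(classify(Project("one_off_what_if_study", repeatable=False, has_sla=False)))
    print(classify(Project("weekly_scoring_run", repeatable=True, has_sla=True)))
```

A project that is repeatable but has no service level agreements yet stays exploratory in this sketch; your organization may draw that line differently.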


The following image will sum up how this looks across the three types of environments:


three.JPG

The diagram above denotes the types of behaviors I will have on the three types of environments.  This is a logical representation and not a physical one.  The next blog post will go into what the physical environments could look like.  For now let's look at all three:

Discovery:  This is the area for true data science and what-if types of analysis.  It allows for greater flexibility, ad-hoc activities, relaxation of standards, and a relaxed SDLC.  It also has relaxed security; but I warn you, if you have critical and sensitive data, that data still gets locked down.

Production:  This area is for tight control: control over data models, software release management, DBA process, tight security, scheduled activities, and strict standards.  There are no ad-hoc activities running on this platform; everything is deliberate and scheduled.  This allows me to size the environment for the workload.  Typically, when starting out, my discovery environment will be much larger than my production environment because my discovery environment has more random and potentially unknown workloads.   This is almost counter to everything you have heard about production environments, which are typically the largest of the three.  As a developer I will tell you that you should have a very robust production environment for any project of any type.  However, what most organizations fail to understand is that all development environments are really discovery environments, and they require a much more robust environment than production.   MIND BLOWN.  Why do you have some of your most expensive people developing solutions on underpowered hardware?

Test and Development:  These environments support software release management and source control for production environments.   Like production environments, they have SDLC adherence, standards and practices, and tight DBA control.  They support software promotion activities.
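
The three logical environments described above can be written down as a small policy table.  This is only a sketch: `EnvironmentPolicy`, `POLICIES`, and `can_run` are hypothetical names for this illustration, not Aster features.  I never said whether ad-hoc work is allowed on test and development, so that value is left as `None` and, as an assumption, treated conservatively.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EnvironmentPolicy:
    """Hypothetical summary of one logical environment."""
    strict_sdlc: bool
    strict_standards: bool
    adhoc_allowed: Optional[bool]  # None means the post does not say


POLICIES = {
    # Discovery: relaxed SDLC, relaxed standards, ad-hoc activities welcome.
    "discovery": EnvironmentPolicy(False, False, True),
    # Test and development: SDLC adherence, standards and practices.
    "test_dev": EnvironmentPolicy(True, True, None),
    # Production: strict standards, everything deliberate and scheduled.
    "production": EnvironmentPolicy(True, True, False),
}


def can_run(environment: str, scheduled: bool) -> bool:
    """Scheduled work may run anywhere; ad-hoc work only where explicitly allowed.

    Assumption: an unknown (None) ad-hoc setting is treated as 'not allowed'.
    """
    policy = POLICIES[environment]
    return scheduled or policy.adhoc_allowed is True


if __name__ == "__main__":
    for name in POLICIES:
        print(name, "ad-hoc allowed:", can_run(name, scheduled=False))
```

Because production only ever accepts scheduled work in this sketch, its workload is known in advance, which is what lets me size it for the workload.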

In the next blog post, Part 2, we will discuss the three physical environments that support Dev-Test-Production and discovery.