The Teradata Agile Analytic Cloud allows business users and developers to easily allocate an elastic data mart (i.e. a database) inside the Teradata System. Using an on demand self service portlet (the Elastic Marts Builder portlet), the user can create a database, upload comma separated values (.CSV) data, and begin analysis – all within 5-10 minutes.
The beauty of this approach lies in co-locating the elastic mart in the same system as the EDW. An elastic mart user can upload their data and join it to the EDW data, producing new reports quickly. The uploaded data might come from a business partner, an operational system not yet modeled in the EDW, a data mining result, etc. The power user can then join the new elastic mart table to EDW tables to discover new insights. This allows the power user to quickly join unrefined data with refined EDW data, exploring for new value, without having to go through a lengthy data governance project requiring additional ETL processes to get their data into the EDW, or worse, costly data duplication by extracting the data into a separate data mart.
This elegant and simple solution consists of:
The Elastic Marts Builder portlet allows end users and developers to quickly create and use a data mart inside the Teradata system in a self service manner. They can populate their marts with data from external sources that are not yet ready to be promoted into the EDW. This article focuses predominantly on this capability.
The Teradata Data Warehouse is the source of refined, reliable data that can be used with the elastic mart. This is what circumvents the need for additional ETL processing and the labor that goes with it.
TASM or Priority Scheduler allows the DBA to manage the workload on the EDW of the elastic marts. Power users typically exploit elastic marts using more sophisticated queries that could affect performance of the overall system. By utilizing the features in TASM these power users can be supported while still avoiding adverse effects for the rest of the users and applications.
The Teradata Agile Analytics Services is a Professional Services engagement that was crafted a few years ago as Teradata experts grappled with exploratory sand box analysis, SAS, data mining, etc. Adding Elastic Marts Builder to the existing methodology was a natural fit. More importantly, these experts can help the users focus best practices for establishing a culture of discovery.
There are hundreds of short term projects for elastic marts in every industry such as proof of concept tests, sand box exploration, data mining hypotheses, hurry up projects, new BI tools release testing, etc. A few examples are:
The database administrator has to do some simple tasks to enable the Teradata Agile Analytic Cloud. The process is covered in detail in the Elastic Marts Builder Installation Guide, and you can get community support over at the Cloud Computing forum. In short, these tasks are:
Once the environment is set up, the business user can be trained to use the self service elastic mart portlet. First the business user must add the Elastic Marts Builder portlet to their Viewpoint home page:
The business user can use this portlet to create an Elastic Mart. Once created, the then cycles through these steps for each new data set they wish to use:
In the first step, the elastic mart is named, perm space is allocated, and the Teradata System UserID associated with the mart. The elastic mart size is selectable from the drop down list. Remember that the userID in Viewpoint doesn’t always match the UserID in the Teradata System data dictionary so the portlet cannot assume the Viewpoint UserID is the same as the mart owner.
In this example, we grant our own UserID “demo_td1” full permission to the “demo_sandbox” elastic mart.
Next a list of all the tables in your elastic mart is displayed. Initially, the page will display “No Data Uploaded” since no tables have been built yet.
Next, browse your PC hard drive select a CSV (comma separated value) formatted data file to upload into your elastic mart. It is assumed there are column headers in the CSV file already. Columns default to VARCHAR(255) format.
The first few rows of data are displayed so the user can see if the CSV file upload was successful. By scrolling to the right, the user can set the data type for each field. Assuming everything looks as it should, the user should select the data type for the columns at this time.
Once data types are established on all the columns, the user should identify the primary index columns and specify the table name.
If the user is sufficiently skilled, they can switch to SQL mode and make changes to the actual DDL. Usually this is done to adjust the data type or default of a column, for example changing the VARCHAR(255) to VARCHAR(500).
Once the create table button is pressed, the table is created and loaded with the CSV data. The business user can repeat this process several times, uploading several tables from CSV files into the elastic mart.
The elastic mart is now ready to do agile analytics. The business user can point their favorite BI Tools – such as MicroStrategy, Business Objects, Cognos, SQL Assistant, etc. -- at the mart and perform analysis. Or they can use the SQL Scratchpad portlet from within Viewpoint itself (if it has been installed by their administrator).
Further features of the portlet are the ability of the user to manage their elastic mart from within the portlet. They are able to export data to CSV files and deleting tables when they are finished with the data.
Since one of the concerns many DBAs have about allowing users to upload undefined data into the EDW is data proliferation, the Elastic Marts Builder portlet contains rules for data governance that can be configured by the DBA. By default, the user will be notified once their data has been in the EDW for thirty days. They are required to “re-up” on that data otherwise it wil be automatically deleted.
Elastic marts are fundamentally free, but can enable your business users to discover valuable and new insights from your data. The Elastic Marts Builder portlet can be downloaded for free. The overall amount of data allocated to the Elastic Marts can be controlled by the DBA, but this space which is essentially a sunk cost, since you have already provisioned it for your data warehouse. In most cases, there are no additional ETL scripts or jobs.
The Teradata Agile Analytic Cloud helps control data mart proliferation. If users don’t have an agile analytic cloud, they will hide data marts on shadow systems throughout the organization, duplicating data to those systems, which ultimately becomes out of data or obsolete. If they embrace the analytic cloud instead, the DBA will know where all the data marts are, monitor their use, and be able to persuade the user to discard the ones that outlived their usefulness. Or if true business value is found, they can work with the business users to promote them into the EDW.
Where there is lasting business value, promote the mart into the EDW via the BI governance team. Promoting an elastic mart into the EDW often causes changes to the physical data model, new ETL jobs, and new data quality certifications.
Elastic marts should not be accessible by more than one user. The demand to share information must be balanced against the likelihood that users will proliferate permanent data marts. Letting two or more people access an elastic mart almost guarantees the mart will become entrenched as a permanent mart, a trap that’s hard to escape. Yield when you must but when the 3rd analyst starts asking for access, it’s a trap.
Similarly, elastic marts should be temporary. The DBA can set up a period for expiration when each elastic mart should be deleted or promoted into the EDW. Users need to be respect they cannot bypass all rigor and governance indefinitely.
If users are not allowed to use Insert(Select) or load utilities, you can guarantee that they rely primarily on refined data in the EDW. This prevents them from making copy after copy of EDW data and consuming all the perm space. They can work around these controls by downloading reports into CSV files then uploading, but the burden on the PC and the user inhibits this approach.
Correct workload management will be essential. The elastic mart workload is unknown and varies from the trivial to the unconstrained Cartesian product query. The user will make mistakes operating outside the normal query environment using unrefined data. It is good idea to constrain the overall elastic marts workload appropriately.
Always remember the agile environment needs minimalist controls. Emphasize allowing all business teams the freedom of choice to use any data and any tools. Let the users create, load, and drop their own tables but do it with minimum DBA support. Bypass rigorous rules in favor of innovation and collaboration.
First, users should be encouraged to explore, “fail fast”, recover, and explore some more. Like exploring a mysterious island, users who make mistakes are mapping the landscape, locating both snakes and golden nuggets. Intimate interaction with new data includes making mistakes. It also includes raw un-cleansed and poorly modeled data. This creates a need for writing SQL directly. Often the SQL is overly complex to compensate for the unrefined data.
Agile Analytics can allow your business users to quickly explore and discover new insights that can benefit your business. By enabling this type of analytics on your EDW system you can avoid costly data mart proliferation and data duplication, while taking advantage of a basically free resource. The Teradata Agile Analytics Cloud is comprised of software and methodologies that provide a best-in-class cloud platform for enabling agile analytics for your users. Start discovering today!
Finally, please note that while Teradata does not provide official support for the Elastic Marts Builder portlet, you can however talk to other users and get community help in the Cloud Computing forum.