Metadata implementation with Ab Initio EME

Tools
Tools covers the tools and utilities you use to work with Teradata and its supporting ecosystem. You'll find information on everything from the Teradata Eclipse plug-in to load/extract tools.
Teradata Employee

Metadata implementation with Ab Initio EME

Metadata is a much over-looked requirement of data warehousing, primarily since it's not straightforward; though this difficulty can be eased by use of appropriate tools, such as Ab Initio's EME. The following article looks at the various types of metadata, and the requirements and objectives each meets, then considers some benefits of, and methods for, using a) a metadata tool and b) EME, specifically.

Metadata are critical to a successful data warehouse, constituting the repository for technical and business information about the warehouse.

  • Business metadata includes descriptive information about the warehouse such as full names and synonyms for warehouse attributes and tables, business definitions, and the calculations used for derived attributes. Server-based business metadata allows this information to be accessed with any data access tool.
  • Technical metadata includes load statistics such as the last date the tables were updated, known data quality problems, performance statistics, etc. Technical metadata primarily serves the data warehouse support staff.

Metadata has three primary major goals:

Define:

Manage the components necessary to provide an understanding of the DW and the enterprise through its content.

Build and Administer:

Manage the components necessary to efficiently define, build and operate the DW. DW metadata includes the structure and contents of the operational data that is used to execute DW jobs and perform the audit, balance and control aspects of the DW environment. This goal requires well-defined metadata about both data and processes.

Navigate:

Manage the components necessary to enable self-service use of the DW that is efficient, accurate and repeatable.

Metadata Framework

The diagram below represents how and end-to-end (E2E) solution should integrate and present data, in the most usable way. Best practice is to store and view metadata in one place.

End-to-end implementation of the above metadata framework can be achieved using the Ab Initio EME / EMMS solution.

This solution integrates both technical and business metadata and enables you to grasp the entirety of your data processing — from operations to analytical systems.

Meta model implementation

The standard Ab Initio EME meta model needs to be expanded and created in the Teradata database; that way we can have one central data store for all metadata in the DWH platform.

An example metadata model to support this is shown below.

The base schema version of the Ab Initio EME needs to be expanded based on the above model, in order to support all business requirement of the DW platform.

The diagram below shows the base schema that is provided with Ab Initio EME.

In general terms the meta model should be able to provide the following information:

  • Who is using my data
  • How my data are used
  • What is happening to my data
  • Who has changed them
  • How I can use them

Additionally, best practice requires that the meta model should have a wider scope than the current DW, since DW projects can grow rapidly.

Lineage

Extending the EME Base model will provide the opportunity to leverage all technologies, and create dependencies between them. At this stage Teradata see the following sources of metadata information:

  • Ab Initio
  • Teradata RDBMS
  • Teradata SQL
  • Cognos or other BI Tools
  • ErWin (Logical Data Model)
  • Source Target Matrix

All of the above sources will have to be loaded and stored in the metadata repository, with all dependencies and relationships to each other. Additionally Ab Initio can parse Teradata SQL or BTEQ scripts automatically when they are executed using the “Execute SQL” component, which allows the use of Ab Initio as a primary execution engine for an 'extract, transform, load and transform' (ETLT) approach. Using EME, this does not require any manual process in terms of metadata integration, as they will be parsed automatically by the Ab Initio component.

This capability of Ab Initio will allow automatic maintenance and understanding of Teradata SQL jobs, in terms of what is happening with the data.

Dependency analysis is supported through SQL SELECT statements back to the original source field; a dataset object representing the original source table must be present in the data store. In order to perform dependency analysis on SELECT statements, the EME must capture a snapshot of the database schema for each of the tables that are referenced.

For example, consider the following SQL statement:

SELECT *

FROM customers, orders

WHERE customers.id = orders.customer_id;

To perform dependency analysis for this query, the EME has to know the names and data types for all columns in the CUSTOMERS and ORDERS tables. The GDE captures this information when the output DML for the SELECT statement is generated. For situations in which metadata was imported from the database through a connector graph, the development team will have to create dataset definitions for database tables. This can be done by use of the appropriate connector graph in the $AB_HOME/connectors directory of the GDE Component Organizer. You must set up each connector graph so that it loads tables and columns. Loading Keys and Indexes is optional. If you load Keys, they will be profiled when you run a profile job using the EME objects that you generate from the dataset definition. Note, however, that the Data Profiler ignores Indexes.

When the graph is analysed, this information is used to create table datasets in the data store catalogue. Analysis of the SQL statement by the EME then allows the GDE to link columns in the table datasets with fields in the output record format of the INPUT TABLE component.

Other sources:

Additionally we should load the following data from their sources, to support the end-to-end metadata approach

  • ERWin Communication Logical Data Model.
  • Source Target Matrix - with all business rules that will be stored as a result of prototyping and mapping interviews.
  • All ETL metadata coming from Ab Initio

This portfolio of information will provide all data that are needed to support data lineage requirements.

Ab Initio EME use

Ab Initio EME can be logically divided into:

  • Data Integration portion
  • User interface (access to the metadata information)

The DW project will have to define an Ab Initio EME extended schema which will be stored in the Ab Initio system, but since a particular implementation may be using several different technologies for ETLT, Business Intelligence and Business Glossary, the physical database representation will need to be maintained as well. This is why, in the above section, Teradata is proposing two expansions of the metadata schemas: Ab Initio base, plus additional expansion.

These two models will need to be synchronised, so that installations can leverage the reporting capabilities of the EME solution.

Ab Initio EME user access (Desktop Portal and Web Interface)

The Ab Initio EME Web Interface is a user application for browsing, editing, and mapping metadata in an EME data store. This interface is suitable for metadata architects and others responsible for managing metadata within DW organisation (for example data stewards).

The image below gives an example of how the web interface can present metadata related to database objects:

The EME Desktop Portal provides all the functionality of the EME Web Interface to Enterprise Metadata (using different interface), as well as the following additional futures:

  • A Table of Contents pane that gives users better control over browsing metadata and makes it easier to find specific objects in a datastore.
  • A Mapping Editor that allows users to easily define links between objects.

The following diagram illustrates how users interact with an EME data store’s metadata:

1 REPLY
Teradata Employee

Re: Metadata implementation with Ab Initio EME

Very nice article, Mike. Have you had an opportunity to compare Ab Initio's EME to Teradata Meta Data Services (MDS)? -Steve Long, Teradata Product Manager