Accessing Historical and Current Data with Unity Director

Unity
Unity is Teradata’s data synchronization and workload routing tool providing Active – Active database availability delivering near real-time RTO/RPO
Teradata Employee

Accessing Historical and Current Data with Unity Director

Unity Director is an extremely capable product that offers a wide variety of benefits. One of it's unique benefits is the ability to easily route users and requests to specific Teradata systems. There are many potential uses of this ability; but one use, in particular, is to selectively direct users that want access to historical data to specific systems where the data resides. 


Unity Director 14.10 (target release Q1 2014) will introduce full functionality for different reference depths on tables (or what is called "Table-depth"), allowing for users to define views to access different ranges of data (i.e. historical data) on particular systems.  Here's an example to put this into context - customers will be able to have three years of history on Sales table in System "A", and three months of history on the same Sales table in System "B".  Retaining this data within the same table can simplify ETL and reporting processes. As such, Unity Director will provide an easy way to retain historical data on a single (typically larger) Teradata system, while allowing for non-historical or more recent data to remain on a different or potentially smaller system.   

However, you don't need to wait until Unity Director 14.10 to start using this functionality. There is a way to leverage different table-depths in the current release of the product.  Here’s an example of how you can start creating system specific views today with Unity Director 14.00.

Creating the history and current data views

The follow example illustrates how to create a set of views, a currentYearView to display data from the last year only, and a historyView to display data from all the previous years. While the data supporting the currentYearView is kept on both systems (system 1 & system 2), the data for the historyView is retained only on system 2.

1.       The base data table, which will contain the current and historical records, is created on both systems:

create table basedata (id integer, ts timestamp);

2.       On both systems, create a current view (currentYearView) that will return only current data (in this case the rows from the last year).

create view currentYearView as
locking basedata for access
select * from basedata where ts between '2013-01-01 00:00:00' and '2013-12-31 23:59:59

3.       An additional table, called system2Only is also created only on system 2.This will help direct queries to the historical data on system 2:

create table system2only (id integer); -- And Empty table, with arbitrary columns

4.       Also on system 2, a history view (historyView) is created that joins to the system2Only table:

create view historyView as
locking basedata for access
select * from basedata
full outer join system2only on 1=0;

5.       Using the Unity Director Dictionary Scanner, create a dictionary for the database. Be sure to include the system2Only table, and historyView on system 2. Since the historyView depends on the system2Only table, it will automatically be available only on system 2. This is important, because Unity Director 14.00 will not otherwise allow the view to be selectively managed on a specific system.

 

6.       Historical data (records for 10 years in this case) should be populated directly on system 2:

insert into basedata values (1, CAST('20030503111111' AS TIMESTAMP(6) FORMAT 'YYYYMMDDhhmiss'));
insert into basedata values (2, CAST('20040503111111' AS TIMESTAMP(6) FORMAT 'YYYYMMDDhhmiss'));
insert into basedata values (3, CAST('20050503111111' AS TIMESTAMP(6) FORMAT 'YYYYMMDDhhmiss'));
insert into basedata values (4, CAST('20060503111111' AS TIMESTAMP(6) FORMAT 'YYYYMMDDhhmiss'));
insert into basedata values (5, CAST('20070503111111' AS TIMESTAMP(6) FORMAT 'YYYYMMDDhhmiss'));
insert into basedata values (6, CAST('20080503111111' AS TIMESTAMP(6) FORMAT 'YYYYMMDDhhmiss'));
insert into basedata values (7, CAST('20090503111111' AS TIMESTAMP(6) FORMAT 'YYYYMMDDhhmiss'));
insert into basedata values (8, CAST('2010503111111' AS TIMESTAMP(6) FORMAT 'YYYYMMDDhhmiss'));
insert into basedata values (9, CAST('20110503111111' AS TIMESTAMP(6) FORMAT 'YYYYMMDDhhmiss'));
insert into basedata values (10, CAST('20120503111111' AS TIMESTAMP(6) FORMAT 'YYYYMMDDhhmiss'));

select count(*) from basedata;

*** Query completed. One row found. One column returned.
*** Total elapsed time was 1 second.

Count(*)
-----------
10

 

7.       The data for the current year can now be populated via Unity Director, using the currentYearView:

insert into currentYearView (11, CAST('20130503111111' AS TIMESTAMP(6) FORMAT 'YYYYMMDDhhmiss'));
*** Insert completed. One row added.
*** Total elapsed time was 1 second.

select count(*) from currentYearView;
*** Query completed. One row found. One column returned.
*** Total elapsed time was 1 second.

Count(*)
-----------
1

 

8.       Selects to the history view will automatically route to System 2, and see the full data, because the system2only is only present on system 2:

select * from historyView order by ts;

*** Query completed. 10 rows found. 2 columns returned.
*** Total elapsed time was 1 second.

id ts
----------- --------------------------
1 2003-05-03 11:11:11.000000
2 2004-05-03 11:11:11.000000
3 2005-05-03 11:11:11.000000
4 2006-05-03 11:11:11.000000
5 2007-05-03 11:11:11.000000
6 2008-05-03 11:11:11.000000
7 2009-05-03 11:11:11.000000
9 2011-05-03 11:11:11.000000
10 2012-05-03 11:11:11.000000
11 2013-05-03 11:11:11.000000

Daily Modifying and Loading/Unloading of Current Data

Inserts of new current data should be done through Unity Director using the current data view (currentYearView) as shown in step 6 of the previous example. Unity Director will multicast these inserts to both Teradata systems.

Using the current data view reduces the risk that a user will unintentionally modify historical data that exists only on the second system, since the historical data is not available via this view:

delete currentYearView where id=2;

*** Delete completed. No rows removed.
*** Total elapsed time was 1 second.

Updates or deletes of current data can also be executed via Unity Director, similarly using the current data view to prevent touching historical data.

delete currentYearView where id=11;

*** Delete completed. One row removed.
*** Total elapsed time was 1 second.

In the event a user attempts to modify historical records through Unity Director that only exist in the second system, Unity Director will detect this as a data inconsistency and make one of the copies of the table offline.

Consider this ill-advised update to the base data table from a rogue user, accessing the basedata table directly:

update basedata set id=id;

*** Update completed. One row changed.
*** Total elapsed time was 1 second.

Since this update to the base data table affected 1 row on system 1, and 10 rows on system two, Unity Director detects the data inconsistency and takes one copy of the table offline, generating the following alert (some text removed for brevity - note the differing row counts):

Alert No.          : 99
Alert Code : 40054
Alert Description: An inconsistent response was detected from a Teradata system. As a result the object has been placed in the Unrecoverable state.
Alert Details : System 2 (db2) sent an inconsistent response, table dbtest.basedata is unrecoverable.

Response mismatch: session 1000, txn 376, request 13, log 11421536

SQL: 'update basedata set id=id;'

Response 1 from system 1 (db1): row count 1, hash 3010, mesh status 0, DRCP code 0 [sent to client]

Response 2 from system 2 (db2): row count 10, hash 30a0, mesh status 0, DRCP code 0

System 2 (db2) sent an inconsistent response, table dbtest.basedata is unrecoverable.

Which copy of the table (system 1 with only current data, or system 2 with the historical data) is taken off-line is not-deterministic. Since Unity Director is designed to protect the user against these data inconsistencies, the rogue user sees only 1 row affected, since system 1 was the first to respond in this case.

Unloading, Archiving or Modifying Historical Data

Since the historical data cannot be modified via Unity Director or Loader, ETL processes that modify that data must access the data directly on the second system. As with any process directly accessing a Unity Director managed table on a single Teradata system, these direct ETL processes must be managed appropriately to not interfere with any work load accessing the base table via Unity Director or Unity Loader. The base table should be halted in Unity Director before attempting to load or unload any historical data, and then recovered to return it to active service when the ETL process is complete.

unityadmin> object halt dbtest.basedata on db2;

The request is currently processing as operation number 25.
You may check its status using the command 'operation check 25'.

unityadmin> operation check 25;

Operation Number : 25
Operation Name : Halting Table
User : admin
User Name : Main Administration User
Progress (%) : 100
Status : Finished (1)
Start Time : 05/14 07:13:17
Finish Time : 05/14 07:13:18
Systems:

[2] db2 - Finished (1)

Updates:

05/14 07:13:18 [-] Info: Halting table dbtest.basedata
05/14 07:13:18 [-] Info: Requesting mgmt X lock on 'dbtest.basedata'
05/14 07:13:18 [-] Info: Mgmt X lock on 'dbtest.basedata' granted
05/14 07:13:18 [-] Info: Releasing mgmt X lock on 'dbtest.basedata'
05/14 07:13:18 [-] Info: Successfully halted table on 1 systems
05/14 07:13:18 [-] Info: Operation finished

unityadmin>

It's important to note that during this operation, the historical data will be unavailable to client applications via Unity Director. Clients attempting to read from the historical view will automatically hold until the table is returned to service.

 

BTEQ -- Enter your SQL request or BTEQ command:

select * from historyView order by ts;

…client application waits until the base table is returned to service, since it requires both the system2Only table, and the basedata table.

Once the base table is recovered:

unityadmin> object recover  dbtest.basedata on db2;

The request is currently processing as operation number 26.

…then the client request completes:

select * from historyView order by ts;

*** Query completed. 10 rows found. 2 columns returned.
*** Total elapsed time was 3 minutes and 49 seconds.

id ts
----------- --------------------------
1 2003-05-03 14:11:11.000000
2 2004-05-03 14:11:11.000000
3 2005-05-03 14:11:11.000000
4 2006-05-03 14:11:11.000000
5 2007-05-03 14:11:11.000000
6 2008-05-03 14:11:11.000000
7 2009-05-03 14:11:11.000000
9 2011-05-03 14:11:11.000000
10 2012-05-03 14:11:11.000000
11 2013-05-03 14:11:11.000000

BTEQ -- Enter your SQL request or BTEQ command:

Availability Considerations

As shown in the previous example, if the base data table is unavailable for any reason on the second system, the historical data will be unavailable. If the table is out-of-service or standby, any reads against the historyView will hold until it is returned to server.

Should the base data table become unrecoverable on the second system, then any queries against the historyView will fail with the following error:

select * from historyView order by ts;

*** Failure 4510 No systems available
*** Total elapsed time was 1 second.

Unity Director will also generate an alert when this occurs:

----------------------------------------------

Alert No.          : 98

Alert Code         : 40044

Alert Description  : There were no systems available to complete the request.

Alert Details      : No target found for session 1000

Alert Category     : Database Operations

Resource Type      : System

Resource ID        : u14s2

Alert Severity     : Critical

Alert State        : Opened

Repeated           : 0

Raised Time        : 05/14 07:30:29

Future Changes in Unity Director 14.10

The use of this system2Only table is only a temporary work around, necessary for the 14.00 version of Unity Director. Unity Director 14.10 will introduce the ability to select which systems views are managed on, eliminating the need for this table. This will eliminate the need to add the locking modifier for the historyView:

create view historyView as
locking basedata for access
select * from basedata

 

Instead, the Unity Director's Data Dictionary will explicitly allow the view to be managed on specific systems, without the dependency on the system2Only table:

 

Additional Resources

8 REPLIES
Enthusiast

Re: Accessing Historical and Current Data with Unity Director

Hi Paul,

I heard that, Unity Director does not currently integrate with TASM rules, and can encounter issues when some TASM rules are in use. Is this limitation lifted in upcoming versions?

And for ex: Assume on a prodcution box, there is a TPT utility throttle rule on utility sessions which allows only 20 sessions at a time and delays the 21st one for the batch jobs. In this case, assume there are almost 19 utility sessions are in use for the batch loads, only 1 is left over. And at the same time, if there is a Unity synchronization going on via TPTLOAD opertor (which need may be 2 or 3 tptload utility sessoins) to sync some data, in that case do we experiecne an impact on Database sessoins side for the batch job OR unity will wait and do not process (sync) any data until those 19 utility sessions are freed up?

And from your article i read that "During recovery, Unity Director will also attempt to open as many sessions as possible on a Teradata system. This may result in the number of TASM sessions being exceeded during recovery."

Can you please clarify what you mean by recovery, is that Unity recovery? and why it happens. I want to unerstand this becuase we should be prepare to tweak the TASm rules temporarily to faciliate the recovery.

Thanks,

Geeta.

Enthusiast

Re: Accessing Historical and Current Data with Unity Director

So Unity Director  syncs up tables in both systems only if both tables have same names?

Can periodical purge from partitioned tables be done? or if not possible , we need to write script ?

Teradata Employee

Re: Accessing Historical and Current Data with Unity Director

Hello Paul,

In Unity Director, assuming a Unity Server (Unity_A) manages 02 TD Systems (TD_1 & TD_2). When a new object (Table_1) is created through Unity, it appears in both TD_1 & TD_2. Same happens when another new object (Table_2) is created through Unity. Any Write request by default is redirected to all TD Systems managed by the Unity Director.

Initially, both these tables (Table_1 & Table_2) are assigned to "Unassigned" in Data Dictionary tab of Unity Configuration portlet. In "Unassigned", any Write operation through Unity Server is applied on both TD_1 & TD_2. When 01 Table (Table_1) is assigned to the Deployed Data Dictionary (DD_1) & the other table is allowed to remain in the "Unassigned", any Write requests is redirected to both the TD systems. Meaning, irrespective of object assignment to the deployed Data Dictionary (DD_1), any Write Requests are redirected to all the TD Systems containing the affected objects. This provided the Routing Rules allows Write to both TD Systems & Create [Preferred|Balanced] isn't used.

If the above is correct, when the Manual says "Unity Director defines the Teradata Database objects it manages in Unity Data Dictionary definitions" @ Page 25 (Unity Director User Guide 14.10) doesn't make sense as the Unity Director still changes the state of Table_1 & Table_2 from Active to Unrecoverable State when I bypass any DML operation directly to TD_1 or TD_2 in addition to redirecting any Write Request to both Systems as explained above. 

My Question is two-fold. Any explanation for the above behavior of Data Dictionary will be great. Also, if I continue to add new objects through Unity Director (Thereby creating on both TD Systems) without explicitly assigning the Deployed Data Dictionary to each of them, what is the consequence in the long term ? Because, as explained in the 3rd para in my query, Unity Director still manages the state of both Table_1 & Table_2 on both system.

Thanks in Advance,

Smarak

Teradata Employee

Re: Accessing Historical and Current Data with Unity Director

Yes, you are correct, all tables created via Unity Director will be created on all managed Teradata systems the user is routed to. Creating objects via Unity Director is generally a best practice, and saves the extra work of activating them that is required when you first create them directly on the Teradata systems, and then add them via the Data Dictionary scanner.

Tables created via Unity Director are added to the Data Dictionary, in the "Unassigned" Dictionary. You can relocate them later - or not. Data Dictionaries help you organize your deployed objects, but leaving them in the 'Unassigned' Dictionary has no long term consequnces.

Yes, Unity does still manage the state of the table. If you make DML changes directly on one system, and they cause a response mismatch with a different number of rows effected, it will cause one copy to become unrecoverable.

Teradata Employee

Re: Accessing Historical and Current Data with Unity Director

Hi Raja,

Sorry, I didn't see your question until now.

>So Unity Director  syncs up tables in both systems only if both tables have same names?

Yes, Unity Director syncs up tables and requires the names of both copies to be exactly the same.

>Can periodical purge from partitioned tables be done? or if not possible , we need to write script ?



Yes. If you wanted to purge historical data only from one system, you would (carefully) write a script that would run directly against only one of the teradata systems. You would need to be cautious not to change the set of data that was common across both systems that was being accessed via Unity.

Teradata Employee

Re: Accessing Historical and Current Data with Unity Director

I'm happy to point out that setting up a set of tables has now been simplified in Unity Director 14.10. You can now accomplish the same thing this article did, simply by creating a history view that exists only on one system, and then selecting to manage that view only on the system it exists on.

For a complete example, check out the upcoming Teradata Unity Director and Loader 14.11 Best Practices Orange Book. It should be out next month (DEC/14).

Teradata Employee

Re: Accessing Historical and Current Data with Unity Director

Hello Paul,

Thanks for your response. I would summarize your response & add few ascepts for clarification:

Considering that Director will manage the objects in "Unassigned" Data Dictionary, will it be correct if I say that Data Dictionary comes into play while selecting the Databases which Unity Director should manage. Because if a Database is selected for management by Director, all the objects created within that Database through Unity Director will be managed irrespective of their DD assignment. Yet, during DD Scan, any Object(Table,View etc) de-selected within a managed Database will NOT be managed, returning an Error "Object Not Found".

Thanks,

Smarak

Teradata Employee

Re: Accessing Historical and Current Data with Unity Director

Paul, I have 1 question each about CREATE Balanced & CREATE Preferred. 

(a) Question About CREATE Preferred:

Read A,B Write A,B Create Preferred = All Create Requests routed to A with failover to B.

For a 03 statements Transaction {CT T1, CT T2, SELECT FROM T1 & T2}, Ideal Scenario:

CT T1 = Goes to "A"

CT T2= Goes to "A" 

SELECT FROM T1 & T2=Goes to "A" (As underlying objects exists in A only)

If "A" fails after CT T1, then CT T2 goes to "B". Then, SELECT FROM T1 & T2 will never be successful as T1 & T2 doesn't exist in atleast 1 Systems listed in the Read routing rule. So, if 1 System FAILS during a CREATE Preferred Mechanism, then any READ Request will NEVER be successful. If my statement is true, is there any suggestion on how to avoid this besides not using the Preferred Routing Mechanism. 

(b) Question About CREATE Balanced:

For one Session, all Create Requests will be directed to only 1 System. Because, if the same transaction is submitted as above, then T1 is created @ A & T2 is created @ B with Read failing always. So, if 1 Session has 100 Create Requests with Balanced Routing, then all these 100 Create Statements will be created on say "A" with the next session having 10 Create Requests will be directed to "B" (Instead of "A" in case of Preferred) to ensure balanced environment. So, Create Balanced ~ Create Session Balanced as Sessions are balanced rather than Create Statements are balanced. Am I correct in my understanding of Balanced Routing Mechanism for Create.

Thanks in Advance,

Smarak