Unity Director has a wide set of capabilities to control where users and requests are routed. Unity Director 15.00 now offers so many routing choices that they can be a little confusing. Let's talk about some of the most common uses for the available routing options (and some cool ones you might not have considered yet).
First, to make things a little easier for you, Unity Director 15.00 now has a set of pre-defined routing rules with names that indicate what they are used for:
Unity Director 15.00 introduces a new routing mode called 'Passive routing'. If you're familiar with the older Query Director product, Passive routing closely matches the capabilities Query Director provided. Passive routing is ideal for reporting workloads that primarily read data and don’t require data synchronization. It can be used to route users to specific systems, or to load balance across them. Since passive routing has extremely low latency, it’s particularly useful for fast tactical queries to local Teradata systems. Passive routing is also simple and easy to administer. Passive sessions can automatically route around system-level outages, but are designed to ignore table-level outages. This gives the DBA better control over when reporting workloads (or users) are redirected off a system. This fact can be used to do some clever things, like reporting from a table on one system while loading to it on another, as I explain below.
Managed routing is the right choice for workloads that do require data synchronization, like ELT/ETL processes, or workloads that require guaranteed read consistency across systems, like some OLTP applications. All work done in a managed session is fully recoverable. This means it will be automatically replayed on any system in the event of a system outage. This protects against loss of data on a system during both planned and unplanned outages. Managed routing sessions will automatically route around table-level issues, such as a table that has been marked interrupted or unrecoverable because of a data inconsistency or request failure.
Managed sessions also provide a guaranteed transactionally consistent view of all Teradata systems, so if a read could have been affected by a preceding outstanding write, it will only be executed on the system where that write has been completed. Managed sessions will also automatically route individual requests around tables that have been taken out-of-service. Managed sessions do have greater response time overhead than passive routing sessions since they are intended for data synchronization and need to do more work. For normal DML requests, this fixed latency is very low. For some DDL requests, it can be higher.
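To make the read-consistency rule above concrete, here is a minimal sketch of the idea: a read is only eligible for systems that have already applied every write issued against the table. The `WriteTracker` class, its sequence numbers, and the system names are all hypothetical, invented for illustration. This is not how Unity Director is implemented internally.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class WriteTracker:
    """Hypothetical bookkeeping, not Unity Director's internal design."""
    systems: List[str]
    # table -> sequence number of the latest write issued against it
    issued: Dict[str, int] = field(default_factory=dict)
    # (system, table) -> sequence number of the latest write applied there
    applied: Dict[Tuple[str, str], int] = field(default_factory=dict)

    def issue_write(self, table: str) -> int:
        """Record that a new write to `table` is outstanding on every system."""
        self.issued[table] = self.issued.get(table, 0) + 1
        return self.issued[table]

    def complete_write(self, system: str, table: str, seq: int) -> None:
        """Record that `system` has finished applying write number `seq`."""
        key = (system, table)
        self.applied[key] = max(self.applied.get(key, 0), seq)

    def eligible_for_read(self, table: str) -> List[str]:
        """Systems on which a read of `table` would see all preceding writes."""
        needed = self.issued.get(table, 0)
        return [s for s in self.systems
                if self.applied.get((s, table), 0) >= needed]
```

In this sketch, a table with no outstanding writes can be read from any system. Once a write is issued, the table becomes readable only on the systems that have reported that write as complete.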
In Unity Director, user mappings are used to pick routing rules. A user mapping is a pattern that is used to match incoming user connections with a routing rule. User mappings can match:
It’s important to note that all of these rely on information passed from the client to the Unity server at connect time. A user can’t, for example, rely on the default account string or database profile from inside the Teradata system to select a routing rule, because the routing rule is chosen before they connect to the Teradata system to gain access to that information.
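As a rough illustration of how pattern-based mapping can work, the sketch below matches the user name and account string supplied at connect time against an ordered list of patterns. Everything here is an assumption made for the example: the `UserMapping` fields, the wildcard syntax (Python's `fnmatch`), the first-match-wins ordering, and the rule names. None of it is Unity Director's actual configuration syntax.

```python
from fnmatch import fnmatchcase
from typing import List, NamedTuple, Optional


class UserMapping(NamedTuple):
    """Hypothetical mapping record; field names are illustrative only."""
    user_pattern: str
    account_pattern: str
    routing_rule: str


# Example mappings, checked top to bottom (first match wins in this sketch).
MAPPINGS: List[UserMapping] = [
    UserMapping("etl_*", "*", "managed_etl"),
    UserMapping("*", "report*", "passive_reporting"),
    UserMapping("*", "*", "managed_default"),
]


def pick_routing_rule(user: str, account: str,
                      mappings: List[UserMapping] = MAPPINGS) -> Optional[str]:
    """Return the rule of the first mapping matching the connect-time values.

    Only values the client sends when connecting are used, because the rule
    must be chosen before any Teradata system has been contacted.
    """
    user, account = user.lower(), account.lower()
    for m in mappings:
        if (fnmatchcase(user, m.user_pattern)
                and fnmatchcase(account, m.account_pattern)):
            return m.routing_rule
    return None
```

With these example mappings, a service account such as `etl_nightly` lands on the managed ETL rule regardless of account string, while any other user who connects with an account string starting with `report` is routed passively.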
It's a best practice to consider how you are going to select routing rules ahead of time when setting up service accounts for applications. Ideally different applications should use different accounts, so they can be controlled independently and easily redirected. It’s also a best practice to have applications specify an account string on the client connect string for the same reason. This provides one more way to separate different clients and pick routing rules for them. For more on this topic, see my other article on phased rollouts of applications. http://forums.teradata.com/uda/articles/phased-rollouts-in-unity-director-loader
Workload balancing is one of the most straightforward and powerful benefits of Unity Director, and one that everyone can take advantage of. Unity Director now offers four ways to divide work across systems, two using Passive routing, and two using Managed routing. All of these options allow you to make better use of your existing hardware and reduce the burden on a single system from heavy reporting workloads.
Passive routing can divide reporting workloads across systems by selecting whichever system is the least used at the moment. This is a simple and efficient way to divide the number of reporting sessions across all systems and make the best use of existing hardware.
Passive routing can also divide reporting workloads across systems by alternating sessions to each system. This is a simple and efficient way to divide the number of reporting sessions across all systems in an equal manner. This is a good idea if you have two or more systems that are essentially identical and are located in the same data center.
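The two passive balancing strategies can be sketched in a few lines. This is only an illustration under stated assumptions: "least used" is taken to mean the fewest active sessions, ties are broken by system name, and the function and class names are made up for the example.

```python
from itertools import cycle
from typing import Dict, Iterable


def pick_least_used(session_counts: Dict[str, int]) -> str:
    """Pick the system with the fewest active sessions.

    Assumption: usage is measured by session count. Ties go to the
    alphabetically first system so the result is deterministic.
    """
    return min(sorted(session_counts), key=lambda s: session_counts[s])


class RoundRobin:
    """Alternate new sessions across systems in a fixed, repeating order."""

    def __init__(self, systems: Iterable[str]) -> None:
        self._cycle = cycle(list(systems))

    def next_system(self) -> str:
        return next(self._cycle)
```

Least-used selection adapts when sessions are long-lived or uneven in number, while alternation is predictable and needs no usage information at all, which suits identical systems sitting side by side.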
If you’re executing a reporting workload that relies on volatile or temporary tables and are using managed routing because you have a requirement for guaranteed data consistency (e.g. consistent reads, not ACCESS reads), you can use the CREATE BALANCED or CREATE PREFERRED option to split the workload across two or more systems.
By default, any time you do a read using the standard default or ETL routing rule, the read will automatically be sent to the system that has the fewest outstanding requests. This allows reads from workloads that require access to synchronized data to be balanced across all available systems.
Managed sessions not only route user sessions to a Teradata system; they can also route individual requests within a session. This is done automatically based on the availability of tables on the system. But you can also control this routing by adjusting the placement of tables or views, so some exist on both systems, while others exist only on one system. I’ve written about this feature before as a way to automatically route requests to historical data (http://developer.teradata.com/uda/articles/accessing-historical-and-current-data-with-unity-director).
This is an impressive capability, but also one that requires some operational experience to use effectively. This is because in order to process a write request, a user needs the correct routing rule to reach all the systems where the table is located. If a user can’t reach all the copies of a table when they are trying to write to it, they will see the error:
4511 - A mismatch has occurred in this request between where the object(s) exist and where they need to exist for the transaction or session. Check the object(s) used and the systems they are managed on.
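The condition behind this error boils down to a set comparison: every system that holds a copy of the table must be reachable through the user's routing rule. The sketch below shows that check. The function name and system names are hypothetical, and the real product evaluates this internally rather than exposing such a call.

```python
from typing import Iterable, List


def unreachable_copies(table_systems: Iterable[str],
                       rule_systems: Iterable[str]) -> List[str]:
    """Return the systems that hold the table but are not reachable.

    An empty list means the write can reach every copy. A non-empty list
    is the situation in which a 4511 mismatch would be expected.
    """
    return sorted(set(table_systems) - set(rule_systems))
```

For example, a table placed on `sys_a` and `sys_b`, written through a rule that reaches only `sys_a`, yields `["sys_b"]`. The same table written through a rule that reaches both systems yields an empty list.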
Another common operational error that sometimes occurs is accidentally assigning a load user to a reporting routing rule. When this happens, it’s normal to see the load process fail with 4511 errors, because it can only access one of the systems, when it needs access to all the systems to write to tables. To correct these issues, it’s important to not only correct the routing rule, but to also correct the placement of any error and log tables dynamically created by the load job.
Closely related to this topic, it's important to plan out how users that load data are going to interact with sets of tables that exist only on one system, and others that exist on both. Users that attempt to load data via managed sessions will sometimes have trouble switching between these. The reason for this trouble is that the managed session routing rule controls where the error and log tables are created. It can specify that new tables are created on both systems (the default) or on one system or the other (with CREATE BALANCED/PREFERRED). Since load utilities dynamically create tables, it's important that the routing rule creates these tables where the target table being loaded exists. Consequently, with a single user, you can’t switch between loading a table that exists on a single system using a load utility (like FastLoad or MultiLoad) and loading a different table that exists on both without changing the routing rule the user is mapped to. If you don’t switch the rules, a 4511 error usually results.
One of my favorite uses for routing is to access the Teradata system that is nearest to the client. This is an ideal setup for users that are doing tactical reads that require a very fast response time and have multiple Teradata systems spread across wide geographical regions. By accessing the system nearest to them using a passive routing session, they can avoid all the latency involved in cross-site communication. Passive routing does not require any communication between Unity servers in order to process requests, so this provides a very fast way to send requests to the local system, while still providing the ability to work off other systems during an outage.
Using a region-based user mapping is the natural choice for this type of configuration. This allows you to easily select a PREFERRED routing rule for the local system based on which Unity server a client has connected to.
For passive routing rules, PREFERRED rules use the ‘First available’ option, since they will connect to the systems in a specified order and take a connection from the first system that is available. You can change the order of the systems in your rule definition so that the search for first available starts at a different system.
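Here is a small sketch of the 'First available' idea: walk the systems in the order the rule lists them and take the first one that is up. The region names, system names, and the `is_up` callback are assumptions for illustration, not product configuration.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical per-region rules: each lists the local system first.
RULE_ORDER: Dict[str, List[str]] = {
    "east": ["sys_east", "sys_west"],
    "west": ["sys_west", "sys_east"],
}


def first_available(ordered_systems: List[str],
                    is_up: Callable[[str], bool]) -> Optional[str]:
    """Return the first system in rule order that is currently available."""
    for system in ordered_systems:
        if is_up(system):
            return system
    return None  # no system in the rule is available
```

A client connecting in the east region gets `sys_east` while it is up and falls over to `sys_west` during an outage. Reversing the list for the west region gives clients there the mirror-image behavior.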
If you have more than two data centers, you can even use a third or fourth Unity server to act as an additional gateway in those data centers. This allows each Unity server to be configured to access the local Teradata system, while protecting against an outage in each of the regions.
Another cool trick that passive routing enables is allowing reporting users to access a table while it's being loaded using FastLoad, MultiLoad or another bulk load utility. This trick relies on the fact that there are two or more copies of the table and Unity Director/Loader has the ability to hold off writes on one copy while the other is being loaded. As long as reporting users do not require fully transactional read consistency, they can use this technique to maintain access to their data while the load operation is going on. Here are the steps:
Unity Director's routing features provide you with a set of powerful tools that can be used to tackle almost any multisystem challenge. How will you use them?