Oracle Replication
For Decision Support At IU

6/1999, Mike Riley



Introduction
Overview
Applications
Setup & Implementation
Evaluation

Introduction

This document takes a quick gander at Oracle data replication, including a high-level overview, possible applications in Datawarehousing at IU, setup & implementation, and finally a high-level evaluation of "goodness of fit".

Oracle provides several mechanisms to support replication between Oracle Database sites. These mechanisms provide replication for database objects such as tables, views, database triggers, packages, indexes, and even synonyms. This document addresses only the mechanisms that support data replication: Multi-master Replication (MMR), and Materialized Views. Materialized Views are covered in detail since they apply most directly to Data-warehousing applications. Although Oracle's Procedural Replication can be used to support data replication, it is only briefly mentioned here due to its complex, low-level nature.

Hyperlinks in this document link to online documentation on Oracle's Tech-Net web site at: http://technet.oracle.com, which requires a registered username and password. You may obtain a username and password by selecting the "Membership" link from the home page of the site.

The material in this document was gleaned from the following sources:

  1. Database Replication (Oracle8i Concepts manual, chap 34),
  2. Data Warehousing with Materialized Views (Oracle8i Tuning, manual, chap 28),
  3. Snapshot Concepts & Architecture, (Oracle8i Replication manual, chap 3), and
  4. Directly Create Snapshot Environment , (Oracle8i Replication manual, chap 5),
  5. Replication API Reference manual.
Other Replication references:

Overview

Multi-master Replication

Oracle's Multi-master Replication (MMR) allows multiple sites, acting as equal peers, to maintain replicated "master" copies of database objects. Each participating site maintains a complete updateable copy of the replicated object (defined in a "Replication Group" - discussed below), where replication among the participating sites may be specified as either Synchronous, or Non-Synchronous. Synchronous Replication updates are performed within the bounds of the originating transaction via. Two-phase commits. Non-Synchronous replication updates are propagated sometime after the originating transaction, and are termed "deferred transactions". Mechanisms exist for handling conflicts that may occur when using Non-Synchronous, multi-site updates.

Because it is not permissible, nor desirable to allow updates to base-table OLTP data from within a Data-warehouse (DW), and because DW applications often require only subsets of the base-table data, MMR is not a good fit for general usage in a DW architecture. For this reason, this type of replication is not discussed in any further detail here.

Materialized Views

MVs provide a distributed, point-in-time copy, of one or more remote "master" tables, and may include row restrictions, column selections, aggregations, and sub-selections. MVs support updateable and/or read-only copies that may be based on Synchronous or Non-Synchronous replication. In addition, Non-Synchronous replication may be scheduled, or on-demand. Concerning replication, MVs are useful for:

...the first two of which are of particular interest for DW applications.

The term MV is used synonymously with the term Snapshot in Oracle documentation. Indeed, even the PL/SQL command: "create materialized view..." can be used interchangeably with the command: "create snapshot...". Oracle has apparently chosen to replace the term Snapshot with MV in all future documentation releases (see "Note" in Replication Objects, Groups, and Sites, Oracle8i Tuning manual, chap 28).

Also, note that since MVs are updated through an efficient batch process, they are more efficient than MMR.

Updateable MVs

Always based on a single "master" table, updateable MVs may be defined as all or a subset of the base-table's rows and columns. Although there must be a one-to-one correspondence between the updateable MV and the table it is based upon, SQL subselects may be used in the definition of the MV to qualify the participating rows.

 Read-only MVs

Read-only MVs can support simple or complex views of the "master" table(s) on which it is defined. A complex MV is one that contains: aggregates, joins, set operations (e.g. sum, count), or a CONNECT BY clause.

Procedural Replication

Oracle's Procedural Replication replicates simple stored procedure calls to distributed sites, and can be used for data replication. Essentially, a simple Remote Procedure Call, Procedural Replication is a "programming focused" solution. It should be noted, though, that Procedural Replication might be useful for minimizing network data transfer if the data (or even a large portion of it) used by the replicated procedure call is pre-available at the replication site (probably a rare circumstance in reality, but possible, I suppose).

 Replication Tools

Oracle provides the following tools for managing replication objects:

MVs - More than Replication

Besides replication, MVs are also useful for pre-calculated joins and aggregations, even within the same Oracle database (although, this is not a requirement). Similar to traditional indexes, Oracle's query-optimizer automatically (transparently) rewrites a query against base tables to use a MV where possible, avoiding the overhead time required to perform the actual join or aggregation in realtime, thus significantly reducing query response time. Such MVs are defined by the DBA, and then created and maintained by Oracle. MVs may be queried directly, but Oracle recommends against doing so to allow the DBA freedom to drop/create/change MVs without affecting the related [query] applications - much like an index. Any number of joins as well as aggregates may be used; the underlying disadvantage is the cpu time and disk-space required to maintain the MV.

Materialized Views Classes

There are three classes of MVs in Oracle: Rowid, Primary Key, and Complex. Rowid MVs are provided only for backward compatibility with Oracle7, are based on a physical row identifier assigned internally by Oracle, and are not recommended for Oracle8+ implmentations. Primary Key MVs are the usual (default) type of MV. Oracle maintains Primary Key MVs by logging the primary-key of each row affected by all DML statements against the base table upon which the MV is defined, and then at some later point-in-time, using the log to replicate the affected rows to the MV.

A MV is classed as Complex if it contains: an aggregate (count, sum,...), a distinct clause, a set operation (UNION, INTERSECT, MINUS,...), or a connect-by clause, or if it does not meet certain specific subquery restrictions (see Table 3-1, "Restrictions for Snapshots with Subqueries"). It is most important to note that complex MVs cannot be incrementally refreshed ("fast refreshed") by Oracle, but rather must be completely refreshed (rebuilt) when a new "point-in-time-view" is required (/triggered).

Examples:

  1. A simple updateable Primary Key MV:

  2. CREATE MATERIALIZED VIEW sales.customers FOR UPDATE AS
      SELECT * FROM sales.customers@dbs1.acme.com;
  3. A fast-refresh (i.e. incremental) Primary Key MV based on sub-selects:

  4. CREATE MATERIALIZED VIEW sales.orders REFRESH FAST AS
        SELECT * FROM sales.orders@dbs1.acme.com o
        WHERE EXISTS
             (SELECT 1 FROM sales.customers@dbs1.acme.com c
             WHERE o.c_id = c.c_id AND zip = 19555);
     
  5. A Rowid MV:

  6. CREATE MATERIALIZED VIEW sales.customers REFRESH WITH ROWID AS
      SELECT * FROM sales.customers@dbs1.acme.com;
         
  7. A Complex MV:

  8. CREATE MATERIALIZED VIEW scott.snap_employees AS
        SELECT emp.empno, emp.ename FROM scott.emp@dbs1.acme.com
        UNION ALL
        SELECT new_emp.empno, new_emp.ename FROM scott.new_emp@dbs1.acme.com;
Primary Key MV Architecture

An Oracle Replication architecture consists of: a base-table, a trigger, and an update log, at the "master site", and a MV, a trigger, and update log at the snapshot site. A Readonly MV does not have/need a log or trigger at the "snapshot site", and the log and trigger at the Master site are only needed if fast (incremental) refresh is employed. See Figure 3-5 Snapshot ReplicationMechanisms for a pictorial view.

As mentioned above, Oracle maintains Primary Key MVs by logging the primary-key of the affected row of the table upon which the MV is defined. This logging is performed by an internal trigger on the master base-table that fires each time a row is added, updated, or deleted; the Primary Key of the affected row, and any "filter columns" are written to a log file named MLOG$_master_table_name. If the MV is updateable, an internal trigger is defined on it also, and the primary key and filter column of the affected row is written to a corresponding local log file named USLOG$_Materialized_View_Name.

A MV has at least one index, I_SNAP$_Materialized_View_Name, corresponding to the Primary Key. Additional indexes may be created to support fast refreshes that involve subqueries.

Certain subqueries of updateable MVs require "filter columns", in addition to the Primary Key columns of the base table.  See "Using Filter Columns" for more information.

Some datatypes, such as LONG, BFILES, and user defined datatypes are not supported by MVs, and there are some restrictions on the use of LOBs. See Datatype considerations.

Replication Group Mechanisms

An Oracle Replication Group (RG) construct facilitates the administration of a defined set of replication objects. Typically, RGs are used to group together objects by application area. A RG may have objects from several DB schemas, and a DB schema may have objects in several RGs, but a replication object can be a member of only one group. While MMR and updateable MV objects must participate in an RG, Read-only MVs are not required to, but often do (by choice) for administrative convenience.

An RG at a site used for MVs ( Snapshots) is referred to in the Oracle Replication manual as a Snapshot Group, and a MMR RG is referred to as a Master Group. While an MMR site must contain all objects that participate in the Master Group (see Fig 34-1, Oracle8I Replication manual), a Snapshot (MV) site may contain all or a subset of the objects that participate in a Snapshot Group (see Fig 34-2, Oracle8I Replication manual). Multiple Snapshot Groups can be defined on a target Master Group.

Refresh Groups can be used to keep a set of replication objects synchronized to a specific transactionally-consistent point-in-time. preserving referential data integrity. MVs from multiple Snapshot Groups can participate in a Refresh Group (see Figure 3-8, Oracle8I Replication manual). Replication performance efficiencies can be gained by grouping multiple replication objects into a single Refresh Group; up to 400 MVs can be contained in a single Refresh Group. Using muliple Refresh Groups for a given RG can introduce referential inconsistencies.

Refresh Mechanisms

Three different methods are employed by Oracle for refreshing MVs: Complete Refresh, Fast Refresh, Force Refresh. Complete Refresh executes the complete defining query of the MV to rebuild it. Fast Refresh replicates only newly affected rows, using the primary key stored in the update log; updateable MVs logs are next processed and applied to the master table, and then afterwards, when the master table update log is processed, updates are pushed from there to other/all dependent MVs. Force Refresh attempts a Fast Refresh first, and if unsuccessful, performs a Complete Refresh. A refresh may be scheduled via. a specific "refresh interval", or performed on-demand.


Applications

Listed below are possible applications of Oracle Replication to Datawarehousing at IU:
  1. Continuously replicated copies of OLTP tables in the DSS environment (e.g. to replace the current Sybase replication used in the Budget & Student Systems).
  2. Nightly, incrementally refreshed copies of OLTP tables in the DSS environment (e.g. to incrementally refresh large datagroups).
  3. Nightly, fully refreshed copies of OLTP tables in the DSS environment (e.g. to replace current Sybase-to-Sybase "bcp out/in" type of functionality).
  4. Automatic datagroup (re)builds.

Setup & Implementation


The following steps are required to setup Replication:

  1. Modify the ora.init file at each maser/snapshot site (see Before You Start, Replication API manual),
  2. Set up the master and/or snapshot site(s) for replication,
  3. Create the "master" and/or snapshot replication group(s),
  4. Create a Deployment Template (for on-demand, disconnected MVs).
Using PL/SQL

A nice, simple flowchart depicting these steps with hyper-links to sub-diagrams, their substeps, etc., is given in Figure 1-1. The lowest level sub-steps in the example charts are hyper-linked to example PL/SQL statements, including example calls to the Replication API. Chapter 7, "Manage Replicated Environment with APIs", in the Replication API manual, contains a good description of the tasks required to manage a replication environment. See also Chapter 8, "Replication Management API Reference", Chapter 9, "Data Dictionary Views", and "Advanced Management of Master and Snapshot Groups".

Going GUI

The following sections of the Oracle8I Replication manual provide an overview of how to use GUI-based Replication Manager to accomplish these same tasks: Prepare for Snapshots, Create a Snapshot Log, Create Snapshot Environment.


Evaluation

This section is an attempt to characterize Oracle Replication's "goodness of fit" for the pontential applications outlined above.  First, general Pros and Cons of Oracle Replication are listed, and then "Application-Specific" comments follow.  All evaluative comments address only read-only Materialized View replication (see the comments in the "Overview" section relative to other Oracle replication mechanisms).
General Pros:
General Cons:
Application-Specific Analysis:
  1. Application:  Continuously replicated copies of OLTP tables in the DSS environment.

  2. Implemented by:  Read-only Materialized Views, "Fast Refreshed" continuously (or frequently).
    Pros: Cons: Comments:   May be a viable option if used in a limited way, i.e. for replicating a limited number of tables that incur a relatively low volume of transactions.  Initial (or "occassional") complete refresh of large tables may be expensive (note: one user reported avoiding "very costly" complete refreshes by resync'g replicated tables using an undocumented procedure!).
     
  3. Application: Nightly, incrementally refreshed copies of OLTP tables in the DSS environment.

  4. Implemented by:  read-only, Materialized Views, "Fast Refreshed" each night.
    Pros: Cons: Comments:   Same as above.
     
  5. Application:  Nightly, full refreshed copies of OLTP tables in the DSS environment

  6. Implemented by:  read-only, Materialized Views, "Complete Refreshed" each night.
    Pros: Cons: Comments:  May be a viable option if used in a limited way, i.e. for replicating a limited number of tables that are relatively small in size.  It is currently unkown whether or not it is more costly than using Oracle Export/Import (my guess is that export/import are much more efficient).
     
  7. Application:  Automatic datagroup (re)builds.

  8. Mechanism:  read-only, "Complex" Materialized Views, "Complete Refreshed" on demand.
    Pros: Cons: Comments:  May be a viable option if used in a limited way, i.e. for replicating a limited number of tables that are relatively small in size.
Unknowns: