Data Distribution

Distributed Database Techniques

Larry Wakeman

 


Preface

The dream of distributed database technology has been in existence for a long time without the possibility of implementation occurring.  The goal of this dream was to locate data through out an enterprise where it was needed without excessive duplication of data.  In theory, this would include having multiple database instances acting as a complete database.  Tables could be located in a specific instance in the enterprise or it could be split up over several instances by either having specific rows in specific instances or having a portion of the rows in one instance while the rest of the rows are in other instances.  While all of this is going on, referential integrity is maintained and the ability to join data from these different databases in a single query is accomplished.

The closest that commercial products have been able to achieve is data replication.  In this technology, multiple databases have copies of the data.  Each database in the enterprise is logically independent of the others but through the replication process has data from the other databases.  Data replication places limits on how the data can be manipulated in that only the database that creates a row can modify that row.  All other databases in the enterprise can only query that row and not modify it.  This limitation does not make data replication useless.  Common uses of this technique would include a roll-up database for enterprise wide reporting as well as the distribution of common data such as price lists to all of the databases that require this information.

This text describes a method that may be employed to provide a distributed database environment.  This technique allows data to be made available at any location in the enterprise and to make that data modifiable at all locations as well as providing a mechanism to propagate those updates throughout the enterprise.  A goal of this approach is to achieve these goals without modifying the applications that are running on the databases.

This work is derived from a project that I participated in at Applied Materials (AMAT) many years ago. As the Architect of the data distribution piece of the “Seldomly Connected Architecture”, I had to deal with many of the problems that I discuss in this book.  I also worked for Nortel Networks in the Clarify Division which had it’s own distributed database technology called “Traveler”.  The methods that I am discussing here are not necessarily the same methods that we used at AMAT and at Nortel.

Why Data Distribution

Why implement a Distributed Database?

There are several reasons to implement a distributed database environment such as:

Data Locality

Data locality is the ability to place data near the users. In large enterprises, the work force is located in many locations around the world.  In my latest application area, Customer Relationship Management (CRM), these enterprises could have Call Centers in many locations around the world.  If they had to connect to a database located at a central site, some of the users would suffer from long delay times due to telecommunication delays. Providing a local database for the users, the performance of the system is improved for the remote locations and because of this performance improvement customer satisfaction is greatly improved.

I know you are thinking that a database replication method as provided by many database vendors could achieve the same means. This is true as long as you live with a serious restriction that these replication products have which is that they only support a single write instance of a database row.  This means that if a trouble call is logged in Sydney, it can’t be updated outside of Sydney.  In the CRM space, this will cause two common practices to be prevented, Follow-The-Sun (FTS) support and Escalation. In FTS, if a call is logged in Sydney, when the Sydney office closes, the call is moved to another Call Center such as Geneva for continued support. With Escalation, when the Call Center personnel determine that they can’t resolve the issue, they will escalate the call to a higher level of support or maybe to the engineering staff.

Mobile Workforce

Many people in these large enterprises spend most of their time away from an office where they could run an application and perform their work.  The salesperson may spend most of his time at customer sites, taking orders, providing technical data and other tasks.  It is not always
feasible to connect a laptop to a phone line and dial into the corporate network to enter orders or literature requests.  Even if this was possible, the telecommunications latency could cause serious performance problems.  With a database in the laptop, the salesperson can enter orders and literature requests into his database and at a later time, dial into the network and send these items to the other databases in the networks. Also imagine if the salesperson has an up to date price list on the laptop.  Quotes can be created and edited in an interactive fashion and printed locally.

The example of the Field Service Technician is also interesting in this example.  The technician can update the database on the laptop and have the trouble calls that they are to work on the laptop.  As these call are worked, the tech can update the calls, order parts and close the calls.  At a later time, these call updates will be updated in the other databases.

Data Subsets

One issue with the use of laptops is that they usually don’t have the capacity to hold a complete application database. In this instance a means to provide only the useful information from the database will make it possible to have a copy of the application running on the laptop.  Limiting the data to the customers that the laptop user supports could do this. It could also be done by limiting the reference data such as parts lists to those of interest to the user.  If a salesperson sells printers and supplies, they would not be interested in information on systems and peripherals.

Another reason for data subsets is data security.  At AMAT, the application was installed at the customer sites. These customers didn’t want other customers to be able to see any information on.  This can be achieved by using data subsets.

Data Distribution Summary

How is Distribution Accomplished

Architecture

The techniques described in this book have some specific architectural requirements.  The first is that there is a central database that is the collection point and arbiter for all updates as well as being the distribution point for database updates.  Updates to the other databases in the enterprise are considered provisional until the updated record is returned to the remote database from the central database. Conversely, any update that is made to the central database can override any pending updates from the remote databases.


Figure 1
Data Distribution Architecture

The database update processes occur at discrete times and are not continuous.  They are instituted by the remote databases and not by the central database. This will allow mobile databases to not be continuously connected to the network.

Any update that is applied to the central database must not overwrite another update.  This means that the row that is to be updated must have been the row that is currently in the central database.  The methodology for doing this will be discussed further in the section on Central Database Updates.

All rows in the database will have a row version. This will be discussed in the section on Central Database Updates.

Remote databases will connect to the central database using a guaranteed delivery protocol such as TCP. The process will be performed in a series of synchronized steps. These steps are:

Each of these steps will be discussed following the section on Data Update Detection.

Data Update Detection

This part of the process is the only part that will actually be done by the applications.  There are two basic methods for detecting updates, marking the rows with an update flag or writing the updates to an update table. If the database supports triggers, this can be done by the database itself.  If the database doesn’t support triggers, the application will have to be modified to perform this function. 

There are two methods for using a database update table. The first would be to create a table that contains columns for a table id or table name, primary key value, the date/time of the update and the database
transaction id of the update transaction.  The second method would be to have the trigger or application code create a row in the update table that has columns for the table id or table name, primary key value, date/time of the update, the transaction id and the update to be sent to the central database.

Data Update Detection must also be performed in the central database.  If update marking is performed, the marking is done in the central database by updating the row version.  If a separate update table is maintained, the row version is still updated and the row version is added to the update table.

Data Update Collection

This is the first step of the synchronization process. The synchronization process will query the database to collect the updates that have been made since the last synchronization and place them in a file to be submitted to the central database. If updates are collected in an update table, the process reads this table
and writes the formatted updates to a file. If only the key values of the updated records are stored, this would also involve the reading of the data records and formatting the results.

If updated records were marked with an update flag, the process would have to read all of the required tables in the database to find the rows that have been updated.  Care must be taken to avoid attempting to order these updates so that any referential constraints in the database are not violated. One method of doing this would be to flag the updated rows with the transaction id of the update transaction when the rows are updated. Another method that has been used is to order the table scan so that updates to required data is read before the data that is required is read. This would require that the process know what order the tables are to be scanned.

Other data should also be placed in the update file. The schema version of the database should be provided. This is to insure that the data is compatible with the other databases in the enterprise.  It is hoped that the application that uses the database has a method for reading the schema version from the database that I have found to be the rule. If the application supports customization of the schema, the user-defined portion of the schema version must be maintained in a consistent manner.

The update should also indicate what the latest update was to the local database.  This is to create an update that will synchronize the local database with the central database without having row updates sent that were already submitted.

Update Submission

This is the second step in the synchronization process. This step involves sending the update to the central database server and then notifying the central database server that the update is there.

Using ftp to send the data to the central database server and then connecting to the server through a known socket and sending a message with the filename of the update could do this. It is advised that the enterprise use the trusted host facility of TCP/IP to insure that only those systems that are part of the enterprise be allowed to connect to the ftp directory and the communications socket. The filename should be made up of the hostname and an update identifier from the remote system to insure that the remote systems will not overwrite their own or other remote systems updates.

Central Database Updates

The update process on the central database server opens the specified update file and verifies that the schema version of the remote database is compatible with the schema version of the central database. If this is not the case, the update process will notify the remote system and close the connection.  The remote synchronization process should alert its manager that the database schema versions are incompatible.  In this event, the manager should determine if any application and database updates are required.

The update process will now apply the updates to the central database.  The updates will only be completed if the primary key value and the row version of the update record and the record in the database are the same. Note that these updates will create updates in the database update table or mark the rows as being updated.  The updated rows will also be provided with a new row version.

Completed Update Collection

The update process will now use the last update information from the remote database update file and collect the pertinent rows from the database.  As with the remote database update collection, this is done by reading the database update table, the database update table and the updated data from the database or by scanning the database.

Update Reception

The file is now sent to the remote database system using either a known directory or by reading the location from the socket or the update file.  When transmission is complete, the remote system is notified through the socket that the update is available and what the file name is.  The socket can now be closed.

Remote Data Update

The synchronization process applies the updates to the remote database.  When this is done, the update table in the database is cleared of the updates that were sent to the central database.

Collision Detection

A collision would occur when to databases attempt to update the same row in the database.  The principle that is applied is that the first update to the central database will win and super-cede later updates to that row if they don’t have the proper row version.  The alternative is called Last in Wins.  The problem with last in wins is that old data can be applied to the central database losing valid updates.  Consider the case of the laptop user.  An FST is working on a trouble ticket and goes on vacation but doesn’t run the synchronization process on the laptop.  Another FST takes over the call, resolves the issue and closes the trouble ticket. 
The first FST returns from vacation, performs the synchronization process and likely will overwrite the second FST’s data. The trouble ticket could be re-opened in the database.

When the update is applied to the central database, the row version of the row submitted by the remote database must be the same as the row version in the central database. Two databases that synchronized at similar times would have the same row version in their database. The first update to the central database would update the row version when it is applied and the second update would fail. The user that applied the failed update should be notified that the update failed and be provided with a mechanism to reapply the update.

Collision detection can occur at two places in the process. The first would be at the central database update.  If an update fails this would be detected by the number of rows updated being zero, the update process could format a collision record for the completed update file or when the updates are applied to the remote database, the updates can be checked to determine if there was a collision. The update from the central database would be applied to the remote database and the update record would be flagged as being a collision or the original record would be moved to a collision table.

Delete Processing

Fortunately, most applications do not allow the deletion of data from the database.  If deletes are allowed, the processing to support deletes can be very tricky.

The first concern is what to do if the delete fails. If the application does not perform cascaded deletes, then the row that would have been deleted would be reapplied to the remote database. This should be considered a collision and has to be detected but the remote database would be complete.

The real problem occurs when cascaded deletes are supported.  In this situation, the user deletes a row from the database and all of the related rows are deleted. In the CRM space a trouble ticket would likely have several status change notes, user notes and other rows that are related back to the trouble call. If a user deletes the trouble call, the cascading delete will also delete these other records.  In this situation, the update file will contain a delete. When the central database is updated, this delete will delete the trouble call and all of the related records.  Consider if a user on another remote database updated one of the related records. 
When the parent is deleted, that update is deleted which violates the principle of First in Wins.

The solution is to record the primary key and row version values of the parent and related rows in the remote database and verify that all these rows exist at the specified row version and that there are not other related rows in the central database.  If this is not the case, the remote database has to be notified that the delete failed and the data restored in the database. This requires that the rows either be marked as having been deleted without actually deleting them or saving the data in a pending delete table. Marking the rows as having been deleted would be easier for the update processing, but the application may have to be modified to not display these records.

The other concern is when a remote database updates a record.  The update in the central database will fail with no records updated or a referential integrity constraint violation.  The remote database should then receive the deletes in it’s next update file from the central database but also needs to know that the update failed so that the offending row can be deleted and a collision detected.

Network Topology

The following figure illustrates the network topology of the distributed data environment in the enterprise.


Figure Data Distribution Architecture

The techniques described in this text require that a Central Database exists as the arbiter and final authority for all updates to all of the databases. If the network were made up of peer systems, each transmitting updates to other systems, controlling the propagation of updates would be difficult and the process would be fraught with errors.


Figure Peer Network Example

One scenario is illustrated in the above figure. In this scenario, each system is responsible for transmitting updates to the other systems. As long as this happens in a timely manner, everything is fine. There is a window during which an error could be introduced. Assume that both System A and System B update the same row in their local databases and prepare update transmission. System A transmits it’s updates to the other systems. Before that update can be applied on System B, it sends out it’s update and then the update from System A in processed. System A and B have different contents in the updated row and Systems C and D have indeterminate contents based on the order they receive the updates. This window can be aggravated if either System A or B is inaccessible while the updating process occurs. Since this is the technique that is used by Data Distributors, it can be seen why they limit row updating to the system that created the row.

This illustrates the need for a central arbiter to determine which update is the correct one and to insure that the correct data is stored in all of the databases in the enterprise. This brings us to some of the precepts that have to be determined before any hope of implementing a distributed database can be achieved.

Distribution Precepts

The basic precepts we will discuss are:

Row Versions

A row version is a mechanism to assign a version value to a row in the database. One of the methods is to have an integer value that is set to one when the row is created in the central database and incremented when the row is updated.

A similar method is to maintain a monotonically increasing counter in the database and to assign its value to the row value in the central database when the row is created or updated. The advantage that this method has is that the updates can be ordered when they are prepared for transmission. The disadvantage of this method is that there are a finite number of updates that can be applied before the row version counter overflows. With a unsigned long integer, this value is 4,294,967,296.

Another method would be to store the date/time of the update. The DATE datatype in most databases stores the number of seconds from the database epoch date, a date the developers agreed would be prior to any date that may need to be stored in a database. If the date/time value is used as the row version, care must be taken to insure that the system clock on the central database server is consistent, possibly by synchronizing the system time to a government maintained standard.

Primary Key Assignments

Provision must be made to prevent primary key collisions. There are two basic techniques that can be used to solve this problem, Allocated Primary Key ranges or distribution of Primary Key Pools.

Allocating Primary Key ranges is to basically divide the primary key values based on some database identifier. Clarify used this method before the inception of Traveler by placing a numeric database id in the high order bits of the long integer primary keys. The assigned primary key value would be (Database ID * 2^28) + next primary key value. The method that was used at AMAT was to have two longword values comprise the primary key, putting the database id in the first longword value and the generated value in the second. This can feasibly be done with character primary key values by pre-pending a database identifier to the front the assigned primary key value.

Traveler used a method that distributed Primary Key Pools to the remote databases. Each database was initially allocated two pools for each table in the database and when a pool was exhausted, the remote database would request a new pool.

Collisions and Collision Resolution

A collision is when a single logical row is updated by two or more databases at virtually the same time. One of these updates must be determined to be the correct update and the others discarded by the central database. A simple collision resolution method would be to require that the row version of the row in the central database be the same as the row version in the update row from the remote database. If the values aren’t equal, the update is thrown away. This would establish that the first update received by the central database would be the one applied, known as “First in Wins.”

Closed Loop Update Process

The data in the central database is always correct. For a remote database to complete an update to a row, that update is sent to the central database, it is applied and then when updates are returned from the central database, the update is included, confirming the update in the remote database. This does not mean that the updated data must make the round trip before it can be seen in the remote database. If every update had to be applied and returned by the central database, performance would be abysmal. Updates are applied to the remote databases as they occur but the synchronization process can overwrite them when the loop is closed.

Collision Detection

The user must be notified that a row that they modified was not applied to the database because of a collision.

This can be done by trapping the collision on the central database server and notifying the remote database server that the collision occurred or by detecting that the row received from the central database was not the same row that was sent.

The mechanics of this process will be discussed later.

Remote Database Row States

We will now examine some of the basics of the distributed database synchronization process. We will look at the states that database rows can be in as well as the state of updates that are submitted to the central database.

Remote Database Row States

While rows in the remote databases can be in any of several states, the rows in the central database can only be in a single state. Data in the central database is always the correct data. Updates to the remote databases do not become correct until the update has be sent to the central database server, updated in the central database and returned to the remote database and applied.


Row State Diagram

The states that a row in the remote database can be are:

Update States

The following diagram describes the states and the state transitions for the remote database synchronization process.


Update State Diagram

The states that the synchronous process can be in are:

Detection Methodologies

Detection Methodologies

Update Capture

Updates to both the remote and central databases must be detected so that they can be collected and sent to the other databases.  This section does not deal with the deletion of rows from the database.  That subject will be handled in the Delete Processing section.  The methods that can be used include row marking and update tables.

A single row table called DIST$CONTROL_TABLE that has a column named MASTER_DB and UPDATE_TX_ID is used to control the update capture.

Control Table Creation and loading

If MASTER_DB is TRUE, the database that the triggers are running on is the central database.  UPDATE_TX_ID is used to control processing in the triggers such as

Row Marking

The row marking technique would add two columns to distributed tables.  The first column would be a row status field that would contain a status indicator for the row update state and the second would contain the transaction id for either the last or first update of the row.  In the remote databases, the transaction id field would be cleared when the row is updated from the central database.

The following is the SQL that could be used to add these columns to a table as well as a row version column:

Row Marking Alter Table Command - Oracle

Update Tables

An alternative method would be to have an update table.  This could be in one of two forms, a table that would hold pointers to the updated rows or a table that contains formatted update information.  Both of these tables would have columns for the table name or identifier, primary key value, row status and transaction id.  If formatted data is to be included, a column would be included for that data.

The following SQL are examples of how to create this table.  The first example points to the actual data that has been updated.

Update Table Creation – Oracle

The following example is for an update table that includes the update information formatted for transmission.

Update Table with Update Data Creation – Oracle

Trigger Based Detection

If feasible, the actual detection of updates should be done in a database trigger.  The trigger code would either update the row status and transaction id columns of the update row or would populate the update.  The following is an example of trigger code for row marking for an update.

Row Marking Trigger Update– Oracle

The following is an example of trigger code for row marking for an insert.

Row Marking Trigger Insert– Oracle

The following is an example of trigger code using an update table for an update.

Update Table Trigger Update– Oracle

The following is an example of trigger code using an update table for an insert.

Update Table Trigger Insert– Oracle

The following is an example of trigger code using a formatted update table for an update.

Formatted Update Table Trigger Update– Oracle

The following is an example of trigger code using a formatted update table for an insert.

Formatted Update Table Trigger Insert– Oracle

These triggers rely on a few stored procedures.  LOCAL_TRANSACTION_ID is in the Oracle provided package DBMS_TRANSACTION.  The procedures DIST$NEW_ STATUS and DIST$NEXT_ROW_VERSION would be created for the application.  The following is an example of the DIST$NEW_STATUS function:

DIST$NEW_VERSION Stored Procedure – Oracle

The following examples are for the function DIST$NEXT_ROW_VERSION.  You will note that these examples return the old row version except if the database is the central database.  In the first example, the current row version is incremented.

DIST$NEXT_ROW_VERSION Stored Procedure - Oracle

The following is an example of the same function that returns the next value from a sequence called DITS$ROW_VERSION.

DIST$NEXT_ROW_VERSION Stored Procedure – Oracle

The following example of DIST$NEXT_ROW_VERSION returns the current date for new row version.

DIST$NEXT_ROW_VERSION Stored Procedure – Oracle

Application Based Detection

Application based detection can be used if the database that is being used doesn’t support triggers and stored procedures and the application is customizable.  At AMAT, we were running a heterogeneous environment with the remote sites using a Microsoft Access database and the central database being an Oracle database.  Though we were able to use triggers on the central database, we had to modify the application to support the remote databases.

The same database customizations will need to be performed to the database that was performed for trigger-based detection.  If the database does not support the reading of the transaction id, a column will have to be added to the DIST$CONTROL_TABLE as follows:

Control Table Creation and loading

The above example also has a column for CURRENT_ROW_VERSION.  This would be needed if the ROW_VERSION stored in the database is generated from a pseudo sequence.

If row marking is to be implemented, the easiest way to mark the row is to add hidden controls linked to the DIST$ROW_STATUS, DIST$UPDATE_TX_ID and ROW_VERSION columns.  The following is an Access Basic example of how to perform this action:

Row Marking - Access Basic

The following example implements an Update table.

Update Table - Access Basic

The following example would be used if the Update table contains the preformatted data.

Formatted Update Table - Access Basic

As with triggers, these routines require some global functions, gblGetNewRowVersion, gblGetNewRowStatus, gblGetRowStatus and gblGetNewTxnID.  The following is an example of how to implement gblGetNewRowVersion.

Get New Row Status Function - Access Basic

The following is an example to implement the function gblGetRowStatus.

Get Row Status Function - Access Basic

The following example implements the gblGetNewRowVersion function that just increments the current row version.

Get New Row Version - Access Basic

The following example implements the gblGetNewRowVersion function that uses a pseudo-counter in the control table..

Get New Row Version - Access Basic

The following example implements the gblGetNewRowVersion function that uses the current date.

Get New Row Version - Access Basic

The following example is an implementation of the gblGetNewTxnID function.

Get New Transaction ID - Access Basic

The following example implants the gblIsMaster function.

Is Master - Access Basic

Handling Calculated Fields

Some fields in the database can hold calculated values or summaries from other tables.  An example would be in a time logging function, each associated time entry would be added to a total time field on a parent record.  A Trouble Notes field in a incidence record could be the accumulation of the text from attached notes entries.  If we ignore these fields when we capture updates to the database and have them recalculated in the other databases, it would save on the number of updates that we would have to send to the master database and return to the remote databases.

Ignoring these fields in an application modification example is trivial as these fields should not be updateable by the user and we wouldn’t get a save event unless the user modified another column in the row.  Since the values will be recalculated anyway, if the row is updated, we can ignore the column.

If triggers are being used, they will have to determine if another column on the table has been updated before taking action.  The following is an example of the Row Marking Update trigger modified to check if a column is updated that needs to be marked.  Note that inserts don’t need this checking as they will need to be marked in any case.

Row Marking Trigger Checking For Update - Oracle