Integrating Isolation Levels into Transactional Federations
Data consistency and transaction atomicity are key requirements of advanced applications in federated systems like the Internet that consist of distributed and often highly heterogeneous components. Existing and upcoming e-service applications like electronic auctions and mobile commerce, but also the federation of business databases when two companies collaborate or merge, are not conceivable without the underlying transactional mechanisms. Providing atomicity and isolation is much harder in a federated system than in a homogeneous, centrally administered distributed database system. Among the reasons for this are the different concurrency control protocols used by the component databases and the invisibility of internal processing details to software at the federated level.
Recently, many algorithms for federated transaction management have been proposed in the literature. However, only techniques for federated atomicity have been adopted by real products. Virtually no existing federated database system includes algorithms for federated concurrency control. The reason is that the solutions proposed in the literature impose hard constraints on the databases in the federation that most real database products do not meet. In particular, they completely ignore the frequent use of relaxed correctness criteria, the so-called isolation levels. Isolation levels can increase the performance of many applications at the cost of less rigorous guarantees for data consistency. Even though such mechanisms are widely used in existing systems, the scientific literature has mostly ignored them.
As a central part of my research activity, I design algorithms for federated concurrency control that provably guarantee the correct execution of federated transactions, even when some of the involved database systems guarantee only restricted isolation levels. I focus on systems supporting Snapshot Isolation. This level is highly relevant for real-world applications as it is the highest degree that Oracle, the market leader, can guarantee for transactions. I compared several proposed definitions of isolation levels. For Snapshot Isolation, I developed a new graph-based characterization that can be applied to guarantee a serializable execution on top of Snapshot Isolation. Based on these formal definitions, I reconsider existing algorithms for federated concurrency control and extend them so that they can guarantee globally correct execution even when some component systems support only Snapshot Isolation. I focus on the family of ticket techniques by Georgakopoulos et al. and multilevel transactions by Weikum et al. From the characterization of Snapshot Isolation I also derived a new algorithm for federated concurrency control.
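To illustrate why Snapshot Isolation alone does not imply serializability, the following is a minimal Python sketch of a toy in-memory store. It is not taken from the prototype described here; the names SnapshotStore, Txn and write_skew_demo are hypothetical, and the store assumes the commonly described behaviour of Snapshot Isolation: reads come from the snapshot taken at transaction start, and a "first committer wins" check rejects a transaction only if a concurrent transaction has already committed a write to the same item.

```python
class Txn:
    """A transaction that reads from a private snapshot (hypothetical toy model)."""

    def __init__(self, store, snapshot, start_ts):
        self.store = store
        self.snapshot = snapshot      # committed state as of transaction start
        self.start_ts = start_ts      # logical start timestamp
        self.writes = {}              # buffered writes, applied at commit

    def read(self, key):
        # A transaction sees its own writes, otherwise its start snapshot.
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # First committer wins: abort if any item we wrote was committed
        # by another transaction after our snapshot was taken.
        for key in self.writes:
            if self.store.version[key] > self.start_ts:
                return False
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.committed[key] = value
            self.store.version[key] = self.store.clock
        return True


class SnapshotStore:
    """Toy single-site store with Snapshot Isolation semantics (an assumption)."""

    def __init__(self, initial):
        self.committed = dict(initial)
        self.version = {key: 0 for key in initial}  # commit timestamp per item
        self.clock = 0

    def begin(self):
        return Txn(self, dict(self.committed), self.clock)


def write_skew_demo():
    """Both transactions preserve x + y >= 1 on their own snapshot,
    write disjoint items, and therefore both commit."""
    store = SnapshotStore({"x": 1, "y": 1})
    t1, t2 = store.begin(), store.begin()
    if t1.read("x") + t1.read("y") >= 2:
        t1.write("x", 0)
    if t2.read("x") + t2.read("y") >= 2:
        t2.write("y", 0)
    return t1.commit(), t2.commit(), dict(store.committed)


def lost_update_demo():
    """Both transactions write the same item, so the second commit is rejected."""
    store = SnapshotStore({"x": 1})
    t1, t2 = store.begin(), store.begin()
    t1.write("x", t1.read("x") + 1)
    t2.write("x", t2.read("x") + 1)
    return t1.commit(), t2.commit(), dict(store.committed)
```

In write_skew_demo both transactions commit and the constraint x + y >= 1 is violated, although no serial order of the two transactions could produce this state. In lost_update_demo the write-write conflict is detected and the second transaction is rejected. This difference is what any technique for serializable execution on top of Snapshot Isolation has to address.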
Analogously to the local component systems, one may use isolation levels for federated transactions to increase the potential performance of applications, provided they can cope with the effects of the lower degree of consistency. I proved that the isolation levels from ANSI SQL weaker than serializability automatically hold for federated transactions if they are guaranteed in all component systems. To ensure federated Snapshot Isolation, which is most interesting in Oracle-only federations, additional actions at the federated layer are necessary. I proposed two different algorithms to guarantee Snapshot Isolation for federated transactions.
The algorithms I design are integrated into a prototype system that was developed based on the federated database system VHDBS. VHDBS was designed and implemented by Fraunhofer ISST in Dortmund. It integrates heterogeneous database systems based on Wiederhold's wrapper-mediator paradigm, including the object-relational system Oracle and the object-oriented system O2. For both internal and external communication, VHDBS makes use of Orbix, Iona's implementation of the CORBA standard. In a research project funded by the Deutsche Telekom AG, I have integrated a transaction manager, named TraFIC, into the VHDBS core. TraFIC offers a suite of strategies to guarantee atomicity and serializability for federated transactions even if some component systems support only Snapshot Isolation. The administrator of the VHDBS system may choose the strategy that promises the best performance for a given application environment; additionally, there are strategies that automatically aim to choose the best-performing strategy among a subset of the available strategies. TraFIC has an extensible architecture that allows easy integration of newly developed strategies. Like VHDBS, TraFIC is based on CORBA and uses OrbixOTS, Iona's implementation of CORBA's Object Transaction Service, to guarantee that federated transactions are executed atomically.
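The idea of an extensible strategy suite with both administrator-chosen and automatically chosen strategies can be sketched roughly as follows. This is an illustrative Python sketch only, not TraFIC's actual design or API (TraFIC is CORBA-based): the names StrategyRegistry, Stats, register, record and pick_adaptive are hypothetical, strategies are represented as plain callables, and the lowest observed abort rate is used as an assumed stand-in for "best performance".

```python
from dataclasses import dataclass


@dataclass
class Stats:
    """Observed outcomes for one strategy (hypothetical bookkeeping)."""
    commits: int = 0
    aborts: int = 0

    def abort_rate(self):
        total = self.commits + self.aborts
        return self.aborts / total if total else 0.0


class StrategyRegistry:
    """Holds pluggable concurrency control strategies by name."""

    def __init__(self):
        self._strategies = {}
        self._stats = {}

    def register(self, name, strategy):
        # New strategies are added without touching existing ones.
        if name in self._strategies:
            raise ValueError(f"strategy {name!r} already registered")
        self._strategies[name] = strategy
        self._stats[name] = Stats()

    def get(self, name):
        # Administrator-chosen strategy: plain lookup by name.
        return self._strategies[name]

    def record(self, name, committed):
        # Feed back the outcome of one federated transaction.
        stats = self._stats[name]
        if committed:
            stats.commits += 1
        else:
            stats.aborts += 1

    def pick_adaptive(self, candidates):
        # Assumed heuristic: choose the candidate with the lowest abort
        # rate observed so far; ties go to the first candidate listed.
        return min(candidates, key=lambda name: self._stats[name].abort_rate())


def demo():
    registry = StrategyRegistry()
    # Placeholder callables stand in for real strategy implementations.
    registry.register("ticket", lambda txn_id: True)
    registry.register("multilevel", lambda txn_id: True)
    for committed in (True, True, False, False):
        registry.record("ticket", committed)
    for committed in (True, True, True, False):
        registry.record("multilevel", committed)
    return registry.pick_adaptive(["ticket", "multilevel"])
```

In this sketch demo selects "multilevel", because its recorded abort rate (1 of 4) is lower than that of "ticket" (2 of 4). A real selector would more likely weigh throughput and response time measured under the actual workload.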
The prototype serves as a test bed for the experimental evaluation of the proposed algorithms. I developed a suite of benchmark programs for systematic performance comparison of the different strategies. The results so far indicate the practical viability of the newly developed techniques.