What are the problems before GST

12 Federal database systems

12.5 Transaction Management

The support of global change transactions in FDBS raises major problems with regard to transaction management if no expansion of the LDBS is accepted. According to the two-part architecture of FDBS (Fig. 12-1), transaction management also takes place in two stages: in addition to local transaction management in each LDBS, there is global transaction management in the FDBS additional layer (Fig. 12-4). Global transactions (GT) are broken down by the global transaction management into a sequence of locally executable sub-transactions (GST). With full autonomy, the LDBS do not know any difference between local transactions (LT) and GST. Furthermore, there should be no cooperation between the LDBS and the global transaction management or between the LDBS. Finally, there can be heterogeneity in the local transaction administrations, especially in terms of synchronization (e.g. different locking procedures or an optimistic synchronization can be used).

Under these conditions, it is extremely difficult to maintain the ACID properties for global transactions. The automatic support of global integrity conditions (property C) requires their specification by the GDBA so that they can in principle be monitored by the FDBS when processing global change operations. The properties A, I and D require the guarantee of global serializability and the implementation of a correct commit protocol between the nodes participating in the federation. The implementation of a global commit protocol becomes almost impossible because the LDBS does not distinguish between local transactions and global sub-transactions. For this reason, an LDBS already commits this (sub) transaction at the end of a GST, so that its changes are permanent and visible at this point in time. Resetting this sub-transaction, as is necessary when the global transaction fails, is therefore no longer possible in principle. The global transaction management can at most try to reverse the changes to the sub-transaction by means of a so-called compensation transaction [SW91]. However, this cannot prevent the local visibility of changes that are invalid from a global perspective.

The decoupling of global and local transaction management causes similarly serious problems for synchronization. In order to maintain global serializability, conflicts with local transactions in particular cause difficulties, since these are unknown to global transaction management. In the example in Fig. 12-5, from a global point of view, no violation of serializability can be seen, since only the object a is referenced by both global transactions. Since GT1 first accesses a, there is the dependency (the conflict)

GT1 2.

There is also no violation of local serializability in either LDBS. However, the accesses of the local transaction LT3 to the indirect (transitive) dependency
GT2 3 1

between the two global transactions, so there is a cyclical dependency between GT1 and GT2so that global serializability is violated. However, this is not recognized because the global transaction management does not notice the transitive dependency.

Solving the problems mentioned requires compromising on one or more of the following areas:

  • Functional restrictions, e.g. regarding support for global change transactions.
  • Reduction of the LDBS autonomy, in particular in the form of extensions to be made to the local transaction management.
  • Limitation of heterogeneity, e.g. use of identical synchronization and commit protocols.
  • Reduction of the ACID assertions, e.g. waiver of global serializability.

It is clear that these points can be taken into account in a variety of ways. Simple "approaches" consist in not allowing any global change transactions or restricting them to a changing sub-transaction (Section 11.3.1). For the general case with any number of changing sub-transactions, the one in Chap. The approach to transaction management discussed in 11.3.1 has the greatest importance in practice. It stipulates that all LDBS support a strict two-phase locking protocol and a timeout approach for handling global deadlocks. In addition, the LDBS must participate in a global commit protocol (via the XA interface). With these relatively small LDBS extensions, the properties A, I and D are ensured for global transactions. Furthermore, there is also a simple implementation for the global transaction management, since this is essentially limited to the implementation of the global commit protocol. Further advantages are that, apart from the commit protocol, there are no additional messages for global transaction management and that each node can function as a global commit coordinator (distributed implementation of global transaction management). The restrictions that have to be accepted are reduced autonomy (LDBS extensions, dependency on the global commit coordinator) as well as a large absence of heterogeneity in local transaction management.

Research in recent years has suggested a variety of other approaches to transaction management in FDBS that seek a higher degree of autonomy and heterogeneity. The practical suitability of many proposals is questionable, however, since they are often based on dubious assumptions; in addition, hardly any of the proposed procedures were actually implemented. In many cases, a centralized global transaction manager was assumed through which all global transactions must be carried out. This is an unacceptable solution, especially for geographically distributed systems for reasons of performance and autonomy. In addition, it was often assumed that the global transaction manager knows exactly the object references (generally records or pages) of global transactions for synchronization. However, even this cannot be assumed in the case of decoupled LDBS, since the exact object references generally differ. result only when processing in the LDBS. Other approaches impose (due to pessimistic conflicting assumptions) great restrictions on the parallelism between global transactions, so that the performance is severely impaired. Finally, a number of papers only look at the synchronization problem, so the closely related commit handling remains unsolved. In the following, we want to discuss two suggestions in more detail, namely commit order and quasi-serializability. [VG93] provides an overview of further approaches.

Commit order [Raz92, Raz93]

The principle of Commitment Ordering (CO) is a general approach to guarantee serializability, which can advantageously be transferred to federal DBS. A transaction schedule fulfills the CO property, if the commit operations of conflicting transactions are carried out in the same order as the conflict operations themselves. In [Raz92] it was shown that the fulfillment of the CO property is a sufficient condition for serializability. The CO property is apparently always guaranteed with a strict two-phase locking protocol, but this can also be achieved for other, deadlock-free (e.g. optimistic) synchronization methods.

Example 12-4

From the two schedules
. . . w1 (x), r2 (x), c1, c2
and
. . . w1 (x), r2 (x), c2, c1
only the first fulfills the CO property, since the commit operations (ci) of the two conflicting transactions T1 and T2 are performed in the same order as their conflicting operations on object x. The second schedule can be serialized (T.1 2), although the CO property is not fulfilled. This shows that the CO condition is not a necessary prerequisite for serializability.
The second schedule is not possible in the case of a blocking procedure, since T2 a blocking conflict with T1 arises, which only occurs after T1 (Lock release) is resolved (c2 must therefore after c1 come). In the case of a non-blocking, CO-compliant synchronization protocol, operation c2 recognized that a violation of the CO property threatens. This can be done e.g. by canceling T1 be prevented.
For federative DBS it is now easy to ensure global serializability, provided that each LDBS uses a synchronization method that guarantees the CO property with regard to the (sub) transactions carried out on it (this condition alone is obviously not sufficient, since local serializability no global serializability guaranteed in each node). For this it is sufficient that each LDBS participates in a global commit protocol (e.g. a distributed two-phase commit protocol) so that the LDBS accept an external commit decision. During the local commit handling of a transaction T, all transactions that should have ended before T due to local conflicts must be aborted. In the example in Fig. 12-5, this would result in the global commit of GT1 (after exiting GST12) in LDBS2 be recognized that GST22 has not yet ended, although this transaction is in the local CO order before GST12 stands. By canceling GST22 (and thus from GT2) the problem is resolved and global serializability is preserved.

This protocol apparently represents a generalization of the approach discussed above, in which, in addition to participating in a global commit protocol, it was assumed that each LDBS uses a strict two-phase locking protocol. However, the outlined CO approach allows heterogeneity in the local synchronization component as long as the CO property is maintained. In particular, deadlock-free synchronization methods can be supported. In turn, autonomy restrictions due to the global commit protocol and (possibly) extensions of the LDBS to guarantee the CO property must be accepted.

When checking the CO property, no distinction is made between local transactions and global sub-transactions in the basic procedure. However, if a distinction between these two transaction types is possible in the LDBS, this can be used to limit the CO check to global transactions, provided the local synchronization protocol guarantees local serializability [Raz93]. With this extended protocol, a reduced abort frequency for local transactions can be expected.

Quasi-serializability

A number of research studies attempted to facilitate transaction management in FDBS by supporting weaker correctness criteria than global serializability. One such criterion is quasi-serializability [DE89, DEK91]. Quasi-serializability requires that all local schedules are serializable and that the execution of global transactions is equivalent to a quasi-serial execution of these transactions, in which global transactions are executed sequentially. The quasi-serialization sequence is given by the equivalent quasi-serial execution sequence and relates exclusively to global transactions.

Example 12-5

The execution of transactions shown in Fig. 12-6 cannot be serialized because the GT1 2 and in LDBS2 the dependencies GT2 3 1 exist. However, there is quasi-serializability, since the local schedules can be serialized and the schedule in LDBS2 is equivalent to the following execution sequence for the GST12 before GST22 is performed:
w3 (b), r1 (b), w2 (c), r3 (c).

The subscription 3 relates to the local transaction LT3, the subscripts 1 or 2 to the global sub-transactions GST12 or GST22.

Thus, the shown execution of the two global transactions is equivalent to a (quasi) serial execution, in the case of the GT1 before GT2 is performed. The indirect conflict in LDBS2 with the local transaction LT3 is therefore irrelevant for the quasi-serializability, since it was possible to find an equivalent execution sequence in which all operations of GT1 before those of GT2 were executed.

However, this is not the case in the example in Fig. 12-5, since there relevant dependencies with the local transaction occur in LDBS2, which causes a real interaction between the global sub-transactions. Because the changes from GST22 are there indirectly for the sub-transaction GST via the local transaction12 visible, so that no equivalent reversal of their order of execution is possible.

A simple approach to guaranteeing quasi-serializability is to always execute global transactions serially, i.e. to allow at most one global transaction at a time. However, this is obviously unacceptable for performance reasons, since very long delay times are to be expected when starting global transactions. In [DEK91] a less restrictive implementation approach was proposed, which takes into account which databases the global transactions access. In particular, global transactions that operate on at most one local database can be allowed simultaneously without violating the quasi-serializability [56]. Global transactions that want to access all local databases must, however, continue to be carried out serially (the quasi-serializable process shown in Fig. 12-6 is therefore not permitted). The approach does not require any LDBS extensions and enables different local synchronization methods as long as these ensure local serializability. Furthermore, no global deadlocks can occur. On the other hand, significant blocking times are still possible for global transactions until they are approved. In addition, a centralized global transaction management is required, via which all global transactions can be started and ended. After all, the approach only deals with the isolation aspect and not the atomicity and durability of global transactions (commit treatment).
An overlapping execution of global transactions is limited to one node, so that the local serialization order determines the quasi-serialization order.