#+SETUPFILE: ../../config.org #+TAGS: Databases #+OPTIONS: toc:2 #+CATEGORY: Engineering #+DATE: 2019-04-26 #+TITLE: Transaction Variants #+DESC: A quick overview of different transaction variants in a RDBM In this post I want to talk about different variants of [[https://en.wikipedia.org/wiki/Database_transaction][database transactions]], I assume you already know about database transactions. So, let's cut to the chase. * Flat Transaction :Databases: Flat transaction are those regular transactions we know about and are the most common transactions in the [[https://en.wikipedia.org/wiki/Database#Database_management_system][DBMS]]s Flat transactions are simple but they can not address two problems: - Multi stage transactions For example, Let's say we want to book flight from City C1 to C2. Since there is no direct fly we have to book 4 flights from, =C1 -> CA -> CB -> C2=. The process of booking each of these flights is a transaction by itself and the whole process is a transaction too. - Bulk updates Let's say we want to update billion tuples. What if the very last tuple fails to update and cause the transaction to abort. Then we need to revert the changes made by the transaction and revert a billion tuples which obviously is a huge task. * Transaction Savepoints These transactions are similar to flat transaction with addition of one extra thing which is save points. So any where in there transaction users case ask for a save point and again they can rollback to a save point or rollback the entire transaction. #+BEGIN_SRC sql BEGIN READ(A) WRITE(A) SAVEPOINT 1 WRITE(B) SAVEPOINT 2 ROLLBACK TO 1 COMMIT #+END_SRC #+BEGIN_QUOTE Note: *These transactions only solve the multi stage transaction problem.* #+END_QUOTE * Nested transactions Nested transactions are similar to save points transactions, but instead save points these transactions break down to smaller flat transactions. Each transaction commits separately from other transactions. But result of the parent transaction rule them all, so if the parent transaction fails all the nested transactions have to rollback. #+BEGIN_SRC sql BEGIN BEGIN T1 ... COMMIT T1 BEGIN T2 ... COMMIT T2 COMMIT #+END_SRC #+BEGIN_QUOTE Note: *These transactions only solve the multi stage transaction problem.* #+END_QUOTE * Chained transactions In these kind of transactions, smaller flat transaction can be applied in a chain in the way that the result of each of them is not visible to the outside world until the end of the chain. In theory chained transactions should be applied in sequence but in practice in some cases we can interleave their operations. Also Between T1 and T2 of a chained transaction, No other thread of code shouldn't be able to make changes to those resources which T2 will operates on. If any transaction in the chain fails, it has nothing to do with the previous transactions in the chain. For example: #+BEGIN_SRC T1 -> T2 -> T3 S -> S -> F #+END_SRC In the chained transaction above only T3 failed and T1 and T2 are successfully committed to storage. We don't have to roll them back. While chained transactions can break big transactions into smaller pieces for better parallelism but *they only solve the multi stage transaction problem* Not the bulk update problem. * Compensating transactions This type of transactions are special transactions which are designed to semantically reverse the effect of another transaction which already committed. One important thing to remember about compensating transactions is that they know how to revert the /logical/ effect of other transactions /NOT the physical/ effect. For example, If a transaction increase a counter by one. The physical revert would be to changes the binary data that inserted for that counter to what it was before the transaction, but the logical revert would be to decrease the counter by one when needed. So basically these kind of transactions know about how to put the database in correct state before the other transaction. * Saga Transactions A saga transaction is a sequence of chained transactions =T1 - Tn= and compensating transaction =C1 - C(n-1)= where the following guaranteed: - The transactions will commit in the order =T1...Tj=, =Cj...C1= (where =j < n=). So basically this means that a saga transaction is a seq of chained transactions which applies the smaller transactions in order with their corresponding compensating transactions. In a chained transaction when ever transaction Tn aborts, the transactions before Tn stay committed, but in saga transactions they will be rollback using compensating transactions that know how to roll them back *logically*. So Saga transactions can be fix both multi-staging and bulk update problems. But the issue here is that the compensating transactions are something that requires application level understanding of the use case so most of the time they are implemented in the application frameworks instead of DBMSs.