Recovery with Concurrent Transactions in DBMS - Recovery with Concurrent Transactions in SQL

Above two methods hold good if there is single transaction like updating the address or so. But what will happen when there are multiple transactions which occur concurrently? Same method of logging the logs can be followed. But since there are a concurrent transactions, order and time of each transaction makes a great difference. Failing to maintain the order of transaction will lead to wrong data while recovering. Also, transactions may have number of steps. Maintaining the log for each step will increase the log file size. Again it will become an overhead to maintain a log file along with these transactions. In addition performing redo operation is also an overhead because it is executing the executed transaction again and again. It is not actually necessary. So our goal here should be small log file with easy recovery of data in case of failure. To handle this situation Checkpoints are introduced during the transaction.

Checkpoint acts like a bookmark. During the execution of transaction, such checkpoints are marked and transaction is executed. The log files will be created as usual with the steps of transactions. When it reaches the checkpoint, the transaction will be updated into database and all the logs till that point will be removed from file. Log files then are updated with new steps of transaction till next checkpoint and so on. Here care should be taken to create a checkpoint because, if any checkpoints are created before any transaction is complete fully, and data is updated to database, it will not meet the purpose of the log file and checkpoint. If checkpoints are created when each transaction is complete or where the database is at consistent state, then it will be useful.

Suppose there are 4 concurrent transactions – T1, T2, T3 and T4. A checkpoint is added at the middle of T1 and there is failure while executing T4. Let us see how a recovery system recovers the database from this failure.

It starts reading the log files from the end to start, so that it can reverse the transactions. i.e.; it reads log files from transaction T4 to T1.

Recovery system always maintains undo log and redo log. The log entries in the undo log will be used to undo the transactions where as entries in the redo list will be re executed. The transactions are put into redo list if it reads the log files with (<Tn, Start>, <Tn, Commit>) or <Tn , Commit>. That means, it lists all the transactions that are fully complete into redo list to re execute after the recovery. In above example, transactions T2 andT3 will have (<Tn, Start>, <Tn, Commit>) in the log file. The transaction T1 will have only <Tn, Commit> in the log file. This because, the transaction is committed after the checkpoint is crossed. Hence all the logs with<Tn, Start>, are already written to the database and log file is removed for those steps. Hence it puts T1, T2 and T3 into redo list.

The logs with only <Tn, Start> are put into undo list because they are not complete and can lead to inconsistent state of DB. In above example T4 will be put into undo list since this transaction is not yet complete and failed amid.

This is how a DBMS recovers the data incase concurrent transaction failure.