Database Internals: ACID transactions
This article deep dive into what are ACID transactions and why are they relevant in today's world
Hello readers, I hope you’re doing well!
In this edition of the newsletter, let’s discuss the following topics:
1. What is a transaction?
2. What are ACID transactions?
3. ACID transactions implementation
4. Importance of ACID transactions
5. Conclusion
What is a transaction?
As per Wikipedia, a database transaction represents a single unit of work that can be performed against a database. This single unit of work is generally a group of read-and-write operations that may affect single or multiple rows of a table.
For example:
The above transaction will update the employee_category to ‘Leadership‘ for all the records of the table ‘employees‘ where the employee level >= 5.
What are ACID transactions?
ACID is a short form for 4 properties: Atomicity, Consistency, Isolation, and Durability. Any database transaction that has these properties is called an ACID transaction. The databases that support ACID transactions are called Transactional Databases.
Now, let’s deep dive to understand each of the properties.
Atomicity: This property means that all the operations within a transaction are a single unit which means either the transaction that has all the operations will be performed successfully or all of them fail together. This property helps maintain data integrity of the database.
For example: In the image attached below, atomicity ensures that either the debit and credit take place together successfully or there is no change.
This property is super useful in case of unwanted events like power crashes or outages. Suppose, the money is debited from the source account(Account A) as highlighted in the above image but just then there is a power outage before it’s credited to the destination account(Account B). In this case, the transaction changes made are discarded bringing the database back to its original state.
Consistency: This property ensures that the changes within a database transaction should be always consistent with the database constraints or triggers. This property essentially helps in preventing invalid data entry by defining rules for data integrity (e.g., foreign keys, unique values) at the table level.
Suppose there is a constraint in the schema that the user balance should be a positive integer. Now, if a transaction tries to debit some money from a user’s account making the balance below zero, then the consistency of the ACID transaction will be violated and the transaction will fail.
Isolation: This property ensures that all the transactions run in an isolated environment. This means transactions don’t interfere with each other and thus enables databases to have concurrent transactions.
Suppose, a user has a balance of 100$. If there are two transactions concurrently running trying to debit 50$ each, then the final user balance will be 0$. Since we are able to produce the expected result by running transactions concurrently, this enables our database to have higher throughput.
Note that there are different types of isolation levels available when working with concurrent transactions but I’ll save that for later.
Durability: This property ensures that the changes made in a transaction persist even in case of system failures or crashes unless the hard-disk on which data gets stored is destroyed.
We can ensure data durability using the following ways:
Redo Log: The changes that are committed to the database are not written in real time to the data pages or data indexes on the hard disk. It’s actually stored in an append-only Redo Log (or Write Ahead Log) that stores the transaction statements. In case of a power outage just after the transaction is ingested but before the transaction changes are committed to the disk, we can use the Redo Log to reattempt the transaction changes into the hard disk.
Replication: In distributed databases, we can use replication of data across multiple database nodes to ensure data durability.
ACID Transactions Implementation
As per Wikipedia, a SQL database implements ACID transactions using the following sequence of steps:
Begin the transaction: A BEGIN TRANSACTION statement declaration signifies the starting point of a transaction also known as the savepoint.
Execute operations: All operations within the transaction are executed one at a time. The database will validate each operation to make sure it adheres to the constraints set in the schema.
Commit or Rollback: After all the operations are performed successfully, a COMMIT statement is used for committing the transaction, otherwise, ROLLBACK is used for rolling back up to the savepoint established in the beginning.
Please note that the implementation of ACID transactions might vary on the database you are referring to but the base idea as explained above remains the same.
Importance of ACID transactions
ACID transactions provide the highest level of data integrity and reliability in a database. This ensures that the data is always in a consistent state. If there is a database that does not support ACID transactions and there is a power outage in the middle of a transaction, then it’s possible that only some of your data is saved. This leads to an inconsistent database that is time-consuming to recover from.
One of the prominent use cases of ACID transactions is Banking related transactions which require money movement from the source account to the destination account. If the money is debited, then it should get credited somewhere otherwise there should be no money movement at all.
PS: Please do not confuse that ACID transactions are supported with SQL(or Relational databases) only. As a matter of fact, MongoDB (a No-SQL database) also supports ACID transactions. As the scale of your service grows, it’s tough to hold ACID-compliant transactions in a distributed database because they are implemented using locks, and guess what, locks are expensive to use. Hence, people generally tend to move towards No-SQL databases that provide BASE(Basically Available, Soft State, Eventually Consistent) properties.
Is there any database that provides ACID compliant transactions in a distributed SQL storage? Yes, YugabyteDB!
Conclusion
ACID transactions is a concept that is in general used for transactional databases. The transactional databases are called “transactional“ due to the support of ACID transactions and not because of the type of database they are(SQL or No-SQL). While it’s important to understand what ACID transactions are, deciding which database is suitable for you depends on your use case on how tight you want the constraints of a transaction.
That’s it, folks for this edition of the newsletter. See you in the next one! If you feel I’m doing a decent job, help me reach more Engineers by sharing this free newsletter!
Resources
Databricks - ACID Transactions
MongoDB - ACID Transactions
Redis - ACID Transactions
Wikipedia - Database Transaction
Wikipedia - ACID