Persistence engines â and databases in particular â have some features for fighting data inconsistencies: ACID, constraints, referential integrity, locking, concurrency controls, and transactions. Let's review these concepts before working with Aggregates.
Most of these concepts are on the Internet and available to the public. We want to thank the people at Oracle, PostgreSQL, and Doctrine for doing amazing work with their documentation. They have carefully defined and explained these important terms, and rather than reinvent the wheel, we've compiled some of these official explanations to share with you.
As discussed in a previous section, ACID stands for atomicity, consistency, isolation, and durability. According to the MySQL Glossary:
These properties are all desirable in a database system, and are all closely tied to the notion of a transaction. For example, the transactional features of MySQL InnoDB engine adhere to the ACID principles.
Transactions are atomic units of work that can be committed or rolled back. When a transaction makes multiple changes to the database, either all the changes succeed when the transaction is committed, or all the changes are undone when the transaction is rolled back.
The database remains in a consistent state at all times, after each commit or rollback, and while transactions are in progress. If related data is being updated across multiple tables, queries see either all old values or all new values, not a mix of old and new values.
Transactions are protected isolated from each other while they are in progress. They cannot interfere with each other or see each other's uncommitted data. This isolation is achieved through the locking mechanism. Experienced users can adjust the isolation level, trading off less protection in favor of increased performance and concurrency, when they can be sure that the transactions really do not interfere with each other.
The results of transactions are durable: once a commit operation succeeds, the changes made by that transaction are safe from power failures, system crashes, race conditions, or other potential dangers that many non-database applications are vulnerable to. Durability typically involves writing to disk storage, with a certain amount of redundancy to protect against power failures or software crashes during write operations.
According to the PostgreSQL 8.2.23 Documentation:
Transactions are a fundamental concept of all database systems. The essential point of a transaction is that it bundles multiple steps into a single, all-or-nothing operation. The intermediate states between the steps are not visible to other concurrent transactions, and if some failure occurs that prevents the transaction from completing, then none of the steps affect the database at all.
For example, consider a bank database that contains balances for various customer accounts, as well as total deposit balances for branches. Suppose that we want to record a payment of $100.00 from Alice's account to Bob's account. Simplifying outrageously, the SQL commands for this might look like:
UPDATE accounts
SET balance = balance - 100.00
WHERE name = 'Alice';
UPDATE branches
SET balance = balance - 100.00
WHERE name = (SELECT branch_name FROM accounts WHERE name ='Alice');
UPDATE accounts
SET balance = balance + 100.00
WHERE name = 'Bob';
UPDATE branches
SET balance = balance + 100.00
WHERE name = (SELECT branch_name FROM accounts WHERE name ='Bob');
The details of these commands are not important here. The important point is that there are several separate updates involved to accomplish this rather simple operation. Our bank's officers will want to be assured that either all these updates happen, or none of them happen. It would certainly not do for a system failure to result in Bob receiving $100.00 that was not debited from Alice. Nor would Alice long remain a happy customer if she was debited without Bob being credited. We need a guarantee that if something goes wrong partway through the operation, none of the steps executed so far will take effect. Grouping the updates into a transaction gives us this guarantee. A transaction is said to be atomic: from the point of view of other transactions, it either happens completely or not at all.
We also want a guarantee that once a transaction is completed and acknowledged by the database system, it has indeed been permanently recorded and won't be lost even if a crash ensues shortly thereafter. For example, if we are recording a cash withdrawal by Bob, we do not want any chance that the debit to his account will disappear in a crash just after he walks out the bank door. A transactional database guarantees that all the updates made by a transaction are logged in permanent storage (That is: on disk) before the transaction is reported complete.
Another important property of transactional databases is closely related to the notion of atomic updates: when multiple transactions are running concurrently, each one should not be able to see the incomplete changes made by others. For example, if one transaction is busy totalling all the branch balances, it would not do for it to include the debit from Alice's branch but not the credit to Bob's branch, nor vice versa. So transactions must be all-or-nothing not only in terms of their permanent effect on the database, but also in terms of their visibility as they happen. The updates made so far by an open transaction are invisible to other transactions until the transaction completes, whereupon all the updates become visible simultaneously.
In PostgreSQL, for example, a transaction is set up by surrounding the SQL commands of the transaction with BEGIN and COMMIT commands. So our banking transaction would actually look like:
BEGIN;
UPDATE accou...